2013 IEEE 54th Annual Symposium on Foundations of Computer Science

Faster Canonical Forms For Strongly Regular Graphs (Extended Abstract)

Laszl´ o´ Babai∗, Xi Chen†, Xiaorui Sun†, Shang-Hua Teng‡, and John Wilmes∗ ∗University of Chicago Email: [email protected] and [email protected] †Columbia University Email: [email protected] and [email protected] ‡University of Southern California Email: [email protected]

Abstract—We show that a canonical form for strongly I. INTRODUCTION regular (s. r.) graphs can be found in time exp(O(n1/5)) and therefore isomorphism of s. r. graphs can be tested A. History and statement of the main result within the same time bound, where n is the number of It follows from the early theory of interactive proofs vertices and the tilde hides a polylogarithmic factor. that the Graph Isomorphism problem (GI) is not NP- The best previous bound for testing isomorphism of s. r. complete unless the polynomial-time hierarchy col- exp(O(n1/3)) graphs was (Spielman, STOC 1996) while lapses [1] (see [2] for a self-contained proof and ref- the bound for GI in general has been standing firmly at exp(O(n1/2)) for three decades. (These results, too, erences). On the other hand, for three decades now, the best known upper bound on the complexity of GI has provided canonical forms.)  √ The previous bounds on isomorphism of s. r. graphs been exp(O( n)) where n is the number of vertices (Babai 1980 and Spielman 1996) were based on the and the tilde hides a polylog factor [3], [4], [5]. analysis of the classical individualization/refinement (I/R) A with parameters (n, k, λ, μ) heuristic. The present bound depends on a combination of is a regular graph with n vertices and k such that a deeper analysis of the I/R heuristic with Luks’s group each pair of adjacent vertices has λ common neighbors theoretic divide-and-conquer methods following Babai- Luks (STOC 1983) and Miller (1983). and each pair of non-adjacent vertices has μ common neighbors. Our analysis builds on Spielman’s work that brought Neumaier’s 1979 classification of s. r. graphs to bear on The class of strongly regular graphs, while not be- the problem. One of Neumaier’s classes, the line-graphs lieved to be GI-complete, has long been identified as a of Steiner 2-designs, has been eliminated as a bottleneck hard case for GI (cf. [6]). in recent work by the present authors (STOC’13). In the A function F from a class F of finite structures to remaining hard cases, we have the benefit of “Neumaier’s itself is called a canonical form if for every X, Y ∈F claw bound” and its asymptotic consequences derived by ∼ ∼ Spielman, some of which we improve via a new “clique we have X = F (X) and X = Y if and only if F (X)= geometry.” F (Y ). Isomorphism of members of F can be decided We also prove, by an analysis of the I/R heuris- by two applications of a canonical form function and tic, that, with known (trivial) exceptions, s. r. graphs comparison of the outputs. have exp(O(n9/37)) automorphisms, improving Spiel- The set Iso(G, H) of G → H isomorphisms is a  1/3 man’s exp(O(n )) bound. coset of the automorphism group Aut(G) and can be No knowledge of group theory is required for this concisely represented by a set of O(n) generators of paper. The group theoretic method is only used through Aut(G) together with a single G → H isomorphism. an easily stated combinatorial consequence (Babai–Luks, Our main result is the following. 1983 combined with Miller, 1983). While the bulk of this paper is joint work by the five Main Theorem I.1. Let G, H be s. r. graphs. A canon- authors, it also includes two contributions by subsets of the ical form for G can be computed, and consequently, authors: the clique geometry [BW] and the automorphism isomorphism of G and H can be decided, in time bound [CST]. exp(O(n1/5)). Moreover, the set Iso(G, H) can be Keywords-Algorithms, graph isomorphism, strongly reg- computed within the same time bound. ular graphs Disconnected s. r. graphs are disjoint unions of

0272-5428/13 $26.00 © 2013 IEEE 157 DOI 10.1109/FOCS.2013.25 cliques of equal size; we refer to them and to their The line-graph of a finite geometry (P, L) has L for its complements as the “trivial s. r. graphs.” Nontrivial set of vertices; adjacency corresponds to intersection. s. r. graphs√ have diameter 2 and therefore have degree Two classes of finite geometries are of particular k ≥ n − 1. Since the complement of a s. r. graph is interest. In a Steiner 2-design there is a line through s. r., we shall assume throughout the paper that every pair of points. The points of a transversal design √ − ≤ ≤ − are partitioned into  classes of equal size; there is a n 1 k (n 1)/2. (1) line through a pair of points if and only if the points do Babai showed in 1980 [7] (cf. [8]) that isomorphism not belong to the same class. of s. r. graphs can be tested in time The line graphs of these two classes of geometries  are s. r.; we refer to them and to their complements as exp(O(n/k)). (2) geometric s. r. graphs. In the light√ of inequality (1) this gives an overall bound A conference graph is a s. r. graph of degree k = of exp(O( n)). This was improved by Spielman [9] (n − 1)/2 with μ =(n − 1)/4 and λ = μ − 1. (STOC 1996) to exp(O(n1/3)); no further improvement was obtained until the present paper. D. The Neumaier classification B. Graphic s. r. graphs and vertex coloring The following result was central to Spielman’s work and remains central to ours. Let G =(V,E) be an (undirected) graph; V is the set of vertices and E the set of edges. The line-graph Theorem I.2 (Neumaier, 1979). A s. r. graph is trivial, L(G) has the vertex set E with two edges of G being graphic, geometric, or a conference graph, or satisfies adjacent in L(G) if they share a vertex in G. “Neumaier’s claw bound.” L(G) is a nontrivial s. r. graph precisely if G = Kv v v,v 2 ≥ Neumaier’s claw bound [11], [12] is an inequality (n = 2 )orK (n = v )(v 3); we refer to these graphs and their complements as the “graphic involving the parameter μ and the eigenvalues of the s. r. graphs.” (see e. g. [9]). We shall only need the We shall work with vertex-colored graphs (colors following asymptotic consequences of the claw bound. are preserved by isomorphisms by definition). While Theorem I.3. Suppose G is a s. r. graph satisfying canonical forms for the graphic s. r. graphs can easily be Neumaier’s claw bound. Assum k = o(n). Then found in linear time, the vertex-colored versions of each (a) μ ∼ k2/n (Spielman) of the two classes of graphic s. r. graphs are GI-complete 2/3 1/3 4/3 −1/3  / − (b) λ = O(k μ )=O(k n ) (Spielman) via a quadratic reduction. So an exp(O(n1 4 )) GI (c) If k =Ω(n2/3) then λ = O((kμ)1/2)= bound for either of these colored classes would improve 3/2 −1/2 / − O(k n ) ([13], see Section V) the general bound for GI to exp(O(n1 2 2 )) and would therefore in itself be a major result. However, our main Parts (a) and (b) are implicit in [9]. results remain valid for the vertex-colored versions of We point out the philosophical significance of these all s. r. graphs except the graphic ones. results. Written as μ/n ∼ (k/n)2, item (a) can be inter- This comment also underlines the significance of preted as saying that the neighborhoods of nonadjacent being able to separate graphic s. r. graphs from the pairs of vertices are “asymptotically independent.” Since remaining ones. A spectral separation was first achieved most pairs of vertices are not adjacent, this is the typical by Seidel [10] who showed that for n ≥ 29, the behavior. While the neighborhoods of adjacent vertices s. r. graphs with least eigenvalue −2 are precisely the can be heavily positively correlated, items (b) and (c) s. r. line graphs. This was powerfully generalized by limit this correlation. We found this intuition helpful. Neumaier to separate additional classes of s. r. graphs In the rest of this section we discuss how we dispose which we call geometric (below); following Spielman, of the cases that do not satisfy Neumaier’s claw bound. we build on Neumaier’s classification. The rest of the paper will assume the claw bound. C. Geometric s. r. graphs and conference graphs The following results appear in Miller [14] (1978) for transversal designs and in [15] and [16] (STOC’13) for For the purposes of this paper, a finite geometry is a Steiner 2-designs. set P of v “points” and set L of subsets of P called “lines.” The “length” of a line is the number of its Theorem I.4. Canonical forms for transversal designs points; we assume (i) all lines have the same length and for Steiner 2-designs can be computed in nO(log n)  ≥ 3; (ii) every pair of lines shares at most one point. time where n is the number of lines.

158 To apply Theorem I.4 to geometric graphs, we first part of our work is based on Luks’s group-theoretic need to reconstruct the underlying geometry from the divide-and-conquer through a simply-stated combinato- line-graph. This is not always possible. However, up rial consequence, Theorem III.1, implicit in Babai–Luks to a degree threshold ∼ n2/3 the geometry can be (1983) [3] combined with a result of Miller (1983) [18]. uniquely reconstructed in polynomial time, as shown by Theorem III.1 reduces the problem of finding canonical Miller [14] for transversal designs and by Spielman [9] forms to a relaxed analysis of I/R, where instead of for Steiner designs. All this is implicit in Neumaier’s targeting a complete split, our goal is a “color-t-bounded work, along with the following. graph” (see the definition before Theorem III.1). Proposition I.5. Let G be a geometric s. r. graph. Then F. Detailed results either the underlying geometry can be uniquely recon- Our main result follows from a combination of the structed in polynomial time, or G satisfies Neumaier’s following results with Theorem I.3. claw bound. Theorem I.6. Let G be a s. r. graph satisfying Neu- The assumption k = o(n) in Theorem I.3 is justified maier’s claw bound. Assume k = o(n). Then a canoni- by (2) [7]. Indeed, for k ≥ n/ log n, we can construct cal form of G can be found in time  canonical forms in quasipolynomial time (exp(O(1))) (a) nO(μ+log n) (Sec. IV) by (2). This observation takes care in particular of (b) exp(O((n/k)1/2)) for k = o(n2/3) (Spielman) conference graphs (a case in Neumaier’s classification). (c) exp(O(1+λ/μ)) for k = ω((n log n)1/2) (Sec. VI) In the remaining cases we shall assume k ≤ n/ log n and that Neumaier’s claw bound holds, and therefore Parts (b) and (c) of this result combined with parts items (a), (b), (c) of Theorem I.3 hold as well. (a) and (c) of Theorem I.3 yield (b) exp(O((n/k)1/2)) for all k. E. The individualization/refinement (I/R) heuristic and Remark I.7. Part (a) is the only part where the proof the automorphism bound uses the group theory method. In all other cases, the A classical heuristic approach to the GI problem is the time bounds stated suffice to list all isomorphisms. (This “individualization/refinement” (I/R) method: we indi- is true for the nO(log n) bound in the geometric cases as vidualize t vertices (assign unique colors to them), then well [15], [16].) apply a canonical color refinement process (cf. Sec. II). Deriving the Main Theorem I.1 from Theorems I.6, If we are lucky enough that the refinement process I.3, Prop. I.5, and Eq. (2): We settle the trivial and the completely splits the graph (assigns a unique color to graphic cases in a straightforward manner in polynomial each vertex after refinement) then we obtain a canonical t+O(1) (in fact, linear) time. The following cases are done in form for the graph in time n by repeating the quasipolynomial time: the cases k ≥ n/ log n by (2); process for all possible choices of the list of t vertices and the geometric cases where we don’t have the claw to be individualized and picking the lexicographically bound, using Theorem I.4 and Proposition I.5. Finally first among the resulting labeled graphs. in the remaining cases we use part (a) of Theorem I.6 The results of Babai [7] and Spielman [9] as well as in combination with part (a) of Theorem I.3 for k ≤ [15] and [16] were based solely on the analysis of the n3/5 noting that μ ∼ k2/n  n1/5 in this range; and I/R heuristic. statement (b) after Thm. I.6 for n3/5 ≤ k ≤ n/ log n. This heuristic, however, has an inherent mathematical The bottleneck timing arises at k ∼ n3/5. limitation: if the individualization of t vertices com- We now state our bounds in terms of k and n, along | |≤ t pletely splits the graph G then Aut(G) n . Babai with the history for comparison. has conjectured for three decades that for nontrivial, non-graphic s. r. graphs | Aut(G)| =exp(no(1)), but the Theorem I.8. Canonical forms for a s. r. graph G can best result to date in this direction is be computed in time   9/37 (a) exp(O(n/k)) for all k [7] | |≤ (3) / Aut(G) exp(O(n )) (b) exp(O( n/k)) for k = o(n2 3) [9] unless G is by [17] (see Section VIII), improving Babai’s geometric O n exp(O(n1/2)) and Spielman’s exp(O(n1/3)) bounds (c) n (log ) if G is geometric and does not satisfy and inspiring the present work. Neumaier’s claw bound ([15], [16] plus Prop. I.5) Our algorithmic result, however, goes beyond the (d) In the present paper we prove exp(O(n9/37)) bound and therefore necessarily in- (d1) exp(O(k2/n)) for k = O(n/ log n), improv- volves more than an intricate analysis of I/R. Indeed, ing Spielman’s bound in the range k ≤ n3/5

159  (d2) exp(O( n/k)) for k = o(n), extending the coloring. (It seems this was first introduced in [19] in range of Spielman’s bound. the context of the Weisfeiler-Leman refinement.) We / / say that depth-d stabilization achieves a certain type We still use (b) for n3 5 ≤ k = o(n2 3) and (a) of coloring if there exists a set S of size |S|≤d such for k =Ω(n) (and (c) for the geometric case). that after individualizing S, the stable refinement is of G. Outline of the paper the stated type. The two basic techniques, I/R and the group theoretic Proposition II.1. Let C be a class of colored graphs, method, are sketched in Sections II and III, resp. Sec- closed under isomorphisms. Let R be a canonical re- tion IV gives the full proof of part (a) of Theorem I.6 finement operator over C. Suppose depth- stabilization via a color-μ-bound (see def. before Thm. III.1). This completely splits G ∈C. Then | Aut(G)|≤n and we is the only part of the paper that depends on the can compute a canonical form of G and list Iso(G, H) group theory method. For lack of space, all the other for any colored graph H in time T (R)n+O(1), where proofs are only sketched. Section V describes the clique T (R) denotes the cost of one execution of R on a geometry and indicates the proof of the improved bound colored graph on n vertices. on λ (part (c) of Theorem I.3). Sections VI and VII The naive color refinement process first replaces outline two proofs of the case of large degree (part (c) of Theorem I.6) based on different strategies. Finally, the coloring f by the coloring f where f (v)=  / ∈ Section VIII outlines the proof of the exp(O(n9 37)) (f(v); mi(v):i [s]) where mi(v) denotes the bound on the number of automorphisms. number of neighbors of v of color i.Nextwelex- icographically order the set {f (v):v ∈ V } of Three follow-up papers will contain the complete  strings, and define f(v) to be the rank of f (v) in this proofs. The bulk of the work will be covered by the   “full version” of this paper; the clique structure will be ordering. (The f notation hides the fact that f depends not only on f but on (V,E,f).) The correspondence presented in [13], and the automorphism bound in [17].  (V,E,f) → (V,E,f) is clearly a canonical refinement; II. INDIVIDUALIZATION AND REFINEMENT this is one round of the naive refinement process. The We consider vertex-colored graphs G =(V,E,f) naive refinement operator replaces f by the stable where f : V →{1,...,s} is a map for some s (the coloring f ∗. number of colors). The coloring f defines a partition Except for Section VII, the only refinement operator of the vertices into color classes. A coloring g of the we consider is the naive refinement. The same is true for graph (V,E) is a refinement of the coloring f of (V,E) papers [7], [9], [15]. Paper [16] and Section VII of the if (∀v, w ∈ V )(g(v)=g(w) ⇒ f(v)=f(w)). present paper use a slight extension of naive refinement, A color-refinement operator R over a class C of col- subsumed by the Weisfeiler-Leman refinement [19]. ored graphs assigns to every member of C another mem- III. THE GROUP THEORY METHOD ber with the same underlying graph and a refined color- ing. We call the refinement operator R canonical if for The proof of part (a) of Theorem I.6 is based on all G, H ∈Cwe have Iso(G, H)=Iso(R(G), R(H)). a combination of Luks’s group theoretic divide-and- The coloring of G is stable if the coloring of R(G) conquer methods [20] and the I/R technique. This defines the same partition as G. Repeated application combination was developed by Babai and Luks in [3], of a (canonical) refinement operator to a colored graph see esp. Section 4.5 of that paper. Below we state the G =(V,E,f) leads to a (canonical) stable refinement general principle implicit in [3]. ∗ ∗ Let G be a colored graph with color classes G =(V,E,f ). k We say that a vertex v has a unique color if no other C1,...,Cm (in this order). Let Bk = i=1 Ci.Wesay vertex shares the color of v. that G is color-t-bounded if for k =1,...,m,noset We say that a color refinement operator completely of t +1 vertices in Ck has the same set of neighbors in ∅ splits the colored graph G =(V,E,f) if the stable Bk−1. (Note that B0 = , so for k =1this condition ∗ | |≤ refinement G of G assigns a unique color to each means C1 t.) R vertex. We note that if this is the case and is canonical Theorem III.1. Let G be a color-t-bounded colored then | Aut(G)| =1and sorting the vertices of G in the ∗ graph. Then a canonical form for G can be computed order of their f -values puts G into a canonical form. in nO(t) time. Individualizing a vertex v means the assignment of a unique color to v. Depth-d stabilization is the process A weaker form of this result is implicit in [3]; this of individualizing d vertices and refining to a stable weaker form would suffice for our purposes. We stated

160 the result in this stronger form for its simplicity. To We say a set A is closed if A = A. The closure A of A prove this stronger form, we need to adapt the proof is the smallest closed set containing A. We say a set A of [18, Theorem 2] (G. L. Miller, 1983) to the [3] generates a set B if B ⊆ A. We show that there exists technique. (This route is also indicated in [3].) a small set A of vertices that generates all of V . Proposition III.2. Let C be a class of colored graphs, Lemma IV.1. There exists a set A ⊂ V with |A| = closed under isomorphisms, and R a canonical refine- O(log n) such that A = V . ment operator over C. Let G ∈Chave n vertices. Sup- Proof of Theorem III.3: Observe that by construc- pose depth- stabilization results in a color-t-bounded tion, for any set B ⊆ V , each vertex in G(B) has a graph. Then we can compute a canonical form of G in O t pair of nonadjacent neighbors in B. Therefore, no set time T (R)n + ( ), where T (R) denotes the cost of one of μ +1 vertices in G(B) can have the same set of execution of R on a colored graph on n vertices. neighbors in B. The proof is analogous to the proof of Prop. II.1. By Lemma IV.1, there is some set A ⊆ V with |A| = Our next result implies Part (a) of Theorem I.6. O(log n) such that A = V . Individualize A and refine the coloring until it is stable. Order the color-classes: Theorem III.3. After individualizing O(log n) vertices start with the members of A (a single vertex in each of G, naive refinement yields a color-μ-bounded graph. class), and then repeatedly apply the B → B operator to Remark III.4. We note a structural obstruction to this the union of the color-classes already listed; add the list strategy. This comment, ending with the paragraph after of color classes in B \B in any order. By the foregoing, Proposition III.5, requires the basics of group theory and our colored graph now is color-μ-bounded. is not needed for the rest of the paper. We will prove Lemma IV.1 in three stages. The Following Luks [20], let Γr denote the class of those first and largest step is to generate a set A containing groups L which have a subgroup chain L = L0 ≥ L1 ≥ a positive fraction of each of the neighborhoods of ··· ≥ Lm = {1} (the “≤” sign placed between groups two vertices, starting from a set A of logarithmic means “subgroup”) such that for all i ≥ 1 we have size (Lemma IV.4). This is proved inductively; in the |Li−1 : Li|≤r. Equivalently, a group L belongs to inductive step (Lemma IV.5), we show that a subset of Γr if all composition factors of L are isomorphic to a neighborhood of a vertex will generate a significantly subgroups of the symmetric group Sr. larger subset of a neighborhood of a vertex after indi- vidualizing two additional vertices. Proposition III.5. If depth- stabilization (with respect To complete the proof of Lemma IV.1, we prove to any canonical refinement operator) of the graph G that once we have a set A of the sort guaranteed in results in a color-t-bounded graph then Aut(G) has a Lemma IV.4, we already have that A = V . First, we ≤ Γt-subgroup of index n . show in Lemma IV.7 that A already covers a constant Ideally we would wish the conclusion (a non- fraction of the space, and then, in Lemma IV.8, we algorithmic mathematical fact) to hold for some small complete the proof. Both steps follow by counting value of  + t. It is open whether it holds with the number of edges leaving the set we have already  + t ≤ n1/5− for all notrivial, non-graphic s. r. generated. graphs. But a slightly weaker parameter that might still Two preliminary estimates and some additional no- potentially permit the application of some version of the tation are required. For x ∈ V , we write N +(x)= group theory method is polylogarithmically bounded, N(x) ∪{x}.ForA ⊆ V and y ∈ V , consider the set see Theorem IX.1. A + y := N(A \ N +(y)) ∩ N(y). (4) IV. GENERATING THE GRAPH μ VERTICES AT A TIME Clearly, |A + y|≤μ|A|. Furthermore, we note A + y ⊆ In this section we prove Theorem III.3 and thereby A ∪{y}∩N(y). part (a) of Theorem I.6. For a ∈ V (G), let N(a) denote the set of neighbors Lemma IV.2. Let L ⊂ N(A). Then of a (so a/∈ N(a)). For A ⊆ V (G), let N(A)=  Ey(|(A + y) ∩ L|) ≥|L|(k − λ − 1)/n ∼|L|(μ/k). a∈A N(a). Given a set A of vertices, we define A = A ∪ G(A), where Proof: For each u ∈ L, designate a neighbor u ∈ A.Fory ∈ V , let ϑu(y) denote the indicator of the event G(A)={u ∈ V : ∃a, b ∈ A s. t. a = b, a ∼ b, and that y ∈ N(u)\N +(u). Since |N(u)∩N +(u)| = λ+1, ∈ ∩ } u N(a) N(b) . we have Ey(ϑu)=(k − λ − 1)/n.

161 Now, if u ∈ L and ϑu =1then u ∈ (A + y) ∩ L. Lemma IV.6. Suppose μ ≥ 8. Let A ⊂ N(x) and | ∩ |≥ E | ∩ | ≥ | | So (A + y) L u∈L ϑu and y( (A + y) L ) suppose A 1 − ) ≥ /(m − 1+). In particular, P (X> Let Q = {(a, v, b) ∈ R : v/∈ N +(x)}. We claim that 1/2) ≥ 1/(2m − 1). |Q|  (99/200)|A|k(μ − 3). (5) Lemma IV.4. There exists a pair of distinct vertices x, y ∈ V and a set A with |A| = O(log n) such that The lemma then follows. Indeed, if v is such that ∈ ∈ |A ∩ N(x)|≥k/100 and |A ∩ N(y)|≥k/100. (a, v, b) Q for some a, b, then clearly v N(A). Furthermore, such a vertex v can appear at most To prove the lemma, we will show that given a set μ2 A ⊂ N(x) with |A|≤k/100, we can grow A into |N(v) ∩ N(x) ∩ A|·|N(v) ∩ N(x) \ A|≤ a new set A ⊆ N(x) whose size exceeds |A| by a 4 constant factor, and which is generated by A and two times in Q. Therefore, additional vertices; moreover, we have many choices for |N(A)|≥4|Q|/μ2  (99/50)((μ − 3)/μ)(k/μ)|A| x. Here is the precise statement. ≥ (99/80)(k/μ)|A|. Lemma IV.5 (Growth lemma). Let x ∈ V and A ⊂ E | |  N(x). Suppose |A| (9/8)|A|)  = , (80μ/99) − 1+ 80μ − 90 (Lemma IV.5)): Fix adjacent vertices y0 and z0, and let A0 = {y0,z0}. When Ai, yi, and zi have been defined, proving the lemma. and |Ai ∩N(zi)|≤k/100, by Lemma IV.5, there exists We now prove inequality (5). Define Q = a pair (yi+1,zi+1) of vertices such that, setting Ai+1 = {(a, v, b) ∈ R : v/∈ N(x)} and W = {(a, v, b) ∈ Ai ∪{yi+1,zi+1},wehave R : a ∼ b and v = x}. Thus, Q ⊂ Q , and Q \ Q is the set of triples of the form (a, x, b) where a ∈ A and |Ai ∩ N(zi )|  (9/8)|Ai ∩ N(zi)|. +1 +1 b ∈ N(x) \ A. In particular, |Q | = |Q| + |A|(k −|A|). Thus, for some t = O(log n),wehaveasetAt of For every a ∈ A, there are ≥ k − λ −|A| vertices vertices such that |At|≤2t and |At ∩ N(zt)| > b ∈ N(x) \ A with a ∼ b. For every such pair (a, b), k/100. Furthermore, Lemma IV.5 ensures that we have there are μ−1 vertices v such that (a, v, b) ∈ W . Thus, Ω(n/μ)=ω(log n) choices for each zi. We repeat since λ = o(k),wehave|W |  |A|(k −|A|)(μ − 1). \ the construction to obtain sets Ai for i ≤ t where The set W Q is the collection of triples (a, v, b) of ∈ ∈ t = O(log n) using a sequence of pairs (yi,zi) where distinct vertices in N(x) such that a A, b/A, and { } { } ∼ ∼ ∼ the set z1,...,zt is disjoint from the set z1,...,zt . (a, v, b) induces a path (i. e., a v b and a b). { } ∈ We then have |At ∩ N(zt)|≥k/100 and zt = zt. So Let F be the set of edges a, b such that a A and ∈ \ ∈ \ A = At ∪ At has the desired property. b N(x) A. For any (a, v, b) W Q , exactly one In this section we only prove the Growth lemma of the edges {a, v} and {v, b} is in F . Fix {a, b}∈F (Lemma IV.5) for μ ≥ 8. with a ∈ A and b/∈ A, and let K = K(a, b)=N(a) ∩ The case of small μ presents an added technical N(b)∩N(x). There are ≤|N(a)∩N(x)\K| = λ−|K| difficulty because in those cases, the sets as given in vertices w such that (w, a, b) ∈ W \Q . Similarly, there the proof may not grow at all, or not at a constant rate. are ≤|N(b) ∩ N(x) \ K| = λ −|K| vertices w such The proof for μ ≤ 7 will use similar ideas as the proof that (a, b, w) ∈ W \ Q . Thus, below for larger μ except we shall need to use two |W \ Q |≤ 2(λ −|K(a, b)|) additional vertices y, z per round rather than just one to {a,b}∈F nudge our sets to grow. We defer the proof of the case μ ≤ 7 to the full version of the paper. On the other hand, Q \W contains all triples (a, v, b) For μ ≥ 8, the Growth lemma follows from such that {a, b}∈F with a ∈ A and v ∈ N(a) ∩

162 \ { }∈ 2 E | ∩ | N(b) N(x). Thus, for fixed a, b F , there are Ω(μn)=Ω(k ). Thus, y∈N(x)( N(y) A )=Ω(k), exactly |N(a) ∩ N(b) \ K| = λ −|K| vertices v such and since each vertex in N(x) has at most k neighbors that (a, v, b) ∈ Q \ W . Thus, in A, it follows by Fact IV.3 that at least Ω(k) neighbors of x each have at least Ω(k) neighbors in A. Since | \ |≥ −| | Q W (λ K(a, b) ) λ = o(k), for n sufficiently large, x has more than {a,b}∈F λ +2neighbors in B. so that |Q \ W |≥|W \ Q|/2. It follows that Finally, we complete the proof of Lemma IV.1 by using Lemmas IV.4, IV.7, and IV.8 in this order. |Q | = |Q ∩ W | + |Q \ W |≥|Q ∩ W | + |W \ Q |/2 ≥|W |/2  (1/2)|A|(k −|A|)(μ − 1) . V. C LIQUE GEOMETRY AND THE λ BOUND Spielman found the following remarkable lower Since |Q| = |Q |−|A|(k −|A|),wehave|Q|  bound on the density of the subgraph induced by the (1/2)|A|(k −|A|)(μ − 3). Inequality (5) now follows, set of common neighbors of a pair of adjacent vertices. since |A| 0. Then of nonadjacent vertices in N(u) N(v).  |A|  c2n. Using this bound we can prove that for large λ, ∩ Proof: Let X = A ∩ N(x) \ N(y) and Y = A ∩ almost all vertices in N(u) N(v) induce a clique. N(y) \ N(x). Without loss of generality, assume |Y |≤ Here is the exact statement. |X|.Wehave|N(x) ∩ N(y)|≤λ and therefore |Y |≥ Corollary V.2. If kμ = o(λ2) then we have the ck − λ ∼ ck. following structure. { ∈ ∈ ∼ ∈ Define Q = (a, b, v):a X, b Y,a b, v (i) For every pair { } of adjacent vertices there ∩ } ∈ ∈  u, v N(a) N(b) . Note that if (a, b, v) Q then v A. is a unique maximal clique C(u, v) of order ∼ λ Since no vertex in Y is adjacent to x, each vertex in such that u, v ∈ C(u, v). Let us call these cliques Y has at most μ neighbors in X. Therefore special. |Q|≥(|X|−μ)|Y |μ ∼ c2k2μ. (ii) Let D(u, v)=N(u) ∩ N(v) \ C(u, v). Then the vertices of D(u, v) each have at most μ ∈ ∈ ∈ ∩ If v N(x) and (a, b, v) Q then v N(b) N(x) neighbors in C(u, v), and hence have degree o(λ) ∈ ∈ ∩ and b/N(x), while a N(v) N(x). It follows that in the subgraph of G induced on N(u) ∩ N(v). ≤ | | there are μλ Y such triples (a, b, v). Similarly, there The special cliques are therefore easily identified, ≤ | | ∈ ∈ are μλ X triples (a, b, v) Q such that v N(y). since the vertices of C(u, v) have degree ∼ λ in | ∩ |≤ For every other vertex v,wehave N(v) X μ N(u) ∩ N(v). | ∩ |≤ ≤ 2 and N(v) Y μ, so there are μ pairs a, b such (iii) If x = y are vertices in C(u, v) then C(x, y)= ∈ that (a, b, v) Q. Thus, the number of distinct vertices C(u, v). In other words, two distinct special ∈ \ ∪ ∈ v V (X Y ) such that (a, b, v) Q for some a, b cliques have at most one vertex in common. is at least We defer the proof to [13]. |Q|−μλ|X|−μλ|Y | c2k2  ∼ c2n. This structure can be viewed as an approximate μ2 μ version of the well-studied class of “partial geometries” In particular, |A|  c2n. (where all lines must have equal length and the number of lines from x to  must always be the same). The Lemma IV.8. Let A ⊆ V is such that |A| =Ω(n). special cliques are the “lines.” They are of almost equal Then for n sufficiently large, A = V . length ∼ λ; every pair of adjacent vertices belongs to Proof: Let B be the collection of vertices with at a unique line; two lines intersect in at most one point. least λ +2 neighbors in A. Clearly B ⊂ A. Thus, it Furthermore, if a point x is not on a line  then there suffices to show that every vertex x/∈ A∪B has at least are at most μ lines connecting x to points of . λ+2 neighbors in A∪B for n sufficiently large. Indeed, We now use this structure to derive our bound on λ. if x/∈ A ∪ B, then |A \ N(x)|≥Ω(n − λ)=Ω(n). A hypergraph is a pair H =(V,E) where E⊆2V is Every vertex in A \ N(x) has μ neighbors in N(x),so the set of edges. Let n = |V |. The degree deg(v) of the the number of edges between N(x) and A \ N(x) is vertex v ∈ V is the number edges containing v. We say

163 that H is nearly d-regular if deg(v) ∼ d for all v ∈ V , Next, we give unique colors to a substantial fraction and H is nearly λ-uniform if |A|∼λ for all A ∈E. of all vertices. Lemma V.3. Let H =(V,E) be a nearly d-regular, Lemma VI.3. Suppose there is a stable set A with nearly λ-uniform hypergraph with n vertices, such that |A| =Ω(n) such that for every x ∈ A there are Ω(k/ν) | ∩ | every pair of edges shares√ at most one vertex. Assume disjoint stable sets C such that N(x) C =Ω(μ). d →∞with n. Then λ  n. Then for some t = O((ν/μ)log3 n), depth-t stabiliza- tion produces Ω(n) vertices with unique colors. Proof: Count the triples (v, A, B) where v ∈ V , A, B ∈E, A = B, and v ∈ A ∩ B in two ways. First, in Lemma VI.4, we show that if a pair of Proof of Theorem I.3 (c): Assume k =Ω(n2/3). vertices has Ω(k/ν) different colors appearing in the Note that μ ∼ k2/n =Ω(n1/3),sokμ =Ω(n).We symmetric difference of their neighborhoods, then with need to show that λ = O((kμ)1/2). Assume instead high probability they will receive different colors in the  that kμ = o(λ2). Let H be the hypergraph on V whose stable refinement after individualizing O(ν/μ) random edges are the “special cliques” C(u, v). Then H is vertices. nearly λ-uniform and nearly k/λ-regular. It follows by Lemma VI.4. For any δ>0 there is some M>0 such Lemma V.3 that λ2  n = O(kμ), contradicting the ∈ 2 that the following holds. Suppose x, y V are such assumption that kμ = o(λ ). that N(x)\N(y) intersects at least δk/ν color classes. 2  Individualize at least M(ν/μ)log n random vertices. VI. THE exp(O(1 + λ/μ)) BOUND Then for n sufficiently large, x and y get different colors In this section, we will outline a proof of Theo- in the stable refinement with probability at least 1/2. rem I.6 (c). An alternative proof will be outlined in Proof sketch: By taking intersections with the Section VII. neighborhoods of random vertices, we iteratively shrink In view of Prop. II.1, the theorem follows from the the color classes in  . With high probability, { } N(x) N(y) following lemma. We set ν =max λ, μ . some color class shrinks to a single vertex after O(ν/μ) Lemma VI.1. Assume ν = ω(log n). For some d = such intersections. O((ν/μ)log3 n), depth-d stabilization completely splits Sketch of proof of Lemma VI.3: The hypotheses G. of Lemma VI.4 are satisfied by pairs of nonadjacent vertices in the set A of Lemma VI.3, and so indi- The overall strategy of the proof of Lemma VI.1 is vidualizing some set of O(ν/μ) random vertices will similar to that of [15]. By individualizing vertices in suffice to partition the set A of Lemma VI.3 into color three stages, we gradually refine the coloring of G.In classes of size at most λ in the stable refinement. This the first stage, we ensure that for a positive fraction coloring of A will induce a relatively fine coloring of vertices x, the neighborhood N(x) intersects many of neighborhoods for those vertices which intersect A color classes. Here is a precise statement. We say that substantially. Hence, by repeating the argument, we can a subset T of a colored set is stable if T is a union of give different colors to every pair of vertices whose color-classes. neighborhoods have large intersection with A. 2 Lemma VI.2. For some t = O((ν/μ)log n), depth-t Lemma VI.5. Suppose Ω(n) vertices have unique col- | | stabilization yields a stable set A with A =Ω(n) such ors. Then for n sufficiently large, G is completely split ∈ that for every x A there are Ω(k/ν) disjoint stable in the stable refinement. sets C such that |N(x) ∩ C| =Ω(μ). The proof is similar to that of Lemma IV.8. Proof sketch: We show that with a single individ- Proof of Lemma VI.1: Apply Lemmas VI.2, VI.3, ualization, we can either increase the number of color and VI.5, in that order. classes by a factor of (1+Ω(μ/(ν polylog(n)))), or else The details of the proofs are left to the full paper. obtain a collection of Ω(k/ν) color classes of bounded size. By iterating this process O(ν/μ) times, we ensure VII. BIPARTITE STRUCTURE that the latter event obtains. The neighborhoods of these In this section we outline an alternative proof of bounded-size color classes will be the stable sets C in Theorem I.6 (c). As in [16] (as well as in [7], [9]), the statement of the lemma. The set of all vertices x ∈ V the overall strategy is to distinguish pairs of vertices. such that |N(x) ∩ C| =Ω(μ) is stable, and we show Given a s. r. graph, we achieve this by building a that the number of such x is Ω(n). family of bipartite structures that can be utilized in a

164 multi-stage analysis. When aiming at distinguishing two all vertices in V \ N +(u), and (2) the A-part of the vertices u = v, we simultaneously grow two sequences new bipartite system is smaller than the old one by a of bipartite structures, one for u and one for v, with factor of O(n−Ω(1)). Thus, a constant number of steps the assistance of some random vertices (seeds). At is sufficient to obtain the desired structures. each step, the bipartite structures have the following We show that our construction of bipartite systems properties: either their interactions with a small number is isomorphism-invariant, and that O(1 + λ/μ) random of random seeds can introduce the desired asymmetry seeds are sufficient to distinguish all pairs of vertices with high probability, or their interactions with “not with high probability. As a result, this gives an alter- too many” random seeds likely produce another pair of native algorithm for testing isomorphism of s. r. graphs bipartite structures with measurable progress. Note that that satisfy Neumaier’s claw bound with running time in order to perform multilevel analyses of non-trivial exp(O(1 + λ/μ)). For technical reasons, our analysis s. r. graphs whose diameter is 2 it is usually essential to works only for s. r. graphs of k = O(n1−) for some build substructures within a s. r. graph that can stretch. arbitrary constant >0. For larger k we refer to the For a vertex u, we focus on the induced bipartite bound (2) [7]. We defer the details to the full paper. \ + subgraph between N(u) and V N (u). Specifically, VIII. THE COMBINATORIAL 9/37 BOUND we focus on a family of bipartite systems, each con- We now outline the proof of inequality (3). We state sisting of a sequence of induced bipartite subgraphs the algorithmic result which implies (3). ((A1,B1),...,(Aγ ,Bγ )), where the Ai are disjoint subsets of N(u) and Bi ⊆ V \ N +(u). For our Theorem VIII.1. If G and H are non-trivial, non- construction and analysis, in addition to demanding that graphic s. r. graphs then Iso(G, H) can be listed in time  / every (Ai,Bi) be dense enough, we further require that exp(O(n9 37)). (i) the sizes of all the Ai be within a factor of 2 of each  This section is not required for our main algorithmic other, (ii) all degrees involved by vertices in Bi be i result. within a factor of 2 of each other, (iii) the numbers of  The proof follows from a combination of results Bi to which each vertex in Bi belongs be within a i stated in the Introduction and the following lemma that constant factor of each other. gives a strong combinatorial bound for small k. With these strong “regularity” conditions, we have the following property: Assume ((Ai,Bi)) and ((Ai,Bi)) Lemma VIII.2. Assume G satifies the claw bound. ≤ 7/13 are a pair of bipartite structures built for u and v, respec- Suppose k n / log n. Then for some d = 17/3 8/3 5 tively. If |Ai∩Ai| = o(|Ai|) and |Ai| (= |Ai|) is smaller O((k /n )log n), depth-d stabilization com- | ∩ | | | pletely splits G. than k/max( λ, μ), then we have Bi Bi = o( Bi ). | | Thus, if i Bi is very close to n, then the interaction Proof of Theorem VIII.1 from the Lemma: As of a small number of random seeds w with i Bi and in the derivation of Theorem I.1 near the end of Sec- i Bi is likely to produce the asymmetry that we aim tion I-F, we quickly reduce to the case when G satisfies ∈ ∈ for: for some i, w Bi but w/Bi. Neumaier’s claw bound. We then use Lemma VIII.2 Our initial bipartite systems for u, v are simply (as per Prop. II.1) for the case when k ≤ n19/37, then ((N(u),V \ N +(u)) and ((N(v),V \ N +(v)), respec- Theorem I.6 (b)uptok ≤ n/ log n, and finally (2) for tively, which do not meet the size condition above since k ≥ n/ log n (see Remark I.7). The bottleneck for the |N(u)| = |N(v)| = k is greater than k/ max(λ, μ). automorphism bound arises at k ∼ n19/37. To make progress, we draw a small number of random Outline of the proof of Lemma VIII.2: The Lemma seeds and use the following process to partition a follows the high-level strategy of [7], [9], and [16]. We bipartite system ((Ai,Bi)) to ((Aj,Bj)). For a seed show that, for each pair of vertices u, v ∈ V ,ifd z,ifz ∈ Bi for some i, then we extract a new (as given in the statement of Lemma VIII.2) vertices induced bipartite graph (A,B), where A contains all are sampled uniformly at random and individualized, the vertices of Ai that are also neighbors of z, and then with high probability, u and v will receive distinct B contains all neighbors of A in V \ N +(u).We colors after three steps of refinement. Lemma VIII.2 collect all such new induced bipartite graphs and then, then follows by a union bound. “clean up” the bipartite graphs to make the new bipartite To this end, we show that, with high probability, there system satisfy the desired “regularity” conditions. Our are three individualized vertices w1,w2 and w3 such that two goals are to ensure that (1) the union of the B- the following event holds for u but does not hold for part of the new bipartite system still contains almost v: There exist two vertices p, q such that w2 and w3

165 are both neighbors of q; w1 and q are both neighbors [4] L. Babai, W. M. Kantor, and E. M. Luks, “Computa- of p; p is a neighbor of u (but no such pair of vertices tional complexity and the classification of finite simple exists for v). It is then clear that, after three steps of groups,” In: 24th FOCS, pp. 162–171, 1983. refinement, u and v receive distinct colors. [5] V. N. Zemlyachenko, N. M. Korneenko, and R. I. We defer the details of the proof to [17]. Tyshkevich, “Graph isomorphism problem,” Zapiski Nauchn. Semin. LOMI, vol. 118, pp. 83–158, 215, 1982. IX. OPEN PROBLEMS [6] R. C. Read and D. G. Corneil, “The graph isomorphism The main open problem in the area is to reduce the disease,” J. Graph Th., vol. 1, no. 4, pp. 339–363, 1977. complexity of GI to exp(n0.49). The target for isomor- phism of s. r. graphs is subexponential (exp(no(1)))or [7] L. Babai, “On the complexity of canonical labeling of even quasipolynomial (exp((log n)O(1))). This may not strongly regular graphs,” SIAM J. Comput., vol. 9, no. 1, pp. 212–216, 1980. be entirely out of reach using the group theory method in the light of the following new result. [8] ——, “On the order of uniprimitive permutation groups,” Annals of Math., vol. 113, no. 3, pp. 553–568, 1981. Theorem IX.1 ([21]). Suppose the alternating group At is involved in Aut(G) (as a quotient of a subgroup) for [9] D. A. Spielman, “Faster isomorphism testing of strongly a nontrivial, non-graphic s. r. graph G with n vertices. regular graphs,” In: 28th STOC, pp. 576–584, 1996. 2 Then t = O((log n) / log log n). [10] J. J. Seidel, “Strongly regular graphs with (−1, 1, 0)- adjacency matrix having eigenvalue 3,” Linear Algebra This implies that any primitive permutation group Appl., vol. 1, pp. 281–298, 1968. involved in Aut(G) has quasipolynomially bounded order, a key condition for the quasipolynomial-time [11] A. Neumaier, “Strongly regular graphs with smallest −m efficiency of Luks’s divide-and-conquer. However, the eigenvalue ,” Arch. Math., vol. 33, no. 4, pp. 392– 400, 1979. obstacles to applying Luks’s method are still consid- 2 1 1 erable, given that s. r. graphs lack an evident recursive [12] ——, “Quasiresidual -designs, 2 -designs, and structure (substructures that satisfy the same constraint strongly regular multigraphs,” Geom. Dedicata, vol. 12, on their automorphism groups). no. 4, pp. 351–366, 1982. A major open problem, related to an obstruction to [13] L. Babai and J. Wilmes, “Asymptotic Delsarte cliques in the I/R method without group theory, is a subexponen- strongly regular graphs,” in preparation. tial bound on | Aut(G)| (see the bound (3) and the [14] G. L. Miller, “On the nlog n isomorphism technique: A paragraph preceding it in Sec. I-E). preliminary report,” In: 10th STOC, pp. 51–58, 1978. Among the open cases of isomorphism of s. r. graphs we should highlight the line-graphs of partial geome- [15] L. Babai and J. Wilmes, “Quasipolynomial-time canon-  / tries; nothing better than our exp(O(n1 5)) is known ical form for Steiner designs,” In: 45th STOC, pp. 261– 270, 2013. for them in spite of their geometric structure. [16] X. Chen, X. Sun, and S.-H. Teng, “Multi-stage design ACKNOWLEDGMENTS for quasipolynomial-time isomorphism testing of Steiner Partial support by NSF grants is acknowledged as 2-systems,” In: 45th STOC, pp. 271–280, 2013. follows: L. Babai CCF-1017781, X. Chen and X. Sun [17] ——, “A new bound on the order of the automorphism CCF-1149257, S-H. Teng 1111270 and 0964481, J. group of strongly regular graphs,” in preparation. Wilmes DGE-1144082. X. Chen also acknowledges a Sloan Fellowship. [18] G. L. Miller, “Isomorphism of graphs which are pairwise k-separable,” Information and Control, vol. 56, no. 1-2, REFERENCES pp. 21–33, 1983. [1] O. Goldreich, S. Micali, and A. Wigderson, “Proofs that [19] B. Weisfeiler, Ed., On construction and identification of yield nothing but their validity or all languages in NP graphs, ser. Lecture Notes in Mathematics. Springer- have zero-knowledge proof system,” J. ACM, vol. 38, Verlag, 1976, vol. 558. no. 1, pp. 691–729, 1991. [20] E. M. Luks, “Isomorphism of graphs of bounded valence [2] L. Babai and S. Moran, “Arthur-Merlin games: a ran- can be tested in polynomial time,” J. Comp. Syst. Sci., domized proof system, and a hierarchy of complexity vol. 25, no. 1, pp. 42–65, 1982. classes,” J. Comp. Syst. Sci., vol. 36, pp. 254–276, 1988. [21] L. Babai, “Alternating groups involved in the auto- [3] L. Babai and E. M. Luks, “Canonical labeling of graphs,” morphism groups of strongly regular graphs,” 2013, in In: 15th STOC, pp. 171–183, 1983. preparation.

166