The Power of Algorithmic Approaches to the Graph Isomorphism Problem

Dissertation approved by the Faculty of Mathematics, Computer Science and Natural Sciences of RWTH Aachen University for the award of the academic degree of Doctor of Natural Sciences

submitted by

Daniel Neuen, Master of Science, from Dormagen

Reviewers: Prof. Dr. Martin Grohe, Prof. Dr. Pascal Schweitzer, Prof. Dr. László Babai

Date of the oral examination: 17 December 2019

This dissertation is available online on the website of the university library.

Abstract

The Graph Isomorphism Problem asks, given two input graphs, whether they are structurally the same, that is, whether there is a renaming of the vertices of the first graph that transforms it into the second graph. By a recent breakthrough result of Babai (STOC 2016), this problem can be solved in quasipolynomial time. However, despite extensive research efforts, it remains one of only a few natural problems in NP that are neither known to be solvable in polynomial time nor known to be NP-complete.

Over the past five decades, several powerful techniques tackling the Graph Isomorphism Problem have been investigated, uncovering various surprising links between different approaches. Also, the situation has led to a number of algorithms solving the isomorphism problem on restricted classes of input graphs. In this thesis, we continue the investigation of various standard approaches to the Graph Isomorphism Problem to further broaden our understanding of the power and limits of such approaches. In particular, this leads to several improved algorithms solving the isomorphism problem for important restricted classes of graphs.

One of the most fundamental methods in the context of graph isomorphism testing is the Weisfeiler-Leman algorithm, which iteratively computes an isomorphism-invariant coloring of vertex tuples. While the algorithm is unable to decide the isomorphism problem itself, it is commonly used as a subroutine and, for various restricted graph classes, it already serves as a complete isomorphism test. In the latter direction we prove, for example, that the Weisfeiler-Leman dimension of graph classes of bounded rank-width is bounded. While it was already known that the isomorphism problem for graphs of rank-width at most k is polynomial-time solvable (Grohe and Schweitzer, FOCS 2015), the previous best algorithm is complicated and the exponent of its running time depends non-elementarily on k. In contrast, our analysis of the Weisfeiler-Leman algorithm yields a simple isomorphism test running in time $n^{O(k)}$.

A framework that is closely related to the Weisfeiler-Leman algorithm and works particularly well in practice is the Individualization-Refinement paradigm. Extending our understanding of the limits of combinatorial approaches, we provide the first exponential lower bounds on the worst-case complexity of a large and natural class of algorithms within this framework. In particular, this class includes all practical state-of-the-art isomorphism tools, answering an open question of Babai (STOC 2016) on the worst-case complexity of such solvers.

A second crucial approach to the Graph Isomorphism Problem is based on group-theoretic techniques. In this direction, one of the algorithmic cornerstones is Luks’s polynomial-time algorithm for testing isomorphism of bounded degree graphs (JCSS 1982). Adapting the novel group-theoretic methods developed by Babai for his quasipolynomial-time isomorphism test (STOC 2016), we give an isomorphism test for graphs of maximum degree d running in time $n^{\mathrm{polylog}(d)}$. This significantly improves over the previous best isomorphism test for graphs of maximum degree d, which runs in time $n^{O(d/\log d)}$ (Babai, Kantor and Luks, FOCS 1983).

With Luks’s algorithm being used as a subroutine in a number of other algorithms, it is natural to ask for the consequences of this improvement. Besides simple applications regarding structures of small degree, we present an isomorphism test for graphs of tree-width k running in time $2^{k\,\mathrm{polylog}(k)}\,\mathrm{poly}(n)$, improving the fixed-parameter tractable algorithm of Lokshtanov et al. (FOCS 2014) running in time $2^{O(k^5 \log k)}\,\mathrm{poly}(n)$.

Zusammenfassung (Summary)

The Graph Isomorphism Problem asks, given two graphs, whether they are structurally identical, i.e., whether the first graph can be transformed into the second graph by renaming its vertices. By the recent breakthrough of Babai (STOC 2016), this problem can be solved in quasipolynomial time. Nevertheless, despite intensive research, it remains one of only a few natural problems in NP for which neither a polynomial-time algorithm nor NP-hardness is known.

Over the past five decades, a wide variety of algorithmic techniques for the Graph Isomorphism Problem have been explored, which has revealed surprising connections between different approaches. This has also led to numerous algorithms that solve the isomorphism problem on restricted classes of input graphs. This dissertation aims to extend our understanding of the strengths and limits of several important approaches to the Graph Isomorphism Problem. In particular, this results in several improved algorithms that solve the isomorphism problem for important classes of graphs.

One of the most fundamental methods for isomorphism testing is the Weisfeiler-Leman algorithm, which iteratively computes an isomorphism-invariant coloring of the vertex tuples. Although this algorithm cannot solve the isomorphism problem on its own, it is regularly used as a subroutine and serves as a complete isomorphism test for several restricted classes of graphs. In this context we show that the Weisfeiler-Leman dimension of graphs of bounded rank-width is bounded. While a polynomial-time algorithm for the isomorphism problem on graphs of rank-width at most k was already known (Grohe and Schweitzer, FOCS 2015), previous algorithms are complicated and the exponent of the running time depends non-elementarily on k. In contrast, our analysis of the Weisfeiler-Leman algorithm yields a simple isomorphism test with running time $n^{O(k)}$.

An approach that is closely related to the Weisfeiler-Leman algorithm and works particularly well in practice is the paradigm of individualization and refinement. To extend our understanding of the limits of combinatorial approaches, we give the first exponential lower bounds on the running time of a large and natural class of algorithms within this paradigm. In particular, we obtain exponential lower bounds on the running time of all modern practical isomorphism tools and thereby answer an open question of Babai (STOC 2016).

A second fundamental approach to the Graph Isomorphism Problem is the use of group-theoretic methods. In this respect, Luks’s algorithm for testing isomorphism of graphs of bounded degree is one of the cornerstones of the algorithmic theory of the Graph Isomorphism Problem. By adapting the group-theoretic methods that Babai developed for his quasipolynomial-time algorithm, we obtain an isomorphism test for graphs of maximum degree at most d with running time $n^{\mathrm{polylog}(d)}$. This is a significant improvement over the previously fastest algorithm for this problem, which requires $n^{O(d/\log d)}$ steps (Babai, Kantor and Luks, FOCS 1983).

Since Luks’s algorithm serves as a subroutine for a number of further algorithms, the question arises which consequences follow from the above improvement. Besides simple applications to the isomorphism problem for relational structures of small degree, we show that isomorphism of graphs of tree-width at most k can be tested in time $2^{k\,\mathrm{polylog}(k)}\,\mathrm{poly}(n)$, thereby improving the FPT algorithm of Lokshtanov et al. (FOCS 2014) with running time $2^{O(k^5 \log k)}\,\mathrm{poly}(n)$.

Acknowledgments

First and foremost, I am grateful to my supervisor Martin Grohe for his guidance and support during the time I spent in his group researching for this thesis. Also, I want to thank Pascal Schweitzer for his continued support and for always having an open door while he was still at RWTH Aachen University. Moreover, I would like to thank my colleagues Sandra Kiefer and Daniel Wiebking for our collaborations and for proofreading parts of this thesis. I am particularly thankful to Sandra Kiefer for sharing an office all these years and making the time at our group more joyful. Finally, I am grateful to my family, and in particular my parents, for their endless support throughout my time as an undergraduate and graduate student.

Contents

1 Introduction

2 Isomorphism and Combinatorial Algorithms
  2.1 Graphs and Isomorphism
    2.1.1 Graphs and Notation
    2.1.2 Tree Decompositions and Tree-Width
    2.1.3 Isomorphisms
    2.1.4 Isomorphism Invariance and Canonization
  2.2 The Weisfeiler-Leman Algorithm
    2.2.1 The Algorithm
    2.2.2 Pebble Games
    2.2.3 Connection to Logic
  2.3 Individualization-Refinement
    2.3.1 The Basic Paradigm
    2.3.2 Pruning with Invariants
    2.3.3 Pruning with Automorphisms

3 Upper Bounds on the WL Dimension
  3.1 Tree-Width
  3.2 Rank-Width
    3.2.1 Definition and Properties
    3.2.2 Split Pairs and Flip Functions
    3.2.3 A Recursive Strategy for Spoiler
  3.3 Further Results

4 Lower Bounds
  4.1 Weisfeiler-Leman Algorithm
  4.2 The I/R-Method in Theory
    4.2.1 A Framework for a Lower Bound
    4.2.2 The Multipede Construction
    4.2.3 The Weisfeiler-Leman Refinement and Closure Operators
    4.2.4 Meager Graphs
    4.2.5 Lower Bounds for I/R-Algorithms
  4.3 The I/R-Method in Practice

5 Group Theory
  5.1 Permutation Groups
    5.1.1 Basics
    5.1.2 Algorithms for Permutation Groups
    5.1.3 Groups with Restricted Composition Factors
  5.2 String Isomorphism
    5.2.1 Graphs of Bounded Color Class Size
    5.2.2 String Isomorphism Problem
    5.2.3 Recursion Mechanisms
  5.3 Bounded Degree Graphs and Group Theory
  5.4 Primitive Groups
    5.4.1 The O’Nan Scott Theorem
    5.4.2 Affine Groups
    5.4.3 Non-Affine Groups
    5.4.4 A Characterization Theorem
  5.5 String Isomorphism in Quasipolynomial Time

6 Isomorphism for Bounded Degree Graphs
  6.1 Structure Trees
    6.1.1 Sequences of Partitions and Structure Trees
    6.1.2 Structure Graphs and Tree Unfoldings
    6.1.3 Normalizing the Action
  6.2 Affected Orbits
  6.3 Recursion
  6.4 Local Certificates
    6.4.1 The Algorithm
    6.4.2 Comparing Local Certificates
    6.4.3 Aggregating Local Certificates
  6.5 String Isomorphism
  6.6 Applications
    6.6.1 Isomorphism for Structures of Bounded Degree
    6.6.2 Coset-Labeled Hypergraphs

7 Isomorphism for Bounded Tree-Width Graphs
  7.1 Isomorphism-Invariant Decompositions
    7.1.1 Idea
    7.1.2 Clique Separators
    7.1.3 Decomposition of Basic Graphs
  7.2 Isomorphism Testing using Dynamic Programming

8 Discussion

Chapter 1

Introduction

Two graphs G and H are called isomorphic if they are structurally the same, i.e., if there is a bijection ϕ: V(G) → V(H) that preserves the edge relation, meaning that vw ∈ E(G) if and only if ϕ(v)ϕ(w) ∈ E(H) for all vertices v, w ∈ V(G). The Graph Isomorphism Problem asks, given two input graphs G and H, to decide whether G is isomorphic to H. This problem is one of only a few natural problems that are contained in the complexity class NP and are neither known to be contained in PTIME nor known to be NP-complete. Indeed, together with the Factorization Problem, the Graph Isomorphism Problem is one of the most prominent examples of such a computational problem. While the Factorization Problem is generally believed to be difficult to solve, with important applications in cryptography, the status of the Graph Isomorphism Problem is still wide open despite extensive efforts to solve the problem over the past four decades (see [4, 95] for surveys). In fact, as early as 1977, the phenomenon of extensively researching the problem was referred to as the Graph Isomorphism Disease in a survey article by Read and Corneil [133], and interest in the problem has not declined in the following decades, underlining its significance in theoretical computer science. In 2015, the interest in the Graph Isomorphism Problem was further highlighted by a Dagstuhl Seminar solely devoted to this problem [16].

Early on in the study of the Graph Isomorphism Problem there was already evidence suggesting that the problem may not be NP-hard. For example, Babai [6] and Mathon [110] independently proved that the counting version of the problem is equivalent to its decision version, a property that is not known to hold for any NP-complete problem (cf. [144]). Also, the Graph Isomorphism Problem is contained in the complexity class co-AM [59, 60], which implies that the problem is not NP-complete unless the polynomial hierarchy collapses to its second level [29].
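To make the definition of isomorphism from the beginning of this chapter concrete, the following minimal sketch tests it directly by trying every bijection ϕ. It is an illustration only and runs in time exponential in the number of vertices; the function name `are_isomorphic` and the input convention (simple undirected graphs on the vertices 0, ..., n-1, given as edge lists) are assumptions made for this example and do not appear elsewhere in the thesis.

```python
from itertools import permutations


def are_isomorphic(n, edges_g, edges_h):
    """Brute-force isomorphism test for simple undirected graphs on 0..n-1.

    Hypothetical helper for illustration: tries all n! bijections phi and
    checks whether phi maps the edge set of G exactly onto the edge set of H.
    """
    eg = {frozenset(e) for e in edges_g}
    eh = {frozenset(e) for e in edges_h}
    if len(eg) != len(eh):
        return False
    for phi in permutations(range(n)):
        # Image of E(G) under the candidate bijection phi.
        image = {frozenset(phi[v] for v in e) for e in eg}
        if image == eh:
            return True
    return False


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    relabeled_path = [(2, 0), (0, 3), (3, 1)]
    star = [(0, 1), (0, 2), (0, 3)]
    print(are_isomorphic(4, path, relabeled_path))
    print(are_isomorphic(4, path, star))
```

The two example calls compare a path on four vertices with a relabeled copy of itself and with a star on four vertices.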
The most striking evidence that the problem may not be NP-hard is provided by Babai’s recent breakthrough result [11], which gives a quasipolynomial-time algorithm solving the Graph Isomorphism Problem (i.e., the problem can be solved in time $n^{O((\log n)^c)}$ for some absolute constant c). This significantly improves on the previous best isomorphism test running in time $2^{O(\sqrt{n \log n})}$ [19] and implies that the Graph Isomorphism Problem is not NP-complete unless every problem in NP can be solved in quasipolynomial time (which would, for example, refute the Exponential Time Hypothesis [83]). However, despite the major progress made by Babai’s result, the question whether graph isomorphism is in PTIME remains wide open.

With a solution to the general problem seemingly out of reach, a lot of attention has been devoted to investigating the complexity of isomorphism testing for restricted classes of input graphs. One of the first important results in this direction was obtained by Hopcroft and Tarjan, who proved that the isomorphism problem for planar graphs can be solved in quasilinear time [78, 79, 80], which was later improved to linear time by Hopcroft and Wong [81]. Further examples include polynomial-time algorithms for all graph classes of bounded tree-width [27] and graph classes of bounded Euler genus [52, 116]. More generally, Ponomarenko proved that every graph class that excludes some fixed graph as a minor admits a polynomial-time isomorphism test [131].

In 1979, Babai proved that the isomorphism problem for graphs of bounded color class size can be solved in polynomial time, employing for the first time algorithmic methods related to group theory. The use of group-theoretic techniques was further extended by Luks in his seminal paper [106], which gives the first polynomial-time isomorphism test for graphs of bounded degree. In combination with a combinatorial trick due to Zemlyachenko [151], this also led to a $2^{O(\sqrt{n \log n})}$-time isomorphism test for general graphs [19], which remained the best-known algorithm for the general problem for over three decades. Moreover, the methods developed by Luks also form the basis for Babai’s recent quasipolynomial-time algorithm [11].

One of the most general results concerning the tractability of the isomorphism problem for restricted graph classes was given by Grohe and Marx, who presented polynomial-time isomorphism tests for all graph classes excluding a fixed graph as a topological subgraph [67]. In particular, this includes all graph classes excluding some minor and all graph classes of bounded degree. One may note that, up to this point, all graph classes considered contain only sparse graphs. Graph classes that admit polynomial-time isomorphism tests and also contain dense graphs include interval graphs [36], unit square graphs [122] and, maybe most notably, graph classes of bounded rank-width [73]. Finally, as a last example, the isomorphism problem for graphs with bounded eigenvalue multiplicity can also be solved in polynomial time [18, 56].
On the other hand, for graph classes such as bipartite graphs, chordal graphs, or k-degenerate graphs, it is known that the isomorphism problem is as difficult as the isomorphism problem for the class of all graphs (see, e.g., [28]). In this case the isomorphism problem for such a graph class is called GI-complete. Indeed, for most natural graph classes it is known that the isomorphism problem is either solvable in polynomial time or GI-complete. However, assuming the Graph Isomorphism Problem is not contained in PTIME, there also exist graph classes for which the isomorphism problem is neither solvable in polynomial time nor GI-complete [127].

Besides classifying graph classes into tractable and non-tractable classes with respect to the isomorphism problem and trying to optimize running times for the tractable cases, another line of research is to investigate the complexity with respect to different cost measures such as space complexity. In this direction, Datta et al. proved that the isomorphism problem for planar graphs can be solved in LOGSPACE [42], which was later generalized to graph classes of bounded genus by Elberfeld and Kawarabayashi [47]. Moreover, the same holds for all graph classes of bounded tree-width [48]. Since the isomorphism problem for trees is already hard for the complexity class LOGSPACE (under many-one AC0-reductions) [84], the isomorphism problem for the aforementioned classes is actually LOGSPACE-complete.

When analyzing the algorithms for isomorphism testing mentioned above, it is notable that similar subroutines are used in many of the algorithms. Indeed, two fundamental algorithmic approaches to the Graph Isomorphism Problem that are exploited in a number of algorithms tackling the problem are combinatorial partition-refinement techniques and group-theoretic approaches.
In the context of partition-refinement techniques, one of the most important algorithms is the Weisfeiler-Leman algorithm (see, e.g., [82, 148, 149]), which is a heuristic algorithm trying to distinguish between graphs based on certain combinatorial properties. For group-theoretic techniques tackling the isomorphism problem, an important algorithmic milestone is Luks’s algorithm [106] solving the isomorphism problem for graphs of bounded degree in polynomial time. Both approaches have been extensively studied in the past decades, leading to a variety of results also connecting the Graph Isomorphism Problem to other areas of computer science. Also, in combination, these two approaches lay the foundation of Babai’s quasipolynomial-time isomorphism test, which builds on novel group-theoretic subroutines as well as insights on the power of combinatorial partition-refinement methods such as the Weisfeiler-Leman algorithm.

The main purpose of this thesis is to further expand our understanding of the power and the limits of these fundamental approaches. This leads to various improved algorithms for testing isomorphism on important classes of graphs.

Weisfeiler-Leman Algorithm and Related Combinatorial Approaches

One of the most fundamental subroutines in the context of the Graph Isomorphism Problem is the Weisfeiler-Leman algorithm. For every k ≥ 1 there is a k-dimensional variant of the algorithm that colors, for a given graph G, the k-tuples of vertices of G and iteratively refines the coloring in an isomorphism-invariant way. Originally, the algorithm was introduced only for dimension two by Weisfeiler and Leman [149]. The k-ary version, k ≥ 1, was introduced by Babai and Mathon [7] and independently by Immerman and Lander [82].

Already the 1-dimensional Weisfeiler-Leman algorithm, also referred to as the Color Refinement algorithm, which iteratively refines an initially uniform coloring by counting for each vertex the number of neighbors of a certain color, is a quite powerful tool in the context of the Graph Isomorphism Problem and is used as a subroutine in a number of algorithms (see, e.g., [14, 21, 23, 112, 113, 142]). In particular, the Color Refinement algorithm already manages to solve the isomorphism problem for random graphs asymptotically almost surely [17]. Whereas the Color Refinement algorithm fails to decide isomorphism of regular graphs, the 2-dimensional Weisfeiler-Leman algorithm can be used to solve the problem for random regular graphs asymptotically almost surely [98]. Also, the 2-dimensional Weisfeiler-Leman algorithm is closely tied to coherent configurations, which are studied, for example, in algebraic combinatorics (see, e.g., [34]). For higher dimensions, the algorithm is prominently applied, for example, in Babai’s quasipolynomial-time isomorphism test for dimension k = O(log n) (where n denotes the number of vertices of the input graphs).

There are several characterizations of the Weisfeiler-Leman algorithm that connect it to other areas of theoretical computer science.
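As a concrete illustration of the Color Refinement algorithm described above, the following is a minimal, unoptimized sketch. The names `color_refinement` and `distinguishes` and the adjacency-dictionary input format are assumptions made for this example. To compare two graphs, the sketch refines their disjoint union and compares the resulting color histograms.

```python
from collections import Counter


def color_refinement(adj):
    """Return the stable coloring of a graph given as {vertex: set_of_neighbors}.

    Starts from a uniform coloring and repeatedly recolors every vertex by its
    old color together with the multiset of colors of its neighbors.
    """
    colors = {v: 0 for v in adj}
    while True:
        signature = {
            v: (colors[v], tuple(sorted(Counter(colors[u] for u in adj[v]).items())))
            for v in adj
        }
        # Rename signatures to 0, 1, 2, ... in sorted order (vertex-name independent).
        palette = {s: i for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        if len(palette) == len(set(colors.values())):
            return refined  # no color class was split: the coloring is stable
        colors = refined


def distinguishes(adj_g, adj_h):
    """True if Color Refinement tells G and H apart (hypothetical helper)."""
    union = {("G", v): {("G", w) for w in nb} for v, nb in adj_g.items()}
    union.update({("H", v): {("H", w) for w in nb} for v, nb in adj_h.items()})
    colors = color_refinement(union)
    hist_g = Counter(c for (side, _), c in colors.items() if side == "G")
    hist_h = Counter(c for (side, _), c in colors.items() if side == "H")
    return hist_g != hist_h


if __name__ == "__main__":
    cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(distinguishes(cycle6, triangles))
```

The example pair, a 6-cycle and two disjoint triangles, consists of two non-isomorphic 2-regular graphs, which is exactly the kind of input on which Color Refinement is uninformative.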
First, the expressive power of the Weisfeiler-Leman algorithm can be characterized in terms of bounded-variable fragments of first-order logic with counting [82], which connects the algorithm to finite model theory and descriptive complexity theory (see, e.g., [64]). Also, this connection can be used to capture the power of the Weisfeiler-Leman algorithm in terms of certain pebble games [76]. More recently, it has been observed that the power of the algorithm also corresponds to Sherali-Adams relaxations of the natural linear programs for the Graph Isomorphism Problem [5, 71]. This result inspired work also relating the Weisfeiler-Leman algorithm to semi-definite programming [126, 140] and algebraic approaches (e.g., Gröbner bases) [25, 26, 72]. Furthermore, the power of the Weisfeiler-Leman algorithm can be characterized by certain homomorphism counts from graphs of bounded tree-width [44]. Finally, in recent years, the Weisfeiler-Leman algorithm has also been exploited in a machine learning context for graph classification problems [139] (see also [97, 125]). In this direction, the power of the Color Refinement algorithm corresponds to that of certain graph neural networks [120].

A common way to investigate the strength of the Weisfeiler-Leman algorithm is to determine the dimension of the algorithm that is required to obtain a complete isomorphism test for certain graphs G. In this sense, following Grohe [64], the Weisfeiler-Leman dimension of a graph G is the minimal number k such that the k-dimensional Weisfeiler-Leman algorithm identifies G. By the celebrated seminal paper of Cai, Fürer and Immerman [31], there is no fixed dimension for which the Weisfeiler-Leman algorithm solves the Graph Isomorphism Problem on its own. Indeed, there are non-isomorphic n-vertex graphs G and H which the k-dimensional Weisfeiler-Leman algorithm cannot distinguish unless k = Ω(n). However, when focusing on a particular class of graphs, it is often the case that the Weisfeiler-Leman algorithm serves as a complete isomorphism test for some fixed dimension k. Since the Weisfeiler-Leman algorithm can be implemented in polynomial time for every fixed dimension k (see, e.g., [82]), this immediately gives a polynomial-time isomorphism test for the graph class in question. Indeed, for several of the classes admitting polynomial-time isomorphism tests mentioned above, the Weisfeiler-Leman algorithm provides a complete isomorphism test, giving a unifying method to tackle the isomorphism problem for these classes. For example, this includes planar graphs [90], graphs of bounded tree-width [66], graphs of bounded genus [62, 65], and more generally, all classes that exclude a fixed graph as a minor [63]. Also, the class of interval graphs has finite Weisfeiler-Leman dimension [51].

In this thesis we investigate the Weisfeiler-Leman dimension of graphs of bounded tree-width and rank-width. For the case of tree-width, Grohe and Mariño [66] proved that the Weisfeiler-Leman dimension of the class of graphs of tree-width at most k is upper-bounded by k + 2. In this thesis we present a more careful implementation of their general strategy, resulting in an improved upper bound of k and confirming a conjecture of Grohe [64].

More importantly, we extend the high-level strategy for bounding the Weisfeiler-Leman dimension of graphs of bounded tree-width to graphs of bounded rank-width. Originally introduced by Oum and Seymour [130], rank-width is another graph parameter measuring the width of a certain style of hierarchical decomposition. However, in contrast to tree-width, which measures the complexity of a separation in the hierarchical decomposition in terms of connectivity, rank-width measures this complexity in terms of the rank of the adjacency matrix of the edges between the two sides of the separation.
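The rank of the adjacency matrix between the two sides of a separation can be computed as in the following small sketch. It assumes, following the standard definition of Oum and Seymour that is not spelled out in this introduction, that the rank is taken over the two-element field GF(2); the function name `cut_rank` and the adjacency-dictionary input format are hypothetical choices for this example.

```python
def cut_rank(adj, side):
    """GF(2) rank of the adjacency matrix between `side` and the remaining vertices.

    adj:  graph as {vertex: set_of_neighbors}
    side: iterable of vertices forming one side of the separation
    """
    side = set(side)
    rest = [v for v in adj if v not in side]
    # Encode each row (one per vertex in `side`) as an integer bitmask over `rest`.
    rows = []
    for v in side:
        mask = 0
        for j, w in enumerate(rest):
            if w in adj[v]:
                mask |= 1 << j
        rows.append(mask)
    # Incrementally build an XOR basis; its size is the rank over GF(2).
    basis = []
    for row in rows:
        for b in basis:
            row = min(row, row ^ b)  # clears the leading bit of b if it is set in row
        if row:
            basis.append(row)
    return len(basis)


if __name__ == "__main__":
    k4 = {v: {w for w in range(4) if w != v} for v in range(4)}
    c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
    print(cut_rank(k4, {0, 1}), cut_rank(c4, {0, 1}))
```

For a complete graph every such matrix is an all-ones matrix, so its rank is at most 1 regardless of how many edges cross the separation; this is why dense graphs can have small rank-width.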
Measuring rank rather than connectivity makes rank-width almost closed under complementation and also allows dense graphs to have small rank-width (in particular, every complete graph has rank-width 1). Rank-width is closely related to clique-width, another graph parameter measuring the structural difficulty of a graph. For every graph G it holds that $\mathrm{rw}(G) \le \mathrm{cw}(G) \le 2^{\mathrm{rw}(G)+1} - 1$ [130]. In particular, a class of graphs has bounded rank-width if and only if it has bounded clique-width. This means that many NP-hard problems can be solved efficiently for graphs of bounded rank-width [40, 49].

For the Graph Isomorphism Problem, the first polynomial-time algorithm for graphs of bounded rank-width was presented by Grohe and Schweitzer [73]. However, the running time of their algorithm is $n^{f(k)}$, where n denotes the number of vertices, k is the rank-width of the input graphs, and f is a non-elementary function. Besides the unsatisfactory running time of the algorithm, it is also rather complicated, building on advanced results from structural graph theory [74] and computational group theory [106].

In this thesis we prove that the Weisfeiler-Leman dimension of graphs of rank-width k is at most 3k + 4, which adds a rich family of dense graph classes to the picture of graph classes of bounded Weisfeiler-Leman dimension. Also, this results in a simple isomorphism test for graphs of bounded rank-width which, maybe surprisingly, is also significantly faster than the isomorphism test of Grohe and Schweitzer. On top of that, we can use this result to give a generic polynomial-time canonization algorithm for graphs of bounded rank-width. A canonization algorithm A for a class C maps a graph G to a graph A(G) ≅ G that solely depends on the isomorphism type of G and not on G itself. More formally, it holds that A(G) = A(H) if and only if G ≅ H for all graphs G, H ∈ C. Clearly, the isomorphism problem reduces to the problem of giving a canonization algorithm; the converse is not known.
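The defining property of a canonization algorithm can be illustrated with a naive, exponential-time example that has nothing to do with the polynomial-time canonization obtained in this thesis. The sketch below (the name `canonical_form` and the edge-list input on the vertices 0, ..., n-1 are assumptions for this example) returns the lexicographically smallest relabeling of the input graph, so that two graphs receive the same output exactly when they are isomorphic.

```python
from itertools import permutations


def canonical_form(n, edges):
    """Naive canonical form of a simple undirected graph on vertices 0..n-1.

    Returns the lexicographically smallest sorted edge list over all n!
    relabelings. The result is isomorphic to the input and depends only on
    its isomorphism type. Exponential time; for illustration only.
    """
    best = None
    for phi in permutations(range(n)):
        image = tuple(sorted({tuple(sorted((phi[v], phi[w]))) for v, w in edges}))
        if best is None or image < best:
            best = image
    return best


if __name__ == "__main__":
    path = [(0, 1), (1, 2), (2, 3)]
    relabeled_path = [(2, 0), (0, 3), (3, 1)]
    star = [(0, 1), (0, 2), (0, 3)]
    print(canonical_form(4, path) == canonical_form(4, relabeled_path))
    print(canonical_form(4, path) == canonical_form(4, star))
```

Comparing canonical forms for equality then decides isomorphism, which is the reduction from isomorphism testing to canonization mentioned above.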
Previously, it was unknown whether a polynomial-time canonization algorithm exists for graphs of rank-width at most k.

Having provided upper bounds on the Weisfeiler-Leman dimension of graphs of bounded tree-width and rank-width, a natural question is to ask for lower bounds on the Weisfeiler-Leman dimension of these two classes. Looking at the result of Cai, Fürer and Immerman [31], it is not difficult to see that both upper bounds are tight up to a constant factor. However, aiming to exactly determine the Weisfeiler-Leman dimension of the graph classes, it would be desirable to obtain lower bounds that are as close as possible to the upper bounds described above. In this direction, Dawar and Richerby [43] showed that the Weisfeiler-Leman dimension of the Cai-Fürer-Immerman graphs is closely tied to the tree-width of the base graphs. Exploiting this connection, we present a more refined analysis of the tree-width of certain Cai-Fürer-Immerman graphs, leading to improved lower bounds on the Weisfeiler-Leman dimension that are only a small constant factor away from the upper bounds for graph classes of bounded tree-width and rank-width.

Another closely related combinatorial approach to the Graph Isomorphism Problem that works particularly well in practice is the Individualization-Refinement paradigm (I/R paradigm). In a nutshell, the basic principle of an I/R algorithm is to first refine an initially uniform coloring of the vertices of the input graphs in an isomorphism-invariant manner. A typical choice for this subroutine is the Color Refinement algorithm. In case the produced coloring is discrete (i.e., every color class contains only a single vertex), the isomorphism problem can be solved in a straightforward way. Otherwise, vertices in a chosen color class are individualized one by one in a backtracking manner in order to artificially distinguish them from the other vertices.
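The refine-then-individualize loop just described can be sketched as follows. This is a bare-bones illustration without any pruning and is not the algorithm of any particular tool: the names `refine` and `ir_canonical_form`, the choice of the smallest non-singleton color as the cell to individualize, and the use of the minimum leaf as the result are all assumptions made for this example.

```python
from collections import Counter


def refine(adj, colors):
    """Color Refinement starting from a given coloring; colors are renamed to 0..m-1."""
    while True:
        sig = {
            v: (colors[v], tuple(sorted(Counter(colors[u] for u in adj[v]).items())))
            for v in adj
        }
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        refined = {v: palette[sig[v]] for v in adj}
        if len(palette) == len(set(colors.values())):
            return refined
        colors = refined


def ir_canonical_form(adj, colors=None):
    """Explore the full I/R search tree and return the smallest leaf labeling.

    adj is {vertex: set_of_neighbors}. Two graphs are isomorphic exactly when
    the returned values coincide. No invariant or automorphism pruning is done.
    """
    colors = refine(adj, colors or {v: 0 for v in adj})
    cell_sizes = Counter(colors.values())
    target = min((c for c, size in cell_sizes.items() if size > 1), default=None)
    if target is None:
        # Discrete coloring: read the colors as vertex names and list the edges.
        edges = sorted(
            (colors[v], colors[w]) for v in adj for w in adj[v] if colors[v] < colors[w]
        )
        return (len(adj), tuple(edges))
    best = None
    for v in adj:
        if colors[v] == target:
            # Individualize v: give it a fresh color that only v carries.
            child = {u: 2 * c for u, c in colors.items()}
            child[v] += 1
            leaf = ir_canonical_form(adj, child)
            if best is None or leaf < best:
                best = leaf
    return best


if __name__ == "__main__":
    cycle6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
    triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
    print(ir_canonical_form(cycle6) == ir_canonical_form(triangles))
```

The example compares a 6-cycle with two disjoint triangles, a pair of regular graphs that Color Refinement alone cannot separate but that is separated once vertices are individualized.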
The individualization process yields a backtracking tree that is traversed in order to explore the structure of the input graphs. Additional pruning techniques, for example based on automorphisms of the input graphs, make this framework feasible in practice.

While the paradigm is also regularly exploited in theoretical analyses of the isomorphism problem (see, e.g., [9, 14, 141, 142]), among them Babai’s quasipolynomial-time algorithm [11], the I/R framework is most comprehensively used in practical software tools tackling the problem. The I/R paradigm was first implemented by Brendan McKay in the early 1980s in his software package Nauty [112], which is still one of the most efficient tools for the purpose of isomorphism testing today. In the last decade, several variants of the algorithm have been implemented in other software packages such as Nauty/Traces [113], Bliss [85, 86], Conauto [105] and Saucy [35], leading to a variety of solvers that perform extremely well on an abundance of different types of instances (see, e.g., [113]). Actually, with a lack of instances challenging these solvers, the isomorphism problem already seems to be solved from a practical point of view. On the other hand, only very little is known about the worst-case complexity of the algorithms implemented by the solvers. Indeed, in his breakthrough paper [11], Babai explicitly asks for the worst-case complexity of algorithms purely based on the Individualization-Refinement paradigm.

In 1995 Miyazaki proved that the then-current version of Nauty [112] has exponential worst-case complexity [118]. For the proof Miyazaki designed a family of graphs that specifically target the cell selection (i.e., the color classes chosen for individualization) implemented in Nauty, fooling the algorithm into exponential behavior. However, as Miyazaki also proves, the constructed graphs can be solved in polynomial time using a slightly different cell selection.
Indeed, with the heuristics for cell selection and other tasks getting more and more refined, most of the practical tools developed in the last decade perform efficiently on the graphs constructed by Miyazaki (see, e.g., [113]).

In this thesis, we analyze the power of algorithms within the I/R paradigm in a much broader setting and provide exponential lower bounds on the worst-case complexity of a large class of I/R algorithms that includes all current state-of-the-art isomorphism tools. More precisely, we present a construction yielding graphs, which we call multipede graphs, with an exponential-size search tree for all I/R algorithms where the refinement operator, the cell selection and the invariants used are not stronger than the k-dimensional Weisfeiler-Leman algorithm for some fixed number k. In particular, there is no restriction on the automorphism pruning performed by the algorithm, and one may even assume perfect automorphism pruning. It should be pointed out that this makes the I/R approach stronger than the k-dimensional Weisfeiler-Leman algorithm for any fixed number k. For example, isomorphism for Cai-Fürer-Immerman graphs, which cannot be distinguished by the k-dimensional Weisfeiler-Leman algorithm unless k = Ω(n), can be tested in polynomial time within the described class of I/R algorithms [118].

For the construction of the multipede graphs we utilize a construction of Gurevich and Shelah [75] that yields rigid structures (i.e., structures without non-trivial automorphisms) with arbitrarily large Weisfeiler-Leman dimension. To be more precise, we start by constructing a bipartite base graph that is obtained from a random process. This base graph has, with high probability, strong expansion properties guaranteeing a suitable variant of the meagerness property already exploited in [75]. Additionally, almost surely, the neighborhoods of the vertices of one partition class of the bipartite graph are almost disjoint.
Applying a suitable variant of the Cai-Fürer-Immerman construction [31] for bipartite graphs gives the desired multipede graphs. In order to analyze the size of the search tree constructed by I/R algorithms we first define a closure operator that bounds the effect of the Weisfeiler-Leman algorithm by exploiting the almost-disjointness property of the neighborhoods. The closure operator defines a subgraph which, by the meagerness property, has an exponential number of automorphisms. This gives an exponential number of colorings that cannot be distinguished by the Weisfeiler-Leman algorithm. Combining these statements gives the desired exponential lower bound on the size of the search tree of I/R algorithms. With the I/R framework being used in all state-of-the-art isomorphism tools, the above results raise the question whether the constructed graphs are also difficult in practice. Towards this end, we introduce a variant of our construction that creates the most difficult benchmark graphs for isomorphism testing available today.

Group-Theoretic Approaches

A second fundamental approach to the Graph Isomorphism Problem is based on group-theoretic techniques. Already early on in the study of the isomorphism problem a close connection to the structure of the automorphism group of the input graphs was observed. In this direction, Babai [6] and Mathon [110] independently proved that the Graph Isomorphism Problem is polynomial-time equivalent to the problem of computing a generating set for the automorphism group and also to the problem of computing the size of the automorphism group. From an algorithmic point of view, techniques from group theory were first exploited by Babai [8] in 1979 to give an isomorphism test for graphs of bounded color class size. Already this simple algorithm demonstrates the amazing power of group-theoretic approaches since graph classes of bounded color class size include the Cai-Fürer-Immerman graphs as well as the multipede graphs that prove to be extremely difficult for purely combinatorial approaches. While Babai’s original algorithm is a randomized Las Vegas algorithm, it was derandomized shortly after by Furst, Hopcroft and Luks by providing a basic polynomial-time library for computing with permutation groups [57]. Besides its significance for the Graph Isomorphism Problem, this line of work also initiated research in Computational Group Theory (see [77, 138]). The striking usefulness of the group-theoretic techniques was further demonstrated by Luks with his polynomial time isomorphism test for graphs of bounded degree [106]. With a slight improvement given later [19] it tests in time $n^{O(d/\log d)}$ whether two graphs of maximum degree d are isomorphic (where n denotes the number of vertices of the input graphs). For his algorithm Luks first introduces a more general problem to be able to build a recursive algorithm along the structure of the permutation groups involved.
The String Isomorphism Problem takes as input two strings x, y: Ω → Σ, where Ω is a finite set and Σ a finite alphabet, and a permutation group Γ ≤ Sym(Ω) (given by a generating set), and asks whether there is some γ ∈ Γ that maps x to y. For graphs of maximum degree d the isomorphism problem can be reduced in polynomial time [106, 21] to the String Isomorphism Problem where the input group Γ is contained in the class $\widehat{\Gamma}_d$¹ containing all groups all of whose composition factors are isomorphic to subgroups of $S_d$. Then, the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups is solved by recursively processing the input group Γ along Γ-invariant partitions.

Luks’s algorithm quickly developed into one of the cornerstones of the algorithmic theory of the Graph Isomorphism Problem. In combination with a combinatorial method due to Zemlyachenko [151] it results in an isomorphism test for general graphs running in time $2^{O(\sqrt{n \log n})}$ [19], which was the best known algorithm for over three decades. Also, Luks’s algorithm forms an important building block for various other algorithms tackling the isomorphism problem (see, e.g., [14, 67, 96, 117, 122]). But most notably, the recursion mechanisms introduced by Luks form the basis for Babai’s quasipolynomial time isomorphism test [11]. Indeed, for his algorithm, Babai follows Luks’s algorithm for the String Isomorphism Problem, attacking the obstacle cases where the recursion performed by Luks’s algorithm does not lead to the desired running time. In order to handle these obstacle cases Babai introduces various new techniques in terms of group-theoretic methods as well as the analysis of combinatorial methods such as the Weisfeiler-Leman algorithm.

In this context it seems natural to ask whether the techniques developed by Babai for his quasipolynomial time algorithm can also be extended to Luks’s algorithm in order to give a faster isomorphism test for graphs of small degree. Indeed, graphs of polylogarithmic degree are not a critical case for Babai’s algorithm as the automorphism groups do not contain large alternating or symmetric groups, and graphs of polylogarithmic degree form one of the obstacle cases where Babai’s algorithm still runs in quasipolynomial time. This gives a strong motivation to investigate the above question. In this thesis we provide a positive answer and prove that the isomorphism problem for graphs of maximum degree d can be solved in time $n^{\operatorname{polylog}(d)}$. Actually, following the standard route of considering the String Isomorphism Problem, we present an algorithm that solves the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups in time $n^{\operatorname{polylog}(d)}$.
For designing the algorithm a main hurdle is to adapt the group-theoretic techniques developed in [11] to the setting of $\widehat{\Gamma}_d$-groups. Towards this end, we introduce the notion of an almost d-ary sequence of partitions for a permutation group Γ ≤ Sym(Ω). Consider a sequence of Γ-invariant partitions $\mathcal{B}_0 = \{\Omega\} \succ \mathcal{B}_1 \succ \dots \succ \mathcal{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$ where $\mathcal{B}_i \prec \mathcal{B}_{i-1}$ means that the partition $\mathcal{B}_i$ strictly refines $\mathcal{B}_{i-1}$. Such a sequence is almost d-ary if, for every $i \in [m]$ and $B \in \mathcal{B}_{i-1}$, it holds that, after stabilizing $B$ setwise, the induced action of Γ on the classes from $\mathcal{B}_i$ contained in $B$ is permutationally isomorphic to a subgroup of $S_d$ or semi-regular (i.e., only the identity element has fixed points). For permutation groups with an almost d-ary sequence of partitions there is a natural adaption of Babai’s Unaffected Stabilizers Theorem which lays the foundation for the group-theoretic techniques developed in [11]. With this, it is possible to give a variant of the Local Certificates Routine which, by a more refined analysis of the running time, allows the efficient construction of relational structures defined on at most d points capturing sufficient structural information of the input strings. Computing isomorphisms between the relational structures using Babai’s algorithm [11] as a black box allows us to make sufficient progress in order to build a recursive algorithm with the desired running time. However, not every permutation group in the class $\widehat{\Gamma}_d$ has an almost d-ary sequence of partitions as required by the approach described above. To remedy this, our algorithm first performs a normalization of the input data by modifying the action of the input group while preserving string isomorphisms. This normalization process is based on some heavy group theory.
The first step is to classify large primitive $\widehat{\Gamma}_d$-groups via the O’Nan-Scott Theorem, exploiting several group-theoretic structure theorems on primitive groups in $\widehat{\Gamma}_d$ and showing that such groups are necessarily composed of Johnson schemes in a well-defined manner. Based on this classification we are able to construct graphs of small degree describing the structure of the input permutation group in a suitable way. Finally, unfolding these graphs yields the desired normalized group operation.

¹ In Luks’s original work [106] this class is called $\Gamma_d$. However, in the more recent literature [13, 58] the class $\Gamma_d$ typically refers to a larger class of groups.

With Luks’s algorithm for the String Isomorphism Problem being used as a subroutine in various other algorithms, one can ask for the impact of the above improvement in the context of the Graph Isomorphism Problem. Of course, the first application is an improved isomorphism test for graphs of maximum degree d running in time $n^{\operatorname{polylog}(d)}$. Moreover, one can also give better isomorphism tests for relational structures and hypergraphs of small degree. Actually, these results are not only interesting when the degree is small, but even improve on existing algorithms for isomorphism testing of relational structures and hypergraphs in general (cf. [15]).

For a deeper application of the above results we consider the isomorphism problem for graphs of bounded tree-width. The first polynomial time algorithm for graph classes of bounded tree-width was given by Bodlaender [27] using dynamic programming on the set of all k-tuples of vertices separating the graph, resulting in a running time of $n^{O(k)}$. This roughly matches the running time of an isomorphism test based on the Weisfeiler-Leman algorithm using a result of Grohe and Mariño [66] upper-bounding the Weisfeiler-Leman dimension of such graphs. Only recently, Lokshtanov, Pilipczuk, Pilipczuk, and Saurabh [104] designed the first fixed-parameter tractable isomorphism test parameterized by the tree-width of the input graphs running in time $2^{O(k^5 \log k)} n^c$ for some constant c. The algorithm of Lokshtanov et al. first improves the input graphs and decomposes the improved graphs along clique separators into so-called basic parts in an isomorphism-invariant manner.
After fixing a vertex of small degree, the basic parts can be decomposed further into an isomorphism-invariant tree decomposition of width exponential in k and adhesion width $O(k^3)$. Using dynamic programming this suffices to compute a graph canonization in the desired time frame (see also [128]). In this thesis we give an improved isomorphism test running in time $2^{O(k \operatorname{polylog}(k))} n^c$ for some constant c. The main addition in comparison to the algorithm of Lokshtanov et al. is that each bag of the decomposition of the basic graphs is labeled with an auxiliary graph capturing structural information obtained during the decomposition. Crucially, these structure graphs can be designed to have small degree, which allows us to apply the methods described above for the isomorphism problem of graphs of bounded degree. In combination with other modifications this enables us to improve the running time as indicated above.

Scientific Contribution

The following section displays the scientific contributions of the author to the results presented in this thesis. The upper and lower bounds on the Weisfeiler-Leman dimension for graph classes of bounded tree-width were obtained in a joint work with Sandra Kiefer, “The Power of the Weisfeiler-Leman Algorithm to Decompose Graphs”, published at the 44th International Symposium on Mathematical Foundations of Computer Science (MFCS 2019) [89]. The main technical result of this paper is that the 2-dimensional Weisfeiler-Leman algorithm distinguishes 2-separators from other pairs of vertices. The bounds on the Weisfeiler-Leman dimension for graphs of bounded tree-width, which partly follow from the above result, were, to a large part, obtained and formalized by the present author. The upper bound on the Weisfeiler-Leman dimension of graphs of bounded rank-width was obtained solely by the author of this thesis and presented in “Canonisation and Definability for Graphs of Bounded Rank Width”, published in the proceedings of the Thirty-Fourth Annual ACM/IEEE Symposium on Logic in Computer Science (LICS 2019) [68]. This paper is a joint work with Martin Grohe also featuring a second definability result that was obtained together with Martin Grohe.

The lower bounds on the running time of I/R algorithms were obtained in a joint project with Pascal Schweitzer [123, 124]. The theoretical bounds on the worst-case complexity of I/R algorithms appeared as “An Exponential Lower Bound for Individualization-Refinement Algorithms for Graph Isomorphism” in the proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing (STOC 2018). On the other hand, practical benchmark graphs are constructed and evaluated in “Benchmark Graphs for Practical Graph Isomorphism” presented at the 25th Annual European Symposium on Algorithms (ESA 2017). For the practical part, the main contribution of this author lies in evaluating the performance of practical solvers on a number of different constructions based on the Cai-Fürer-Immerman graphs [31] and multipede graphs [75] and in tuning the constructions in order to obtain graphs that are practically extremely difficult. The success on the practical side motivated research on the theoretical worst-case complexity of algorithms within the I/R paradigm. The main insights that made the theoretical lower bounds possible were obtained jointly with Pascal Schweitzer in various discussions regarding this problem. The faster algorithm for the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups and, consequently, for graphs of bounded degree is published as “A Faster Isomorphism Test for Graphs of Small Degree” in the proceedings of the 59th IEEE Annual Symposium on Foundations of Computer Science (FOCS 2018) [69]. For this result, the adaption of Babai’s methods for $\widehat{\Gamma}_d$-groups equipped with an almost d-ary sequence of partitions is completely due to this author. The ideas for the normalization procedure were developed mostly together with Martin Grohe, where the technical write-up was done by the present author.
I remark that the normalization procedure given in this thesis is different from the original one in the sense that it is less technically involved and to some degree provides a more general framework for modifying the action of a permutation group in a certain way. The normalization is based on a characterization theorem for primitive $\widehat{\Gamma}_d$-groups which is mostly based on the available literature on primitive groups. The search for the literature was again mostly done by this author, whereas the write-up and various technical details were resolved jointly with Pascal Schweitzer. Finally, the improved fixed-parameter tractable isomorphism test for graphs parameterized by tree-width is a joint work with Martin Grohe, Pascal Schweitzer and Daniel Wiebking [70]. The paper appeared as “An Improved Isomorphism Test for Bounded-Tree-Width Graphs” in the proceedings of the 45th International Colloquium on Automata, Languages, and Programming (ICALP 2018). For this work, the main contribution of the present author lies in constructing and exploiting the structure graphs that each bag is labeled with and which capture the bounded-degree structures present in the bag in order to be able to apply the novel methods developed for bounded degree graphs. I remark that the observation that the automorphism group of a k-basic graph, after fixing a vertex of degree at most k, lies in the class $\widehat{\Gamma}_{k+1}$ is due to Martin Grohe, Pascal Schweitzer and Daniel Wiebking. This observation alone, however, is not sufficient in order to apply the methods developed for graphs of bounded degree.

Structure of the Thesis

The remainder of this thesis is structured as follows. In Chapter 2 we introduce the basic terminology for this work including a complete description of the Weisfeiler-Leman algorithm and the Individualization-Refinement paradigm. The upper bounds on the Weisfeiler-Leman dimension of graphs of bounded tree-width and rank-width are presented in Chapter 3. These results are complemented by the corresponding lower bounds in Section 4.1. Moreover, the lower bounds on the worst-case complexity of algorithms within the I/R framework are given in Sections 4.2 and 4.3. This concludes the first part of this thesis on the power of purely combinatorial approaches to the Graph Isomorphism Problem.

For the second part, which deals with the power of group-theoretic approaches, Chapter 5 first introduces the necessary background on group theory as well as Luks’s algorithm and a short introduction to some aspects of Babai’s quasipolynomial time algorithm. In Chapter 6 we present a faster algorithm performing isomorphism tests for graphs of bounded degree adapting the group-theoretic techniques of Babai’s algorithm. In Chapter 7 these results are further applied to also obtain an improved fixed-parameter tractable isomorphism test parameterized by the tree-width of the input graphs. Finally, this thesis is concluded with a discussion of the results and open research directions in Chapter 8.

Chapter 2

Isomorphism and Combinatorial Algorithms

2.1 Graphs and Isomorphism

2.1.1 Graphs and Notation

Notation for Numbers and Sets. The set of natural numbers is $\mathbb{N} = \{1, 2, 3, 4, \dots\}$ and $[n] := \{1, \dots, n\}$ denotes the initial segment of the natural numbers up to $n$. Also, $\mathbb{Z} = \{0, 1, -1, 2, -2, \dots\}$ denotes the set of integers and $\mathbb{Z}_{\geq 0} = \{0, 1, 2, 3, \dots\}$ denotes the set of non-negative integers. For a finite set $X$ the power set of $X$ is denoted by $2^X := \{Y \mid Y \subseteq X\}$. Note that $|2^X| = 2^{|X|}$. Also, for $t \leq |X|$, the set of all $t$-element subsets of $X$ is denoted by $\binom{X}{t} := \{Y \subseteq X \mid |Y| = t\}$. Again, observe that $|\binom{X}{t}| = \binom{|X|}{t}$. Similarly, $\binom{X}{\leq t} := \{Y \subseteq X \mid |Y| \leq t\}$ denotes the set of all subsets of $X$ of size at most $t$. For three sets $X, X_1, X_2$, the set $X$ is the disjoint union of $X_1$ and $X_2$, denoted by $X = X_1 \uplus X_2$, if $X = X_1 \cup X_2$ and $X_1 \cap X_2 = \emptyset$. For a finite set $X$ a partition of $X$ is a collection $\mathcal{B}$ of subsets of $X$ such that $B_1 \cap B_2 = \emptyset$ for all distinct $B_1, B_2 \in \mathcal{B}$ and $\bigcup_{B \in \mathcal{B}} B = X$. A partition $\mathcal{B}$ is called an equipartition if $|B_1| = |B_2|$ for all $B_1, B_2 \in \mathcal{B}$. For $S \subseteq X$ define $\mathcal{B}[S] := \{B \cap S \mid B \in \mathcal{B} \colon B \cap S \neq \emptyset\}$ to be the induced partition on $S$. A partition $\mathcal{B}_1$ of a set $X$ refines another partition $\mathcal{B}_2$ of $X$, denoted $\mathcal{B}_1 \preceq \mathcal{B}_2$, if for every $B_1 \in \mathcal{B}_1$ there is some $B_2 \in \mathcal{B}_2$ such that $B_1 \subseteq B_2$. If additionally $\mathcal{B}_1 \neq \mathcal{B}_2$ we say that $\mathcal{B}_1$ strictly refines $\mathcal{B}_2$, which is denoted by $\mathcal{B}_1 \prec \mathcal{B}_2$.
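The partition notions above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: partitions are modeled as sets of frozensets, and the function names `is_partition`, `induced`, `refines` and `strictly_refines` are hypothetical helpers that do not appear in the thesis.

```python
def is_partition(blocks, ground):
    """Check that `blocks` (a set of frozensets) partitions `ground`.

    Distinct blocks are pairwise disjoint exactly when the block sizes
    add up to the size of their union, and the union must be `ground`.
    """
    union = set().union(*blocks) if blocks else set()
    return union == set(ground) and sum(len(b) for b in blocks) == len(union)


def induced(blocks, subset):
    """Induced partition B[S]: intersect every block with S, drop empty sets."""
    subset = frozenset(subset)
    return {b & subset for b in blocks if b & subset}


def refines(fine, coarse):
    """fine refines coarse: every block of `fine` lies inside a block of `coarse`."""
    return all(any(b1 <= b2 for b2 in coarse) for b1 in fine)


def strictly_refines(fine, coarse):
    """Strict refinement: refinement plus inequality of the two partitions."""
    return refines(fine, coarse) and fine != coarse


P = {frozenset({1, 2}), frozenset({3}), frozenset({4})}
Q = {frozenset({1, 2, 3}), frozenset({4})}
print(strictly_refines(P, Q), induced(Q, {2, 3, 4}))
```

Here `P` strictly refines `Q` but not vice versa, and restricting `Q` to the set {2, 3, 4} gives the two blocks {2, 3} and {4}.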

Graphs. A graph is a pair $G = (V(G), E(G))$ with vertex set $V(G)$ and edge set $E(G)$. Unless stated otherwise, all graphs are undirected and simple, i.e., there are no loops or multiedges. In this setting an edge is denoted as $vw$ where $v, w \in V(G)$. The (open) neighborhood of a vertex $v$ is denoted $N_G(v) := \{w \in V(G) \mid vw \in E(G)\}$. The closed neighborhood is $N_G[v] := N_G(v) \cup \{v\}$. Also, for a set of vertices $X \subseteq V(G)$, the neighborhood of $X$ is defined as $N_G(X) := \left(\bigcup_{v \in X} N_G(v)\right) \setminus X$. The degree of a vertex $v \in V(G)$, denoted $\deg_G(v)$, is the size of its neighborhood. Usually, we omit the index $G$ if it is clear from the context and simply write $N(v)$, $N[v]$, $N(X)$ and $\deg(v)$. Let $v, w \in V(G)$. A path from $v$ to $w$ is a sequence of pairwise distinct vertices $v = u_0, u_1, \dots, u_{\ell-1}, u_\ell = w$ such that $u_{i-1}u_i \in E(G)$ for all $i \in [\ell]$. In this case $\ell$ is the length of the path. The distance between $v$ and $w$, denoted $\mathrm{dist}_G(v, w)$, is the length of a shortest path from $v$ to $w$. As before, the index $G$ is usually omitted. For $X \subseteq V(G)$ the induced subgraph on $X$ is $G[X] := (X, \{vw \mid v, w \in X, vw \in E(G)\})$. Also, $G - X := G[V(G) \setminus X]$ denotes the induced subgraph on the complement of $X$.

A (vertex-)colored graph is a tuple $G = (V(G), E(G), \chi_G)$ where $\chi_G \colon V(G) \to C$ is a mapping and $C$ is some finite set of colors. For ease of notation I also often write $(G, \chi_G)$ in order to explicitly refer to the coloring $\chi_G$. In this thesis, all graphs may be seen as colored graphs. Note that an uncolored graph can be interpreted as a colored graph where each vertex is assigned the same color. The color classes of a (colored) graph are the sets $\chi_G^{-1}(c)$ where $c \in C$. Note that the color classes form a partition of the vertex set. A coloring $\chi_G$ is discrete if all color classes are singletons, i.e., $\chi_G(v) \neq \chi_G(w)$ for all distinct $v, w \in V(G)$.

Relational Structures. A (relational) vocabulary $\tau$ is a set of relation symbols $R_1, \dots, R_k$ where each symbol is equipped with an arity $r_i \in \mathbb{N}$. A relational structure (over vocabulary $\tau$) is a tuple $A = (V, R_1(A), \dots, R_k(A))$ with a (finite) universe $V$ and relations $R_i(A) \subseteq V^{r_i}$, $i \in [k]$. Usually, I omit the vocabulary and simply write $A = (V, R_1, \dots, R_k)$ for a relational structure over an implicitly given vocabulary. The structure $A$ is $t$-ary if every relation symbol has arity at most $t$, i.e., $r_i \leq t$ for all $i \in [k]$.

2.1.2 Tree Decompositions and Tree-Width

Throughout this thesis tree decompositions and the tree-width of a graph repeatedly play a role. In the following these concepts are formally defined and some very basic properties are stated which are used in the course of this thesis (for a more complete introduction to tree-width I refer to [94]).

Definition 2.1.1 (Tree Decomposition and Tree-Width). Let $G$ be a graph. A tree decomposition of $G$ is a pair $(T, \beta)$ where $T$ is a tree and $\beta \colon V(T) \to 2^{V(G)}$ such that

(T.1) for every $v \in V(G)$ the set $\{t \in V(T) \mid v \in \beta(t)\}$ is non-empty and induces a connected subgraph in $T$, and

(T.2) for every $e \in E(G)$ there is some $t \in V(T)$ such that $e \subseteq \beta(t)$.

The sets $\beta(t)$, $t \in V(T)$, are called the bags of the decomposition. The width of a decomposition $(T, \beta)$ is

$\mathrm{width}(T, \beta) := \max_{t \in V(T)} |\beta(t)| - 1.$

The tree-width of $G$ is the minimum width among all tree decompositions of $G$, i.e.,

$\mathrm{tw}(G) := \min\{\mathrm{width}(T, \beta) \mid (T, \beta) \text{ is a tree decomposition of } G\}.$

The adhesion sets of a tree decomposition $(T, \beta)$ are the sets $\beta(s) \cap \beta(t)$ for edges $st \in E(T)$. The adhesion-width of a decomposition is the maximum size of an adhesion set, i.e., $\max_{st \in E(T)} |\beta(s) \cap \beta(t)|$.
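As an illustration of Definition 2.1.1, the following Python sketch checks conditions (T.1) and (T.2) for a candidate decomposition and computes its width and adhesion-width. It is a minimal sketch under stated assumptions: the function names and the input encoding (bags as a dictionary from tree nodes to vertex sets, `tree_edges` as a list of pairs) are hypothetical, and the code assumes without checking that `tree_edges` really forms a tree on the keys of `bags`.

```python
from collections import deque


def is_tree_decomposition(vertices, edges, tree_edges, bags):
    """Check (T.1) and (T.2); `tree_edges` is assumed to form a tree."""
    adj = {t: set() for t in bags}
    for s, t in tree_edges:
        adj[s].add(t)
        adj[t].add(s)

    # (T.1): the tree nodes whose bag contains v are non-empty and connected.
    for v in vertices:
        nodes = {t for t in bags if v in bags[t]}
        if not nodes:
            return False
        start = next(iter(nodes))
        seen, queue = {start}, deque([start])
        while queue:  # breadth-first search restricted to `nodes`
            s = queue.popleft()
            for t in adj[s]:
                if t in nodes and t not in seen:
                    seen.add(t)
                    queue.append(t)
        if seen != nodes:
            return False

    # (T.2): every edge of the graph is contained in some bag.
    return all(any(v in bag and w in bag for bag in bags.values())
               for v, w in edges)


def width(bags):
    """Maximum bag size minus one."""
    return max(len(bag) for bag in bags.values()) - 1


def adhesion_width(tree_edges, bags):
    """Maximum size of an adhesion set; 0 for a single-node tree."""
    return max((len(bags[s] & bags[t]) for s, t in tree_edges), default=0)


# A cycle on four vertices with a decomposition into two bags of size three.
V = ["a", "b", "c", "d"]
E = [("a", "b"), ("b", "c"), ("c", "d"), ("d", "a")]
bags = {0: {"a", "b", "c"}, 1: {"a", "c", "d"}}
print(is_tree_decomposition(V, E, [(0, 1)], bags), width(bags))
```

The example decomposition has width 2, and its single adhesion set {a, c} separates b from d in the cycle.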

The following very basic properties of tree decompositions are well-known.

Lemma 2.1.2. Let $G$ be a graph and let $C \subseteq V(G)$ be a clique in $G$ (i.e., $vw \in E(G)$ for all distinct $v, w \in C$). Also let $(T, \beta)$ be a tree decomposition of $G$. Then there exists $t \in V(T)$ such that $C \subseteq \beta(t)$.

Let $k \in \mathbb{N}$. A graph $G$ is $k$-degenerate if every subgraph of $G$ contains a vertex of degree at most $k$.

Lemma 2.1.3. Let $G$ be a graph such that $\mathrm{tw}(G) \leq k$. Then $G$ is $k$-degenerate. In particular, there exists a vertex $v \in V(G)$ such that $\deg(v) \leq k$.

Let $G$ be a graph. A set $S \subseteq V(G)$ is a separator of $G$ if $G - S$ has more connected components than the graph $G$. In particular, if $G$ is a connected graph then $S$ is a separator if and only if $G - S$ is disconnected. Now let $(T, \beta)$ be a tree decomposition of the graph $G$ and suppose $G$ is connected. Then the adhesion set $\beta(s) \cap \beta(t)$ is a separator of $G$ for every edge $st \in E(T)$ such that $\beta(s) \nsubseteq \beta(t)$ and $\beta(t) \nsubseteq \beta(s)$. (Observe that every tree decomposition $(T, \beta)$ of a graph $G$ can easily be turned into a tree decomposition $(T', \beta')$ of the same width such that $\beta'(s) \nsubseteq \beta'(t)$ and $\beta'(t) \nsubseteq \beta'(s)$ for all $st \in E(T')$.)

2.1.3 Isomorphisms

Two graphs $G$ and $H$ are isomorphic, denoted by $G \cong H$, if there is a bijective mapping $\varphi \colon V(G) \to V(H)$ such that $vw \in E(G)$ if and only if $\varphi(v)\varphi(w) \in E(H)$ for all $v, w \in V(G)$. In this case $\varphi$ is an isomorphism from $G$ to $H$, which is denoted by $\varphi \colon G \cong H$. For two graphs $G$ and $H$ let $\mathrm{Iso}(G, H) := \{\varphi \colon V(G) \to V(H) \mid \varphi \colon G \cong H\}$ denote the set of all isomorphisms from $G$ to $H$. Also, for every set $\Lambda \subseteq \{\varphi \colon V(G) \to V(H) \mid \varphi \text{ bijective}\}$ define $\mathrm{Iso}_\Lambda(G, H) := \{\varphi \in \Lambda \mid \varphi \colon G \cong H\}$. The Graph Isomorphism Problem asks, given two graphs $G$ and $H$, whether they are isomorphic, i.e., whether $\mathrm{Iso}(G, H) \neq \emptyset$. While the Graph Isomorphism Problem is typically defined for uncolored graphs there is a natural variant for colored graphs. Two colored graphs $(G, \chi_G)$ and $(H, \chi_H)$ are isomorphic if there is an isomorphism $\varphi \colon G \cong H$ (between the uncolored graphs) such that additionally $\chi_G(v) = \chi_H(\varphi(v))$ for all $v \in V(G)$. The Graph Isomorphism Problem for colored graphs asks, given two colored graphs $(G, \chi_G)$ and $(H, \chi_H)$, whether they are isomorphic. Since these two problems are polynomial-time equivalent under many-one reductions (see, e.g., [28]) I do not distinguish them in the remainder of this thesis and, consistent with previous provisions, typically refer to the isomorphism problem for colored graphs.

We regularly also encounter graphs $(G, \chi_G, v_1, \dots, v_k)$ which are additionally equipped with a sequence of vertices $v_1, \dots, v_k \in V(G)$. In this context we say that $v_1, \dots, v_k \in V(G)$ are individualized. Two such graphs $(G, \chi_G, v_1, \dots, v_k)$ and $(H, \chi_H, w_1, \dots, w_\ell)$ are isomorphic if there is an isomorphism $\varphi \colon (G, \chi_G) \cong (H, \chi_H)$ such that additionally $k = \ell$ and $\varphi(v_i) = w_i$ for all $i \in [k]$. Actually, in the context of isomorphism testing, a graph $(G, \chi_G, v_1, \dots, v_k)$ may be interpreted as a colored graph where the coloring $\chi_G$ is modified in such a way that each $v_i$ forms a singleton color class, $i \in [k]$.
More formally, suppose $\chi_G \colon V(G) \to C$ and define

$\chi_G^* \colon V(G) \to C \times 2^{[k]} \colon v \mapsto (\chi_G(v), \{i \in [k] \mid v = v_i\}).$

Then $(G, \chi_G, v_1, \dots, v_k)$ may be interpreted as the colored graph $(G, \chi_G^*)$. In particular, this allows us to apply methods defined for colored graphs also to graphs with a sequence of individualized vertices. Note that $(G, \chi_G, v_1, \dots, v_k) \cong (H, \chi_H, w_1, \dots, w_\ell)$ if and only if $(G, \chi_G^*) \cong (H, \chi_H^*)$.

Two relational structures $A_1 = (V_1, R_1, \dots, R_k)$ and $A_2 = (V_2, S_1, \dots, S_k)$ (over the same vocabulary) are isomorphic if there is a bijective mapping $\varphi \colon V_1 \to V_2$ such that $(v_1, \dots, v_{r_i}) \in R_i$ if and only if $(\varphi(v_1), \dots, \varphi(v_{r_i})) \in S_i$ for all $i \in [k]$ and all $v_1, \dots, v_{r_i} \in V_1$ (where $r_i$ denotes the arity of $R_i$ and $S_i$). As before, $\varphi \colon A_1 \cong A_2$ denotes that $\varphi$ is an isomorphism from $A_1$ to $A_2$.

In the course of this thesis we repeatedly need to deal with more general types of objects for which we are interested in the set of isomorphisms. While the type of considered objects typically depends on the specific application, one can still define isomorphisms within a more general framework¹. Let $V$ be a finite set of elements (e.g., the vertex set of a graph). The set of hereditarily finite objects over ground set $V$ is inductively defined as follows. Each $v \in V$ is an atom which in particular is a hereditarily finite object. Also, for hereditarily finite objects $X_1, \dots, X_k$ (over ground set $V$) the set $\{X_1, \dots, X_k\}$ as well as the tuple $(X_1, \dots, X_k)$ is a hereditarily finite object. Observe that every type of object described above is a hereditarily finite object over a suitable ground set $V$. In order to define isomorphisms for hereditarily finite objects let $X$ be a hereditarily finite object over ground set $V$ and $Y$ be a hereditarily finite object over ground set $W$. An isomorphism from $X$ to $Y$ is a bijective mapping $\varphi \colon V \to W$ such that $X^\varphi = Y$ where $X^\varphi$ is inductively defined as follows. For every atom $v \in V$ define $v^\varphi = \varphi(v)$. For $X = \{X_1, \dots, X_k\}$ let $X^\varphi = \{X_1^\varphi, \dots, X_k^\varphi\}$ and similarly, for $X = (X_1, \dots, X_k)$ let $X^\varphi = (X_1^\varphi, \dots, X_k^\varphi)$. Note that the definition of isomorphisms for hereditarily finite objects is consistent with the definitions given before.

¹ This framework is for example also used in [137] in order to describe a general method for canonizing different types of objects.

Let $G$ be a (colored) graph. An automorphism of $G$ is an isomorphism from $G$ to itself. Let $\mathrm{Aut}(G)$ denote the automorphism group of $G$, i.e., the set of all automorphisms of $G$ together with the composition operation. Observe that, for a second (colored) graph $H$, either $\mathrm{Iso}(G, H) = \emptyset$ or $\mathrm{Iso}(G, H) = \{\gamma\sigma \mid \gamma \in \mathrm{Aut}(G)\} =: \mathrm{Aut}(G)\sigma$ where $\sigma \in \mathrm{Iso}(G, H)$ is an arbitrary isomorphism (where compositions of functions are applied from left to right). A graph is rigid if its automorphism group is trivial, i.e., the only automorphism is the identity mapping. Of course, these definitions naturally lift to relational structures and, more generally, hereditarily finite objects.
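The inductive definition of the image of a hereditarily finite object under a bijection of the ground set translates directly into a recursive function. The following Python sketch is only an illustration: it assumes that sets are encoded as `frozenset`, tuples as `tuple`, and that atoms are values of any other hashable type. The names `image`, `isomorphisms` and `graph` are hypothetical, and the brute-force enumeration over all bijections is exponential and meant for tiny examples only.

```python
from itertools import permutations


def image(obj, phi):
    """Apply phi (a dict on atoms) to a hereditarily finite object."""
    if isinstance(obj, frozenset):
        return frozenset(image(x, phi) for x in obj)
    if isinstance(obj, tuple):
        return tuple(image(x, phi) for x in obj)
    return phi[obj]  # obj is an atom


def isomorphisms(V, X, W, Y):
    """All bijections phi from V to W mapping the object X onto Y."""
    V, W = list(V), list(W)
    if len(V) != len(W):
        return []
    found = []
    for perm in permutations(W):
        phi = dict(zip(V, perm))
        if image(X, phi) == Y:
            found.append(phi)
    return found


def graph(vertices, edges):
    """Encode a graph as the pair (vertex set, set of 2-element edge sets)."""
    return (frozenset(vertices), frozenset(frozenset(e) for e in edges))


V, W = {1, 2, 3}, {"a", "b", "c"}
X = graph(V, [(1, 2), (2, 3)])          # path with center 2
Y = graph(W, [("b", "a"), ("a", "c")])  # path with center "a"
isos = isomorphisms(V, X, W, Y)
auts = isomorphisms(V, X, V, X)
print(len(isos), len(auts))
```

In this example every isomorphism maps the center 2 to the center "a", and the number of isomorphisms equals the number of automorphisms of the path, in line with the coset description of the set of isomorphisms given above.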

2.1.4 Isomorphism Invariance and Canonization

A typical situation in the design of algorithms for the Graph Isomorphism Problem is that, in intermediate steps of the algorithm, more complicated structures are computed which are utilized to decide the isomorphism problem for the original input. For this approach to be correct, it is often crucial that the constructed objects are defined in an isomorphism-invariant way, i.e., each isomorphism of the input structures naturally extends to an isomorphism of the structures constructed in intermediate steps. Since the notion of isomorphism-invariance is used for various types of structures in this thesis, it is formally defined based on hereditarily finite objects. Let $\mathcal{X}$ be a class (which is closed under isomorphisms) of pairs $(V, X)$ where $X$ is a hereditarily finite object over ground set $V$. Similarly, let $\mathcal{Y}$ be another class of pairs $(W, Y)$ where again $Y$ is a hereditarily finite object over ground set $W$. A function $f \colon \mathcal{X} \to \mathcal{Y}$ is isomorphism-invariant if for every two $(V_1, X_1), (V_2, X_2) \in \mathcal{X}$ and every isomorphism $\varphi \in \mathrm{Iso}((V_1, X_1), (V_2, X_2))$ there is an isomorphism $\psi$ from $(W_1, Y_1) := f(V_1, X_1)$ to $(W_2, Y_2) := f(V_2, X_2)$ such that $\varphi(v) = \psi(v)$ for every $v \in V_1 \cap W_1$ and $(V_1 \cap W_1)^\varphi = V_2 \cap W_2$.

Example 2.1.4. Let $G$ be a graph and suppose $(T, \beta)$ is a tree decomposition for $G$. Without loss of generality assume that $V(G) \cap V(T) = \emptyset$. Then $G$ together with its tree decomposition may be viewed as a hereditarily finite object

$X = (V(G), E(G), V(T), E(T), \{(t, \beta(t)) \mid t \in V(T)\})$

over ground set $W = V(G) \uplus V(T)$. Also, a function that associates a tree decomposition with each graph is isomorphism-invariant if for all graphs $G_1$ and $G_2$ with associated tree decompositions $(T_1, \beta_1)$ and $(T_2, \beta_2)$, and every $\varphi \in \mathrm{Iso}(G_1, G_2)$, there is a bijection $\psi \colon V(G_1) \uplus V(T_1) \to V(G_2) \uplus V(T_2)$ such that $\psi(v) = \varphi(v)$ for all $v \in V(G_1)$, the restriction $\psi|_{V(T_1)}$ is an isomorphism from $T_1$ to $T_2$, and $\beta_2(\psi(t)) = \psi(\beta_1(t))$ for all $t \in V(T_1)$. Let $X_1$ and $X_2$ be the hereditarily finite objects associated with $G_1$ and $G_2$ together with their respective tree decompositions. Then the above conditions translate to $\psi$ being an isomorphism from $X_1$ to $X_2$.

An important special case of isomorphism-invariant functions are graph invariants. Indeed, a simple, but typical approach to the Graph Isomorphism Problem is to extract certain structural information from a graph. If the extracted information differs for two given input graphs they must be non-isomorphic. This approach is formalized in terms of graph invariants. Formally, let $\mathcal{C}$ be a class of graphs (i.e., a collection of graphs that is closed under isomorphisms). A graph invariant for $\mathcal{C}$ is a function $I$ with domain $\mathcal{C}$ such that $I(G) = I(H)$ for all graphs $G, H \in \mathcal{C}$ such that $G \cong H$. In particular, the function $I$ is isomorphism-invariant. For two given graphs $G$ and $H$, graph invariants can be used as a heuristic for isomorphism testing by computing $I(G)$ and $I(H)$ and comparing the results for equality. If $I(G) \neq I(H)$ then the input graphs must be non-isomorphic (in the other case one cannot deduce any information about the graphs being isomorphic). An important example of a graph invariant builds on the k-dimensional Weisfeiler-Leman algorithm to be introduced in the next section. A graph invariant $I$ for a class $\mathcal{C}$ is complete if $I(G) = I(H)$ if and only if $G \cong H$, i.e., the above algorithm serves as a complete isomorphism test. An important special case of complete graph invariants are graph canonizations where the output of the function is again a graph with an ordered vertex set. Towards this end, let $\mathcal{G}_{\mathbb{N}}$ denote the class of graphs whose vertex set is an initial segment of the natural numbers.
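As a small illustration of the invariant-based heuristic, the following Python sketch uses the sorted degree sequence, which is a simple graph invariant chosen here only for demonstration (it is not the Weisfeiler-Leman invariant used later in the thesis). The function name `degree_sequence` and the encoding of graphs by a vertex count and an edge list over the vertices 0, ..., n-1 are assumptions of this sketch.

```python
def degree_sequence(n, edges):
    """Sorted degree sequence of a simple graph on the vertices 0, ..., n-1."""
    deg = [0] * n
    for v, w in edges:
        deg[v] += 1
        deg[w] += 1
    return tuple(sorted(deg))


# Different invariants certify non-isomorphism.
path = [(0, 1), (1, 2), (2, 3)]
star = [(0, 1), (0, 2), (0, 3)]
print(degree_sequence(4, path) != degree_sequence(4, star))

# Equal invariants are inconclusive: both graphs below are 2-regular on
# six vertices, but only the first one is connected.
cycle6 = [(i, (i + 1) % 6) for i in range(6)]
two_triangles = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5)]
print(degree_sequence(6, cycle6) == degree_sequence(6, two_triangles))
```

The second comparison shows that the degree sequence is not a complete invariant: the cycle on six vertices and the disjoint union of two triangles receive the same value although they are not isomorphic.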

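Definition 2.1.5. A function $\kappa \colon \mathcal{C} \to \mathcal{G}_{\mathbb{N}}$ canonizes a graph class $\mathcal{C}$ (resp. $\kappa$ is a graph canonization for $\mathcal{C}$) if

(C.1) $\kappa(G) \cong G$ for all $G \in \mathcal{C}$, and

(C.2) for all $G, H \in \mathcal{C}$ it holds that

$G \cong H \Rightarrow \kappa(G) = \kappa(H).$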

Note that the backward direction of the implication in (C.2) also holds by Property (C.1). Hence, the Graph Isomorphism Problem reduces to the problem of computing a graph canonization. It is unknown whether the converse also holds, i.e., whether the problem of computing a graph canonization polynomial-time reduces to the Graph Isomorphism Problem.

Example 2.1.6 (Lexicographically Smallest Representation). For a graph $G \in \mathcal{G}_{\mathbb{N}}$ with $G = ([n], E(G))$ consider the string $s_G\colon \{(i,j) \in [n]^2 \mid i > j\} \to \{0,1\}$ defined by $s_G(i,j) = 1$ if and only if $ij \in E(G)$. There is a natural order on the positions $\{(i,j) \in [n]^2 \mid i > j\}$ obtained by first comparing the first entry of the tuple and afterwards the second entry. This allows us to compare graphs $G, H \in \mathcal{G}_{\mathbb{N}}$ by comparing the strings $s_G$ and $s_H$ using the standard lexicographic order $\leq_{\mathrm{lex}}$ on strings. Consider the function $\kappa_{\mathrm{lex}}\colon \mathcal{G} \to \mathcal{G}_{\mathbb{N}}$ defined by $\kappa_{\mathrm{lex}}(G) = F$ for a graph $F \in \mathcal{G}_{\mathbb{N}}$ such that $F \cong G$ and $s_F \leq_{\mathrm{lex}} s_H$ for all $H \in \mathcal{G}_{\mathbb{N}}$ with $G \cong H$. It is easy to show that $\kappa_{\mathrm{lex}}$ canonizes the class of all graphs $\mathcal{G}$. However, the function $\kappa_{\mathrm{lex}}$ cannot be computed in polynomial time (unless PTIME = NP). Indeed, in order to find a numbering of the vertices of $G$ that minimizes the string representation, an algorithm needs to compute a maximum independent set of the graph. More precisely, consider the graph $D_{n,\ell} = ([n], E_{n,\ell})$ with

\[ E_{n,\ell} = \{ ij \mid i > \ell \vee j > \ell \}. \]

Then $s_{\kappa_{\mathrm{lex}}(G)} \leq_{\mathrm{lex}} s_{D_{n,\ell}}$ if and only if $G$ has an independent set of size $\ell$ (where $n$ is the number of vertices of $G$).
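The construction of Example 2.1.6 can be tried out on very small graphs with the following minimal sketch. It simply enumerates all $n!$ numberings, so it illustrates the definition rather than an efficient method. The function names `string_rep`, `kappa_lex` and `d_string` are hypothetical helpers chosen for this illustration.

```python
from itertools import permutations


def string_rep(n, edges):
    """Return s_G as a tuple of bits over the positions (i, j) with i > j.

    Positions are ordered by the first entry, then by the second entry.
    `edges` is a set of frozensets {i, j} over the vertex set [n].
    """
    return tuple(
        1 if frozenset((i, j)) in edges else 0
        for i in range(1, n + 1)
        for j in range(1, i)
    )


def kappa_lex(n, edges):
    """Brute-force the lexicographically smallest representation.

    Tries all n! renumberings of [n]; only usable for tiny graphs.
    Returns the minimal string together with the renumbered edge set.
    """
    best = None
    for perm in permutations(range(1, n + 1)):
        rename = dict(zip(range(1, n + 1), perm))
        renamed = {frozenset(rename[v] for v in e) for e in edges}
        candidate = string_rep(n, renamed)
        if best is None or candidate < best[0]:
            best = (candidate, renamed)
    return best


def d_string(n, l):
    """String representation of D_{n,l}: an edge ij exists iff i > l or j > l."""
    edges = {
        frozenset((i, j))
        for i in range(1, n + 1)
        for j in range(1, i)
        if i > l or j > l
    }
    return string_rep(n, edges)


# Two different numberings of the path on three vertices, and a triangle.
path_a = {frozenset((1, 2)), frozenset((2, 3))}
path_b = {frozenset((1, 3)), frozenset((3, 2))}
triangle = {frozenset((1, 2)), frozenset((1, 3)), frozenset((2, 3))}

string_a, _ = kappa_lex(3, path_a)
string_b, _ = kappa_lex(3, path_b)
string_t, _ = kappa_lex(3, triangle)
```

Both numberings of the path yield the same minimal string, and comparing against `d_string(3, 2)` reflects that the path has an independent set of size 2 while the triangle does not.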

Of course, this is only one example of a graph canonization and there may be other mappings canonizing the class of all graphs which can be evaluated more efficiently. Indeed, in a very recent work, Babai gives a graph canonization that can be computed in quasipolynomial time [12] which generalizes his previous result [11] giving an isomorphism test that runs within the same time bound.

2.2 The Weisfeiler-Leman Algorithm

One of the most fundamental subroutines in the context of the Graph Isomorphism Problem is the Weisfeiler-Leman algorithm. Originally, the algorithm was introduced in its 2-dimensional variant in 1968 by Weisfeiler and Leman [149] (see also [148]). The generalized version for arbitrary dimension k was independently defined by Babai and Mathon [7] and Immerman and Lander [82]. While the algorithm itself cannot be used to decide graph isomorphism for the class of all graphs [31], the method is commonly used as a subroutine for designing isomorphism tests. A prominent example is Babai’s quasipolynomial time algorithm [11] which crucially employs the Weisfeiler-Leman algorithm for dimension k = O(log n) (where n denotes the number of vertices of the input graphs).

2.2.1 The Algorithm

Let $G$ be a graph with vertex coloring $\chi_G\colon V(G) \to [\ell]$ and let $k \in \mathbb{N}$. The $k$-dimensional Weisfeiler-Leman algorithm is a procedure that, given a colored graph $G$, first computes an isomorphism-invariant initial coloring of the $k$-tuples of vertices and then iteratively refines this coloring in an isomorphism-invariant way.

Let $\chi_1, \chi_2\colon V^k \to C$ be colorings of $k$-tuples of vertices where $C$ is some finite set of colors. The coloring $\chi_1$ refines $\chi_2$, denoted $\chi_1 \preceq \chi_2$, if $\chi_1(\bar v) = \chi_1(\bar w)$ implies $\chi_2(\bar v) = \chi_2(\bar w)$ for all $\bar v, \bar w \in V^k$. Observe that $\chi_1 \preceq \chi_2$ if and only if the partition into color classes of $\chi_1$ refines the corresponding partition into color classes of $\chi_2$. The colorings $\chi_1$ and $\chi_2$ are equivalent, denoted $\chi_1 \equiv \chi_2$, if $\chi_1 \preceq \chi_2$ and $\chi_2 \preceq \chi_1$.

For the description of the Weisfeiler-Leman algorithm fix $k \geq 1$. The initial coloring $\chi^k_{(0)}[G]$ computed by the Weisfeiler-Leman algorithm determines for each $k$-tuple $\bar v \in V(G)^k$ the isomorphism type of the underlying ordered induced subgraph. More precisely, $\chi^k_{(0)}[G](v_1, \dots, v_k) = \chi^k_{(0)}[G](w_1, \dots, w_k)$ if and only if for all $i, j \in [k]$ it holds that $\chi_G(v_i) = \chi_G(w_i)$, $v_i = v_j$ if and only if $w_i = w_j$, and $v_iv_j \in E(G)$ if and only if $w_iw_j \in E(G)$. The initial coloring is refined by iteratively computing colorings $\chi^k_{(i)}[G]$ for $i > 0$. For $\bar v = (v_1, \dots, v_k)$ and $w \in V(G)$ let $\bar v[i/w] := (v_1, \dots, v_{i-1}, w, v_{i+1}, \dots, v_k)$ be the tuple obtained from $\bar v$ by replacing the $i$-th entry with the vertex $w$. For $k > 1$ define $\chi^k_{(i)}[G](\bar v) = (\chi^k_{(i-1)}[G](\bar v), \mathcal{M})$ where

\[ \mathcal{M} = \Bigl\{\!\!\Bigl\{ \bigl( \chi^k_{(i-1)}[G](\bar v[1/w]), \dots, \chi^k_{(i-1)}[G](\bar v[k/w]) \bigr) \Bigm| w \in V(G) \Bigr\}\!\!\Bigr\}. \]

For $k = 1$ the definition is essentially the same, but the multiset is defined only over the neighbors of $v_1$, i.e., $\mathcal{M} = \{\!\{ \chi^k_{(i-1)}[G](w) \mid w \in N_G(v_1) \}\!\}$. From the definition of the colorings it is immediately clear that $\chi^k_{(i+1)}[G] \preceq \chi^k_{(i)}[G]$. Now let $i \in \mathbb{N}$ be the minimal number such that $\chi^k_{(i)}[G] \equiv \chi^k_{(i+1)}[G]$. For this $i$, the coloring $\chi^k_{(i)}[G]$ is called the stable coloring of $G$ and is denoted by $\chi^k_{\mathrm{WL}}[G]$.

The $k$-dimensional Weisfeiler-Leman algorithm takes as input a (colored) graph $G$ and computes (a coloring that is equivalent to) $\chi^k_{\mathrm{WL}}[G]$. For every fixed $k \in \mathbb{N}$ this can be done in polynomial time.

Theorem 2.2.1 (see [82]). Let $G$ be a graph. Then an isomorphism-invariant coloring that is equivalent to $\chi^k_{\mathrm{WL}}[G]$ can be computed in time $O(n^{k+1} \log n)$.

For the 1-dimensional Weisfeiler-Leman algorithm, which is also referred to as the Color Refinement algorithm, the running time can be improved to almost linear in the number of vertices and edges (see, e.g., [24]).

Theorem 2.2.2. Let $G$ be a graph. Then an isomorphism-invariant coloring that is equivalent to $\chi^1_{\mathrm{WL}}[G]$ can be computed in time $O((n + m) \log n)$ where $n$ denotes the number of vertices and $m$ the number of edges of $G$.
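To make the refinement step concrete, the following is a minimal sketch of the Color Refinement algorithm. It is a naive implementation that recomputes all colors in every round, so it does not achieve the almost linear running time of Theorem 2.2.2. The function name `color_refinement` and the adjacency-dictionary input format are assumptions made for this illustration.

```python
def color_refinement(adj, colors):
    """Naive 1-dimensional Weisfeiler-Leman (Color Refinement).

    adj: dict mapping each vertex to the set of its neighbours.
    colors: dict mapping each vertex to an initial (sortable) color.
    Returns a dict vertex -> integer color of a stable coloring.
    """
    # Normalise the initial colors to integers 0, 1, 2, ...
    palette = {c: i for i, c in enumerate(sorted(set(colors.values())))}
    chi = {v: palette[colors[v]] for v in adj}
    while True:
        # New color = (old color, multiset of the neighbours' old colors).
        signature = {
            v: (chi[v], tuple(sorted(chi[w] for w in adj[v]))) for v in adj
        }
        # Rename signatures by their rank; this keeps the names
        # isomorphism-invariant and the colors small.
        palette = {s: i for i, s in enumerate(sorted(set(signature.values())))}
        refined = {v: palette[signature[v]] for v in adj}
        # The partition only ever gets finer, so an unchanged number of
        # color classes means that the coloring is stable.
        if len(set(refined.values())) == len(set(chi.values())):
            return refined
        chi = refined


# Path on four vertices with a uniform initial coloring.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
stable = color_refinement(path, {v: 0 for v in path})
```

On the path, the stable coloring separates the two end vertices from the two inner vertices and makes no further distinction.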

A common application is to use the Weisfeiler-Leman algorithm as an (incomplete) isomorphism test. The $k$-dimensional Weisfeiler-Leman algorithm distinguishes two graphs $G$ and $H$ if there is a color $c$ such that

\[ \bigl|\{ \bar v \in V(G)^k \mid \chi^k_{\mathrm{WL}}[G \uplus H](\bar v) = c \}\bigr| \neq \bigl|\{ \bar w \in V(H)^k \mid \chi^k_{\mathrm{WL}}[G \uplus H](\bar w) = c \}\bigr| \]

where $G \uplus H$ denotes the disjoint union of the graphs $G$ and $H$. If the $k$-dimensional Weisfeiler-Leman algorithm distinguishes $G$ and $H$ then $G \not\cong H$. Two graphs $G$ and $H$ are equivalent with respect to $k$-dimensional Weisfeiler-Leman, denoted $G \simeq_k H$, if they are not distinguished by the $k$-dimensional Weisfeiler-Leman algorithm. Note that, in general, one cannot conclude that $G \cong H$ in this case. A graph $G$ is identified by the $k$-dimensional Weisfeiler-Leman algorithm if, for all graphs $H$, it holds that $G \simeq_k H$ if and only if $G \cong H$. Following Grohe [64], the Weisfeiler-Leman dimension of a graph $G$, denoted $\dim_{\mathrm{WL}}(G)$, is the smallest number $k \in \mathbb{N}$ such that the $k$-dimensional Weisfeiler-Leman algorithm identifies $G$.
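The distinguishing test can be sketched directly from the definitions: compute the stable coloring on the disjoint union and compare the color histograms of the $k$-tuples of the two graphs. The code below is a naive illustration for uncolored graphs, not an implementation meeting the bound of Theorem 2.2.1. The names `wl_stable`, `tag` and `distinguishes` are hypothetical.

```python
from collections import Counter
from itertools import product


def wl_stable(vertices, edges, k):
    """Naive k-dimensional Weisfeiler-Leman on an uncolored graph.

    vertices: list of hashable vertices; edges: set of frozensets {u, v}.
    Returns a dict mapping each k-tuple to an integer color.
    """
    def atomic_type(t):
        # Equality and adjacency pattern of the ordered tuple.
        return tuple(
            (t[i] == t[j], frozenset((t[i], t[j])) in edges)
            for i in range(k) for j in range(k)
        )

    def rename(sig):
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        return {t: palette[sig[t]] for t in sig}

    adj = {v: [w for w in vertices if frozenset((v, w)) in edges]
           for v in vertices}
    chi = rename({t: atomic_type(t) for t in product(vertices, repeat=k)})
    while True:
        sig = {}
        for t in chi:
            if k == 1:
                # For k = 1 the multiset ranges over the neighbours only.
                multiset = sorted(chi[(w,)] for w in adj[t[0]])
            else:
                multiset = sorted(
                    tuple(chi[t[:i] + (w,) + t[i + 1:]] for i in range(k))
                    for w in vertices
                )
            sig[t] = (chi[t], tuple(multiset))
        refined = rename(sig)
        if len(set(refined.values())) == len(set(chi.values())):
            return refined
        chi = refined


def tag(graph, label):
    """Tag the vertices of a graph so that two graphs become disjoint."""
    vertices, edges = graph
    return ([(label, v) for v in vertices],
            {frozenset((label, v) for v in e) for e in edges})


def distinguishes(g, h, k):
    """Check whether k-dimensional WL distinguishes g and h."""
    (vg, eg), (vh, eh) = tag(g, "G"), tag(h, "H")
    chi = wl_stable(vg + vh, eg | eh, k)
    count_g = Counter(chi[t] for t in product(vg, repeat=k))
    count_h = Counter(chi[t] for t in product(vh, repeat=k))
    return count_g != count_h


# The cycle on six vertices versus two disjoint triangles.
c6 = (list(range(6)), {frozenset((i, (i + 1) % 6)) for i in range(6)})
two_triangles = (
    list(range(6)),
    {frozenset((a + i, a + (i + 1) % 3)) for a in (0, 3) for i in range(3)},
)
```

Both example graphs are 2-regular, so Color Refinement does not distinguish them, whereas the 2-dimensional algorithm does.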

Observation 2.2.3. For every graph G it holds that dimWL(G) ≤ |V (G)| − 1.

Let $\mathcal{C}$ be a class of graphs. The Weisfeiler-Leman dimension of $\mathcal{C}$, denoted $\dim_{\mathrm{WL}}(\mathcal{C})$, is the smallest number $\ell \in \mathbb{N} \cup \{\infty\}$ such that $\dim_{\mathrm{WL}}(G, \chi_G) \leq \ell$ for all colored graphs $(G, \chi_G)$ such that $G \in \mathcal{C}$.² Note that if $\dim_{\mathrm{WL}}(\mathcal{C})$ is finite then the Graph Isomorphism Problem for $\mathcal{C}$ can be solved in polynomial time. Indeed, for many important graph classes it is known that their Weisfeiler-Leman dimension is finite. This includes for example planar graphs [61, 90], graphs of bounded tree-width [66], and more generally every graph class that excludes a fixed graph as a minor [63] (see also [64]). On the other hand, it is also known that the class of all graphs has infinite Weisfeiler-Leman dimension [31]. Both topics are further discussed in Chapters 3 and 4.

The Weisfeiler-Leman algorithm can also be used to tackle the Canonization Problem. While the coloring computed by the Weisfeiler-Leman algorithm cannot be used directly to compute a graph canonization, the algorithm can be used as a subroutine to design a canonization algorithm. The basic idea is to utilize an ordering on the colors computed by the algorithm. To formalize this idea we first need to introduce another concept.

Let $G$ be a graph. The $k$-dimensional Weisfeiler-Leman algorithm determines orbits of $G$ if, for every $v \in V(G)$, every graph $H$ and every $w \in V(H)$ such that $\chi^k_{\mathrm{WL}}[G](v, \dots, v) = \chi^k_{\mathrm{WL}}[H](w, \dots, w)$, there is an isomorphism $\varphi\colon G \cong H$ such that $\varphi(v) = w$.

Theorem 2.2.4. Let C be a class of graphs of Weisfeiler-Leman dimension at most k. Then the (k + 1)-dimensional Weisfeiler-Leman algorithm determines orbits of all graphs G ∈ C.

² This definition slightly deviates from the standard definition which usually considers only uncolored graphs. However, the present definition is often more convenient. For example, there is a generic polynomial-time canonization algorithm for graph classes of bounded Weisfeiler-Leman dimension.

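Proof. Let $G \in \mathcal{C}$ and let $H$ be a second graph. Also let $v \in V(G)$ and $w \in V(H)$ such that $\chi^{k+1}_{\mathrm{WL}}[G](v, \dots, v) = \chi^{k+1}_{\mathrm{WL}}[H](w, \dots, w)$. Then $(G, v) \simeq_k (H, w)$. Since the $k$-dimensional Weisfeiler-Leman algorithm identifies all colored graphs $(G, \chi_G)$ with $G \in \mathcal{C}$, and in particular $(G, v)$, this implies that $(G, v) \cong (H, w)$. So there is an isomorphism $\varphi\colon G \cong H$ such that $\varphi(v) = w$.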

Theorem 2.2.5. Let $\mathcal{C}$ be a class of graphs of Weisfeiler-Leman dimension at most $k$. Then there is an algorithm canonizing graphs $G \in \mathcal{C}$ in time $O(n^{k+3} \log n)$.

Algorithm 1: Canonization Algorithm for graph class $\mathcal{C}$

Input: Graph $G \in \mathcal{C}$ with vertex coloring $\chi_G$
Output: $\kappa(G)$

$n := |V(G)|$
for $i = 1, \dots, n$ do
    compute $\chi_{G,i} := \chi^{k+1}_{\mathrm{WL}}[G, \chi_G, v_1, \dots, v_{i-1}]$
    $v_i := \operatorname{argmin}_{v \in V(G) \setminus \{v_1, \dots, v_{i-1}\}} \chi_{G,i}(v)$
end
return $([n], \{ij \mid v_iv_j \in E(G)\}, i \mapsto \chi_G(v_i))$

Proof. Let $\kappa\colon \mathcal{C} \to \mathcal{G}_{\mathbb{N}}$ be the function computed by Algorithm 1. It is first argued that $\kappa$ canonizes the graph class $\mathcal{C}$. Let $G \in \mathcal{C}$. Clearly, $\varphi\colon V(G) \to [n]\colon v_i \mapsto i$ is an isomorphism from $G$ to $\kappa(G)$.

So let $H \in \mathcal{C}$ be a second graph such that $G \cong H$. Also let $v_1, \dots, v_n$ be the sequence of vertices computed by Algorithm 1 for the graph $G$ and let $w_1, \dots, w_n$ be the corresponding sequence for $H$. It is proved by induction on $i \in \{0, \dots, n\}$ that $(G, v_1, \dots, v_i) \cong (H, w_1, \dots, w_i)$. The base step $i = 0$ is exactly the assumption $G \cong H$. So let $i \geq 1$ and let $\varphi\colon (G, v_1, \dots, v_{i-1}) \cong (H, w_1, \dots, w_{i-1})$. Then $(G, \chi_{G,i}) \cong (H, \chi_{H,i})$ and $\chi_{G,i}(v_i) = \chi_{H,i}(w_i)$. Since the $(k+1)$-dimensional Weisfeiler-Leman algorithm determines orbits of all graphs $G \in \mathcal{C}$ by Theorem 2.2.4, there is an isomorphism $\varphi\colon (G, \chi_{G,i}) \cong (H, \chi_{H,i})$ such that $\varphi(v_i) = w_i$. But this means $(G, v_1, \dots, v_i) \cong (H, w_1, \dots, w_i)$. By the induction principle, $\varphi\colon V(G) \to V(H)\colon v_i \mapsto w_i$ is an isomorphism from $G$ to $H$. Thus, $\kappa(G) = \kappa(H)$.

The bound on the running time is immediately clear as the algorithm performs $n$ calls to the $(k+1)$-dimensional Weisfeiler-Leman algorithm, which runs in time $O(n^{k+2} \log n)$ (see Theorem 2.2.1).
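Algorithm 1 can be sketched in a few lines on top of a naive Weisfeiler-Leman implementation. The sketch below is illustrative only: it does not meet the running time of Theorem 2.2.5, and its output is a canonical form only under the hypothesis of the theorem, namely that the input graph belongs to a class of Weisfeiler-Leman dimension at most $k$. The function names, the encoding of individualized vertices by fresh colors `(1, j)`, and the arbitrary tie-breaking in `min` are choices made for this illustration.

```python
from itertools import product


def wl_vertex_colors(vertices, edges, coloring, k):
    """Stable k-WL coloring (k >= 2) restricted to the tuples (v, ..., v)."""
    def atomic_type(t):
        return (
            tuple(coloring[v] for v in t),
            tuple((t[i] == t[j], frozenset((t[i], t[j])) in edges)
                  for i in range(k) for j in range(k)),
        )

    def rename(sig):
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        return {t: palette[sig[t]] for t in sig}

    chi = rename({t: atomic_type(t) for t in product(vertices, repeat=k)})
    while True:
        sig = {
            t: (chi[t], tuple(sorted(
                tuple(chi[t[:i] + (w,) + t[i + 1:]] for i in range(k))
                for w in vertices)))
            for t in chi
        }
        refined = rename(sig)
        if len(set(refined.values())) == len(set(chi.values())):
            return {v: refined[(v,) * k] for v in vertices}
        chi = refined


def canonize(vertices, edges, coloring, k):
    """Sketch of Algorithm 1 for a class of WL dimension at most k.

    coloring maps vertices to non-negative integers.
    """
    order = []  # v_1, ..., v_n in the order chosen by the algorithm
    for _ in range(len(vertices)):
        # Individualize v_1, ..., v_{i-1}: the j-th chosen vertex gets the
        # fresh color (1, j); an original color c becomes (0, c).
        current = {v: (0, coloring[v]) for v in vertices}
        for j, v in enumerate(order):
            current[v] = (1, j)
        chi = wl_vertex_colors(vertices, edges, current, k + 1)
        rest = [v for v in vertices if v not in order]
        # Ties are broken arbitrarily; by Theorem 2.2.4 vertices of equal
        # color lie in the same orbit, so the choice does not matter.
        order.append(min(rest, key=lambda v: chi[v]))
    position = {v: i + 1 for i, v in enumerate(order)}
    canon_edges = frozenset(frozenset(position[v] for v in e) for e in edges)
    canon_colors = tuple(coloring[v] for v in order)
    return len(vertices), canon_edges, canon_colors


# Two numberings of the path on four vertices, and a star (all forests).
path_a = ([0, 1, 2, 3], {frozenset(e) for e in [(0, 1), (1, 2), (2, 3)]})
path_b = (list("abcd"), {frozenset(e) for e in ["bd", "da", "ac"]})
star = ([0, 1, 2, 3], {frozenset(e) for e in [(0, 1), (0, 2), (0, 3)]})


def uniform(graph):
    return {v: 0 for v in graph[0]}


canon_a = canonize(*path_a, uniform(path_a), 1)
canon_b = canonize(*path_b, uniform(path_b), 1)
canon_star = canonize(*star, uniform(star), 1)
```

The example uses $k = 1$, so each round calls the 2-dimensional algorithm; the two numberings of the path receive the same canonical form, which differs from that of the star.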

The Weisfeiler-Leman procedure has various equivalent characterizations that connect the algorithm to other areas of computer science. The expressive power of the algorithm can be characterized in terms of bounded variable fragments of first order logic with counting quantifiers and in terms of certain pebble games (see [82, 31]). Also, more recently, it has been observed that there are correspondences to Sherali-Adams relaxations of certain linear programs [5, 71] and the strength of the algorithm can be characterized by certain homomorphism counts from graphs of bounded tree-width [44]. Moreover, in the last couple of years, the algorithm has been exploited in a graph learning context [139] (see also [97, 125]). In this context, the power of the Color Refinement algorithm can be characterized by the power of certain graph neural networks [120] and extensions of the algorithms have been proposed based on higher dimensions of the Weisfeiler-Leman algorithm (see, e.g., [119, 120]).

2.2.2 Pebble Games

For this thesis a characterization of the Weisfeiler-Leman algorithm in terms of pebble games is most relevant. Indeed, in many cases, it is much easier to argue that two graphs are distinguished by the Weisfeiler-Leman algorithm by exploiting the characterization by pebble games.

Let $k \in \mathbb{N}$. For two colored graphs $(G, \chi_G)$ and $(H, \chi_H)$ on the same number of vertices the bijective $k$-pebble game $\mathrm{BP}_k(G, \chi_G, H, \chi_H)$ is defined as follows:

• The game has two players called Spoiler and Duplicator.

• The game proceeds in rounds. Each round is associated with a pair of positions $(\bar v, \bar w)$ with $\bar v \in V(G)^\ell$ and $\bar w \in V(H)^\ell$ where $0 \leq \ell \leq k$.

• The initial position of the game is $((), ())$ (the pair of empty tuples).

• Each round consists of the following steps. Suppose the current position of the game is $(\bar v, \bar w) = ((v_1, \dots, v_\ell), (w_1, \dots, w_\ell))$. First, Spoiler chooses whether to remove a pair of pebbles or to play a new pair of pebbles. The first option is only possible if $\ell > 0$ and the latter option is only possible if $\ell < k$. If Spoiler wishes to remove a pair of pebbles he picks some $i \in [\ell]$ and the game moves to position $(\bar v \setminus i, \bar w \setminus i)$ where $\bar v \setminus i := (v_1, \dots, v_{i-1}, v_{i+1}, \dots, v_\ell)$ ($\bar w \setminus i$ is defined in the same way). Otherwise the following steps are performed.

  (D) Duplicator picks a bijection $f\colon V(G) \to V(H)$.

  (S) Spoiler chooses $v \in V(G)$ and sets $w := f(v)$.

  The new position is then $((v_1, \dots, v_\ell, v), (w_1, \dots, w_\ell, w))$.

Spoiler wins the play if for the current position $((v_1, \dots, v_\ell), (w_1, \dots, w_\ell))$ the induced graphs are not isomorphic. More precisely, Spoiler wins if there is an $i \in [\ell]$ such that $\chi_G(v_i) \neq \chi_H(w_i)$ or there are $i, j \in [\ell]$ such that $v_i = v_j \not\Leftrightarrow w_i = w_j$ or $v_iv_j \in E(G) \not\Leftrightarrow w_iw_j \in E(H)$. If the play never ends Duplicator wins.

We say that Spoiler (resp. Duplicator) wins the bijective $k$-pebble game $\mathrm{BP}_k(G, \chi_G, H, \chi_H)$ if Spoiler (resp. Duplicator) has a winning strategy for the game. Also, if $G$ and $H$ have a different number of vertices, Spoiler immediately wins the game. Moreover, for positions $(\bar v, \bar w) \in V(G)^\ell \times V(H)^\ell$, $\ell \leq k$, Spoiler (resp. Duplicator) wins the game $\mathrm{BP}_k(G, \chi_G, H, \chi_H)$ from position $(\bar v, \bar w)$ if Spoiler (resp. Duplicator) has a winning strategy in the game $\mathrm{BP}_k(G, \chi_G, H, \chi_H)$ started from initial position $(\bar v, \bar w)$.

I remark that the definition of the pebble game provided above does not match the standard definition of the game, which combines removing a pebble and placing a new pebble into a single move of the game. The standard definition has the advantage that the number of moves required for Spoiler to win the game (provided he has a winning strategy) exactly corresponds to the number of iterations the Weisfeiler-Leman algorithm performs until stabilization. However, for the purpose of providing winning strategies for explicit families of input graphs (which is the main focus in this thesis) the above variant is more convenient.

The next theorem connects the Weisfeiler-Leman algorithm and bijective pebble games.

Theorem 2.2.6 ([31, 76]). Let $G, H$ be two graphs and let $\bar v \in V(G)^k$ and $\bar w \in V(H)^k$. Then $\chi^k_{\mathrm{WL}}[G](\bar v) = \chi^k_{\mathrm{WL}}[H](\bar w)$ if and only if Duplicator wins the pebble game $\mathrm{BP}_{k+1}(G, H)$ from the position $(\bar v, \bar w)$.

Corollary 2.2.7. Let $G, H$ be two graphs. Then $G \simeq_k H$ if and only if Duplicator wins the pebble game $\mathrm{BP}_{k+1}(G, H)$.

2.2.3 Connection to Logic

Another characterization of the Weisfeiler-Leman algorithm can be formulated in terms of first-order logic with counting quantifiers. As usual, first order logic (FO) is built inductively from the atomic formulas. As this thesis is primarily concerned with graphs we restrict ourselves to the case where the vocabulary only consists of a single binary relation symbol $E$. In this case the atomic FO-formulas are of the form $x = y$ and $Exy$. First order formulas are built from atomic formulas in an inductive fashion using the Boolean operators $\wedge$, $\vee$ and $\neg$, existential quantifiers $\exists x \varphi(x)$ and universal quantifiers $\forall x \varphi(x)$. First order logic with counting quantifiers, denoted by $\mathsf{C}$, is the extension of first order logic by counting quantifiers of the form $\exists^{\geq \ell} x\, \varphi(x)$. A graph $G$ satisfies such a formula if there are at least $\ell$ distinct $v \in V(G)$ such that $G \models \varphi(v)$. Note that first order logic with counting quantifiers has the same expressive power as first order logic since each counting quantifier $\exists^{\geq \ell} x\, \varphi(x)$ can be replaced by

\[ \exists x_1 \dots \exists x_\ell \Bigl( \bigwedge_{i \neq j \in [\ell]} x_i \neq x_j \;\wedge\; \bigwedge_{i \in [\ell]} \varphi(x_i) \Bigr). \]

Let $\mathsf{L}^k$ be the restriction of FO to formulas that use at most $k$ distinct variables (each variable may be requantified multiple times). Also, let $\mathsf{C}^k$ be the restriction of $\mathsf{C}$ to formulas that use at most $k$ distinct variables. Note that, while FO and $\mathsf{C}$ have the same expressive power, this is not the case for $\mathsf{L}^k$ and $\mathsf{C}^k$.

Before connecting the logic $\mathsf{C}^k$ to the Weisfeiler-Leman algorithm we first provide some examples for the logic $\mathsf{C}^k$.
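As a small worked illustration (added here and not taken from the cited literature), consider the property that a vertex $x$ has at least three neighbors. With a counting quantifier two variables suffice, whereas the direct translation into FO described above introduces three fresh variables:

\[
\exists^{\geq 3} y\, E(x, y) \in \mathsf{C}^2,
\qquad
\exists y_1 \exists y_2 \exists y_3 \Bigl( \bigwedge_{i \neq j \in [3]} y_i \neq y_j \;\wedge\; \bigwedge_{i \in [3]} E(x, y_i) \Bigr) \in \mathsf{L}^4 .
\]

In general, the translation of a quantifier $\exists^{\geq \ell}$ uses $\ell$ additional variables, so it does not preserve the number of variables. This is consistent with the remark that the variable-bounded fragments differ in expressive power.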

Example 2.2.8. Let $G$ be a graph and $v, w \in V(G)$. A walk from $v$ to $w$ is a sequence $v = u_0, u_1, \dots, u_\ell = w$ such that $u_{i-1}u_i \in E(G)$ for all $i \in [\ell]$ and $u_i \neq w$ for all $i \in [\ell - 1]$.³ In this case $\ell$ is the length of the walk. Consider the formulas $\mathrm{walk}_0(x, y) := x = y$, $\mathrm{walk}_1(x, y) := Exy$, and

\[ \mathrm{walk}_\ell(x, y) := \exists z \bigl( z \neq y \wedge E(x, z) \wedge \mathrm{walk}_{\ell-1}(z, y) \bigr) \]

for $\ell \geq 2$. Then $G \models \mathrm{walk}_\ell(v, w)$ if and only if there is a walk of length $\ell$ from $v$ to $w$. By reusing variables, $\mathrm{walk}_\ell(x, y) \in \mathsf{C}^3$. Moreover, the formula

\[ \mathrm{walk}_G(x, y) := \bigvee_{\ell \in [|V(G)|]} \mathrm{walk}_\ell(x, y) \]

states that there is a walk from $v$ to $w$ in $G$.

For later reference, also consider the following generalization. Let $s_1, \dots, s_k \in V(G)$ be additional vertices. Then there exists a formula $\mathrm{walk}_G(x, y, z_1, \dots, z_k) \in \mathsf{C}^{k+3}$ such that $G \models \mathrm{walk}_G(v, w, s_1, \dots, s_k)$ if and only if there is a walk from $v$ to $w$ in $G - \{s_1, \dots, s_k\}$.
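The recursive definition of $\mathrm{walk}_\ell$ translates almost literally into a program that evaluates the formula on a concrete graph. The following minimal sketch assumes an adjacency-dictionary representation; the function name `has_walk` is hypothetical.

```python
from functools import lru_cache


def has_walk(adj, x, y, length):
    """Evaluate walk_length(x, y) by following the recursive definition.

    adj: dict mapping each vertex to the set of its neighbours.
    Inner vertices of the walk must differ from the end vertex y.
    """
    @lru_cache(maxsize=None)
    def walk(a, l):
        if l == 0:
            return a == y
        if l == 1:
            return y in adj[a]
        # Some neighbour z != y of a must start a walk of length l - 1.
        return any(z != y and walk(z, l - 1) for z in adj[a])

    return walk(x, length)


# Path 0 - 1 - 2.
path = {0: {1}, 1: {0, 2}, 2: {1}}
```

On this path, `has_walk(path, 0, 2, 2)` holds, while `has_walk(path, 0, 1, 3)` fails: the sequences 0, 1, 0, 1 and 0, 1, 2, 1 visit the end vertex 1 too early, which the additional condition on walks rules out.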

The last example does not use any counting quantifiers. It turns out that by utilizing the counting quantifiers one can not only express whether there is a walk of a certain length between two vertices, but one can also count the number of such walks.

³ The last condition is usually not part of the definition of a walk. This part is added mainly for technical reasons.

Example 2.2.9. As in the last example, let $G$ be a graph and $v, w \in V(G)$. Consider the formulas $\mathrm{walk}^{n,1}_0(x, y) := x = y$, $\mathrm{walk}^{n,0}_0(x, y) := x \neq y$ and $\mathrm{walk}^{n,r}_0(x, y) := x \neq x$ for all $r \geq 2$, and $\mathrm{walk}^{n,1}_1(x, y) := E(x, y)$, $\mathrm{walk}^{n,0}_1(x, y) := \neg E(x, y)$ and $\mathrm{walk}^{n,r}_1(x, y) := x \neq x$ for all $r \geq 2$, and

\[ \mathrm{walk}^{n,r}_\ell(x, y) := \bigvee_{d \in [n]} \Biggl( \exists^{=d} z\, \bigl( z \neq y \wedge E(x, z) \bigr) \;\wedge \bigvee_{\substack{\sum_j s_j = d \\ \sum_j s_j r_j = r}} \bigwedge_{j} \exists^{=s_j} z\, \bigl( E(x, z) \wedge \mathrm{walk}^{n,r_j}_{\ell-1}(z, y) \bigr) \Biggr) \]

for $\ell \geq 2$. Let $n := |V(G)|$. Then $G \models \mathrm{walk}^{n,r}_\ell(v, w)$ if and only if there are exactly $r$ walks of length $\ell$ from $v$ to $w$. By reusing variables, $\mathrm{walk}^{n,r}_\ell(x, y) \in \mathsf{C}^3$.

Additionally, given vertices $s_1, \dots, s_k$, one can also count walks of a certain length in the graph $G - \{s_1, \dots, s_k\}$ in the logic $\mathsf{C}^{k+3}$.

The following theorem connects the bijective $k$-pebble game to the logic $\mathsf{C}^k$.

Theorem 2.2.10 (Hella [76]). Let $G, H$ be two graphs and suppose $k \geq \ell$. Also let $v_1, \dots, v_\ell \in V(G)$ and $w_1, \dots, w_\ell \in V(H)$. Then Spoiler wins the game $\mathrm{BP}_k(G, H)$ starting from position $((v_1, \dots, v_\ell), (w_1, \dots, w_\ell))$ if and only if there is a formula $\varphi(x_1, \dots, x_\ell) \in \mathsf{C}^k$ such that $G \models \varphi(v_1, \dots, v_\ell)$ and $H \not\models \varphi(w_1, \dots, w_\ell)$.

For two graphs $G, H$ we define $G \equiv_{\mathsf{C}^k} H$ if for every sentence $\varphi \in \mathsf{C}^k$ it holds that $G \models \varphi$ if and only if $H \models \varphi$.

Corollary 2.2.11. Let $G, H$ be two graphs. Then Duplicator wins the game $\mathrm{BP}_k(G, H)$ if and only if $G \equiv_{\mathsf{C}^k} H$.

2.3 Individualization-Refinement

2.3.1 The Basic Paradigm

The k-dimensional Weisfeiler-Leman algorithm provides a polynomial time algorithm for every fixed number k. However, from the perspective of designing practically efficient algorithms for the Graph Isomorphism Problem there are several downsides. First, the k-dimensional Weisfeiler-Leman algorithm fails to decide isomorphism on its own for every constant k. Actually, there are graphs whose Weisfeiler-Leman dimension is linear in their number of vertices [31]. But more importantly, while it runs in polynomial time for constant k, the Weisfeiler-Leman algorithm requires an excessive amount of memory. This makes the algorithm rather inefficient in practice even for small values of k.

Another approach to the Graph Isomorphism Problem, that works particularly well in practice, is the Individualization-Refinement paradigm (I/R paradigm). This paradigm is implemented by all current state-of-the-art isomorphism solvers. This includes the software packages Nauty/Traces [112, 113], Bliss [85, 86], Conauto [105] and Saucy [35]. Also, the I/R paradigm is commonly used in a theoretical context (see, e.g., [9, 14, 23, 141, 142]) which includes Babai’s quasipolynomial time algorithm for graph isomorphism [11].

An extensive description of the paradigm of individualization-refinement algorithms can be found in [113]. The basic strategy of these algorithms is to capture information about the structure of a graph by coloring the vertices. An initially uniform coloring is refined in an isomorphism-invariant manner whenever feasible, followed by artificially distinguishing vertices in a form of backtracking. This approach can be used to decide isomorphism of two graphs, but also for computing the automorphism group or a canonization of a single graph.

A refinement operator is a function ref that takes a colored graph $G$ and refines the coloring in an isomorphism-invariant fashion. More formally, a refinement operator ref takes a pair $(G, \chi_G)$ and outputs $\mathrm{ref}(G, \chi_G) \preceq \chi_G$. Also, ref is isomorphism-invariant meaning that if $\varphi\colon (G, \chi_G) \cong (H, \chi_H)$ is an isomorphism then also $\varphi\colon (G, \mathrm{ref}(G, \chi_G)) \cong (H, \mathrm{ref}(H, \chi_H))$. A typical choice for such a refinement is the 1-dimensional Weisfeiler-Leman algorithm. In order to obtain a practically efficient algorithm it is vital that the refinement operator can be evaluated quickly since this subroutine is called at every node of the backtracking tree. Recall that for the 1-dimensional Weisfeiler-Leman algorithm this can be done in almost linear time (see Theorem 2.2.2).

In case the refinement operator ref produces a discrete coloring on the graph $G$ it is trivial to check whether this graph is isomorphic to another graph $H$. Indeed, the refinement of $H$ must also be discrete and there is at most one color preserving bijection between the vertex sets which can be trivially checked for being an isomorphism. When choosing the Color Refinement algorithm as a refinement operator, this is already the case asymptotically almost surely [17].

However, if the coloring produced by ref is not discrete one needs to do more work. In this case the strategy is to select a color class of the coloring produced by ref, usually called a cell, and to individualize a single vertex from that class. As before, individualization means to refine the coloring by giving the vertex a new singleton color. Since such an operation is not necessarily isomorphism-invariant, we branch over all choices of this vertex within the chosen cell. To the updated coloring with the newly individualized vertex, the algorithm applies the refinement operator again and proceeds in a recursive fashion.

More formally, an algorithm implementing the I/R paradigm works as follows. Let $G$ be a graph and $\chi_G\colon V(G) \to C$ a coloring of the vertices. A cell selector is an isomorphism-invariant function sel which maps $(G, \chi_G)$ to $\mathrm{sel}(G, \chi_G) \in C$ with $|\chi_G^{-1}(\mathrm{sel}(G, \chi_G))| \geq 2$ if such a color exists. If the coloring $\chi_G$ is discrete the cell selector returns $\mathrm{sel}(G, \chi_G) = \bot$. Here, isomorphism-invariant means that if $(G, \chi_G) \cong (H, \chi_H)$ then the cell selector chooses the same color, i.e., $\mathrm{sel}(G, \chi_G) = \mathrm{sel}(H, \chi_H)$. The performance of an individualization-refinement algorithm can drastically depend on the cell selection strategy (see, e.g., [113, 118]).

Let $G$ be a graph with an initial coloring $\chi_G\colon V(G) \to C$. Let sel be a cell selector and ref a refinement operator. With this, a backtracking tree $T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G]$ is defined as follows. The root of the tree is labeled with the empty sequence $\varepsilon$. Let $\bar v = (v_1, \dots, v_\ell)$ be a node of the search tree. Let $\chi := \mathrm{ref}(G, \chi_G, \bar v)$ be the coloring computed by the refinement operator after individualizing the vertices from the current sequence and let $c := \mathrm{sel}(G, \chi)$ be the color selected by the cell selector. If $c = \bot$ then $\bar v$ is a leaf of the search tree and the coloring $\chi$ is discrete. Otherwise, for each $w \in \chi^{-1}(c)$, there is a child node labeled with $(v_1, \dots, v_\ell, w)$. The vertices of the search tree are referred to as nodes and they are identified with the sequence of vertices they are labeled with.

Together a cell selector and a refinement operator are sufficient to build a correct isomorphism test. Indeed, two graphs are isomorphic if and only if they have isomorphic leaves in their search trees. More precisely, $(G, \chi_G) \cong (H, \chi_H)$ if and only if there are leaves $\bar v \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])$ and $\bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[H, \chi_H])$ such that $(G, \chi_G, \bar v) \cong (H, \chi_H, \bar w)$. Since $\mathrm{ref}(G, \chi_G, \bar v)$ and $\mathrm{ref}(H, \chi_H, \bar w)$ are discrete colorings it is easy to check whether $(G, \chi_G, \bar v) \cong (H, \chi_H, \bar w)$. Actually, for an isomorphism $\varphi\colon (G, \chi_G) \cong (H, \chi_H)$, if $\bar v \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])$ is a leaf then $\varphi(\bar v)$ is a leaf of $T^{\mathrm{ref},\mathrm{sel}}[H, \chi_H]$. Moreover, $\varphi$ is the unique isomorphism from $(G, \chi_G, \bar v)$ to $(H, \chi_H, \varphi(\bar v))$. This way, one can also compute the automorphism group of $(G, \chi_G)$ by comparing all pairs of leaves of $T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G]$.

Finally, the I/R paradigm can also be used to build a canonization algorithm. Towards this end, the only thing that is required is a graph invariant inv that maps colored graphs $(G, \chi_G)$ to a totally ordered set $X$ such that for all discretely colored graphs $(G, \chi_G)$ and $(H, \chi_H)$ it holds that $(G, \chi_G) \cong (H, \chi_H)$ if and only if $\mathrm{inv}(G, \chi_G) = \mathrm{inv}(H, \chi_H)$. A complete invariant for discretely colored graphs is easy to obtain using the total order on the vertices coming from the vertex colors (see, e.g., Example 2.1.6).
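The basic paradigm described above can be sketched as follows, with Color Refinement as the refinement operator and "smallest non-singleton color" as the cell selector. The sketch explores the full search tree without any pruning, and it is not how solvers such as Nauty or Bliss are implemented. All function names and the certificate encoding are assumptions made for this illustration.

```python
def refine(adj, colors):
    """Color Refinement with isomorphism-invariant integer color names."""
    palette = {c: i for i, c in enumerate(sorted(set(colors.values())))}
    chi = {v: palette[colors[v]] for v in adj}
    while True:
        sig = {v: (chi[v], tuple(sorted(chi[w] for w in adj[v]))) for v in adj}
        palette = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: palette[sig[v]] for v in adj}
        if len(set(new.values())) == len(set(chi.values())):
            return new
        chi = new


def select_cell(chi):
    """Return the smallest color with at least two vertices, or None."""
    sizes = {}
    for c in chi.values():
        sizes[c] = sizes.get(c, 0) + 1
    candidates = [c for c, size in sizes.items() if size >= 2]
    return min(candidates) if candidates else None


def leaf_certificates(adj, sequence=()):
    """Collect a certificate for every leaf below the node `sequence`."""
    # Individualize the vertices of the sequence with fresh colors (1, j).
    colors = {v: (0, 0) for v in adj}
    for j, v in enumerate(sequence):
        colors[v] = (1, j)
    chi = refine(adj, colors)
    cell = select_cell(chi)
    if cell is None:
        # Discrete coloring: renaming every vertex by its color gives a
        # complete invariant of the discretely colored graph.
        edges = frozenset(
            frozenset((chi[v], chi[w])) for v in adj for w in adj[v]
        )
        return {(len(adj), edges)}
    leaves = set()
    for w in [v for v in adj if chi[v] == cell]:
        leaves |= leaf_certificates(adj, sequence + (w,))
    return leaves


def isomorphic(adj_g, adj_h):
    """Two graphs are isomorphic iff they share a leaf certificate."""
    return bool(leaf_certificates(adj_g) & leaf_certificates(adj_h))


def canonical_form(adj):
    """Pick the leaf certificate that is minimal in a fixed total order."""
    return min(
        leaf_certificates(adj),
        key=lambda cert: (cert[0], sorted(sorted(e) for e in cert[1])),
    )


def cycle(vertices):
    """Adjacency dictionary of the cycle visiting `vertices` in order."""
    adj = {v: set() for v in vertices}
    for i, v in enumerate(vertices):
        w = vertices[(i + 1) % len(vertices)]
        adj[v].add(w)
        adj[w].add(v)
    return adj


c6 = cycle(list(range(6)))
c6_relabeled = cycle(list("acebdf"))
two_triangles = {**cycle([0, 1, 2]), **cycle([3, 4, 5])}
```

On these inputs, the two numberings of the 6-cycle share leaf certificates and hence a canonical form, while the 6-cycle and the two triangles have no common leaf.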

2.3.2 Pruning with Invariants

While this basic framework already works quite well for a number of graphs there are two further ingredients that are crucial for the efficiency of practical individualization-refinement algorithms. These are the use of node invariants and the exploitation of automorphisms.

Let $X$ be a totally ordered set. A node invariant is an isomorphism-invariant function inv taking a colored graph $(G, \chi_G)$ and a sequence $\bar v = (v_1, \dots, v_\ell) \in V(G)^\ell$ to an element $\mathrm{inv}(G, \chi_G, \bar v) \in X$ such that for all vertex sequences $\bar v \in V(G)^\ell$, all colored graphs $(H, \chi_H)$ and all $\bar w \in V(H)^\ell$

(i) if $\mathrm{inv}(G, \chi_G, (v_1, \dots, v_\ell)) < \mathrm{inv}(H, \chi_H, (w_1, \dots, w_\ell))$ then it also holds for all $v \in V(G)$, $w \in V(H)$ that $\mathrm{inv}(G, \chi_G, (v_1, \dots, v_\ell, v)) < \mathrm{inv}(H, \chi_H, (w_1, \dots, w_\ell, w))$, and

(ii) if $\mathrm{ref}(G, \chi_G, \bar v)$ is discrete and $\mathrm{inv}(G, \chi_G, \bar v) = \mathrm{inv}(H, \chi_H, \bar w)$ then $(G, \chi_G, \bar v) \cong (H, \chi_H, \bar w)$.

Note that, due to the set $X$ being totally ordered, isomorphism-invariant means the function inv is a graph invariant, i.e., $(G, \chi_G, \bar v) \cong (H, \chi_H, \bar w)$ implies $\mathrm{inv}(G, \chi_G, \bar v) = \mathrm{inv}(H, \chi_H, \bar w)$. Now let inv be a node invariant and define

\[ I := \bigl\{ \bar v \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G]) \bigm| \nexists\, \bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])\colon |\bar v| = |\bar w| \wedge \mathrm{inv}(G, \chi_G, \bar w) < \mathrm{inv}(G, \chi_G, \bar v) \bigr\}. \]

Finally, define the search tree $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G] = (T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])[I]$ as the subtree induced by the node set $I$. Observe that Property (i) implies that $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G]$ is indeed a tree. By using the invariant the algorithm thus may cut off the parts of the search tree that do not have nodes that are minimal among all nodes on their level. However, due to isomorphism invariance, the property that two graphs are isomorphic if and only if they have isomorphic leaves remains. The use of a node invariant also makes it easy to define a canonization as it defines a complete invariant for discretely colored graphs. Thus, for a graph canonization, the algorithm may just pick the leaf with the minimal node invariant.

When an individualization-refinement algorithm is used as an isomorphism test on two input graphs $(G, \chi_G)$ and $(H, \chi_H)$, the invariant can be used across both graphs and only the minimum among all nodes is maintained. Specifically, in this case one can define

\[ \begin{aligned} I := \bigl\{ \bar v \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G]) \bigm|\; & \nexists\, \bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])\colon |\bar v| = |\bar w| \wedge \mathrm{inv}(G, \chi_G, \bar w) < \mathrm{inv}(G, \chi_G, \bar v), \text{ and} \\ & \nexists\, \bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[H, \chi_H])\colon |\bar v| = |\bar w| \wedge \mathrm{inv}(H, \chi_H, \bar w) < \mathrm{inv}(G, \chi_G, \bar v) \bigr\}. \end{aligned} \]

This gives a pair of search trees $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G, H, \chi_H]$ for which each tree may be significantly smaller than when the algorithm is executed separately. Still, the graphs are non-isomorphic if the search trees are structurally different which can be detected in various different ways. A concrete method that is often used is to compare whether the smallest leaf of each tree corresponds to the same discretely colored graph.

2.3.3 Pruning with Automorphisms

The second essential ingredient required for the practicality of I/R algorithms is the exploitation of automorphisms. Let $\bar v, \bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])$ such that $\varphi\colon (G, \chi_G, \bar v) \cong (G, \chi_G, \bar w)$. Then the isomorphism extends to the subtrees of $T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G]$ rooted at $\bar v$ and $\bar w$ in the natural way. In particular, the set of all elements $\mathrm{inv}(\bar v')$ where $\bar v'$ ranges over all leaves in the subtree rooted at $\bar v$ equals the set of all $\mathrm{inv}(\bar w')$ where $\bar w'$ ranges over all leaves in the subtree rooted at $\bar w$ (see, e.g., [113] for details). Hence, for the purpose of isomorphism testing, automorphism group computation and computing a graph canonization, it suffices to traverse only one of the subtrees. This means automorphisms that are detected by an I/R-algorithm can be used to cut off further parts of the search tree.

An efficient strategy for the detection of automorphisms is an essential part of individualization-refinement algorithms. For example, a very simple strategy is to maintain a list of visited leaves and, whenever the next leaf is visited, compare this leaf with all the leaves in the list trying to find automorphisms. Besides this simple strategy, practical implementations of the I/R-paradigm often utilize additional heuristics for automorphism detection.

Since this thesis is only concerned with providing lower bounds on the complexity of I/R algorithms we do not require any specific implementation details on automorphism detection. Instead, we rely on the following generic lower bound on the running time of such algorithms.

Proposition 2.3.1. The running time of an individualization-refinement algorithm with cell selector sel, refinement operator ref and node invariant inv on a graph $G$ is bounded from below by $|T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G]| / |\mathrm{Aut}(G)|$.

The argument for the correctness of this proposition is simply that the individualization-refinement algorithm touches during its execution on $G$ for each node $\bar v \in T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G]$ at least one node equivalent to $\bar v$ under the automorphism group $\mathrm{Aut}(G)$ (see [112]).

Chapter 3

Upper Bounds on the Weisfeiler-Leman Dimension

The Weisfeiler-Leman algorithm is one of the most fundamental subroutines in the context of the Graph Isomorphism Problem and, over the past decades, its power has been intensively studied in various contexts. In the area of graph isomorphism testing, one of the most prominent examples is Babai’s quasipolynomial time algorithm [11] which employs the algorithm for dimension k = O(log n). Moreover, more recently, the Weisfeiler-Leman algorithm has also found applications in areas such as machine learning (see, e.g., [139, 120]).

By a famous result of Cai, Fürer and Immerman [31] it is well-known that, for any fixed dimension, the Weisfeiler-Leman algorithm itself fails to decide the Graph Isomorphism Problem. However, for several restricted classes of graphs it has been shown that the k-dimensional Weisfeiler-Leman algorithm serves as a complete isomorphism test. In the following we prove such a statement for graphs of bounded tree-width and, more generally, graphs of bounded rank-width.

For graphs of tree-width at most k such a statement was first proved by Grohe and Mariño [66] providing an upper bound of k + 2 on the Weisfeiler-Leman dimension of graphs of tree-width at most k. In this chapter, we improve on this result giving an upper bound of k on the Weisfeiler-Leman dimension of graphs of tree-width at most k. On the other hand, for graphs of bounded rank-width, it was previously unknown whether the Weisfeiler-Leman algorithm serves as a complete isomorphism test. Indeed, in [74], Grohe and Schweitzer explicitly ask whether the Weisfeiler-Leman dimension of graphs of rank-width k, k ∈ N, is bounded by some function in k.

3.1 Tree-Width

For analyzing the Weisfeiler-Leman dimension of the class of graphs of tree-width at most k recall the basic definitions around tree decompositions presented in Subsection 2.1.2. For the most basic case it is well-known that the class of forests (i.e., graphs of tree-width at most 1) has Weisfeiler-Leman dimension 1.

Theorem 3.1.1 (see [82]). The Color Refinement algorithm identifies all forests.
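As a small illustration of Theorem 3.1.1, the following Python sketch implements a naive version of Color Refinement and uses it to compare two graphs. The adjacency-dictionary representation and the names `color_refinement` and `distinguishes` are assumptions made for this example and do not come from the thesis; no attempt is made at an efficient implementation.

```python
from collections import Counter

def color_refinement(adj):
    """Stable coloring of a graph given as {vertex: set_of_neighbors}."""
    colors = {v: 0 for v in adj}  # start with a uniform coloring
    while True:
        # New color = old color plus the sorted multiset of neighbor colors.
        signatures = {
            v: (colors[v], tuple(sorted(colors[u] for u in adj[v])))
            for v in adj
        }
        # Rename signatures to small integers in a canonical order.
        palette = {s: i for i, s in enumerate(sorted(set(signatures.values())))}
        new_colors = {v: palette[signatures[v]] for v in adj}
        # The new partition refines the old one, so equal class counts mean stability.
        if len(set(new_colors.values())) == len(set(colors.values())):
            return new_colors
        colors = new_colors

def distinguishes(adj_g, adj_h):
    """True if Color Refinement, run on the disjoint union, separates G from H."""
    union = {("g", v): {("g", u) for u in nb} for v, nb in adj_g.items()}
    union.update({("h", v): {("h", u) for u in nb} for v, nb in adj_h.items()})
    colors = color_refinement(union)
    hist_g = Counter(c for (side, _), c in colors.items() if side == "g")
    hist_h = Counter(c for (side, _), c in colors.items() if side == "h")
    return hist_g != hist_h

# Two non-isomorphic trees on four vertices: a path and a star.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2}}
star = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
# Outside the class of forests: a 6-cycle versus two disjoint triangles.
c6 = {i: {(i - 1) % 6, (i + 1) % 6} for i in range(6)}
two_triangles = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}, 3: {4, 5}, 4: {3, 5}, 5: {3, 4}}
```

In this sketch the path and the star are separated, in line with the theorem, whereas the 6-cycle and the two triangles (which are not forests) receive identical color histograms.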

For the class of graphs of tree-width at most k, k ≥ 2, the first bound on the Weisfeiler-Leman dimension was obtained by Grohe and Mariño [66], giving an upper bound of k + 2. The proof is essentially based on providing a winning strategy for Spoiler in the (k + 3)-bijective pebble game $\mathrm{BP}_{k+3}(G, H)$ where G and H are non-isomorphic graphs such that tw(G) ≤ k. For his strategy, Spoiler basically plays along a tree decomposition of the graph G in a top-down fashion, always moving into a part of the tree decomposition that is non-isomorphic to the corresponding part in the second graph. An important ingredient of this strategy is Spoiler's ability to recognize separators of the graph, meaning that Spoiler has a winning strategy if the pebbled vertices in the first graph form a separator whereas the same does not hold in the second graph. Actually, we shall require the following slightly stronger property.

Let G be a graph. For a (k + 1)-tuple $(v_1, \dots, v_k, v_{k+1}) \in V(G)^{k+1}$ we define

$$s_G(v_1, \dots, v_k, v_{k+1}) := |C|$$

where C is the unique component of $G - \{v_1, \dots, v_k\}$ such that $v_{k+1} \in C$ (if $v_{k+1} \in \{v_1, \dots, v_k\}$ then $s_G(v_1, \dots, v_k, v_{k+1}) := 0$).
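The quantity $s_G$ can be computed by a single graph search. The following Python sketch is only meant to make the definition concrete; the function name `separator_component_size` and the adjacency-dictionary input format are assumptions made for this illustration.

```python
from collections import deque

def separator_component_size(adj, separator, target):
    """Compute s_G(v_1, ..., v_k, v_{k+1}) with separator = (v_1, ..., v_k), target = v_{k+1}."""
    removed = set(separator)
    if target in removed:
        return 0  # convention from the definition
    seen = {target}
    queue = deque([target])
    while queue:  # breadth-first search in G - {v_1, ..., v_k}
        v = queue.popleft()
        for u in adj[v]:
            if u not in removed and u not in seen:
                seen.add(u)
                queue.append(u)
    return len(seen)

# Path 0 - 1 - 2 - 3 - 4: removing vertex 2 leaves two components of size 2.
path5 = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
```

For the path above, removing vertex 2 gives value 2 for target 0 and for target 4, and value 0 for target 2.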

Definition 3.1.2. Let $k, \ell \in \mathbb{N}$ such that $\ell \geq k$. The $\ell$-dimensional Weisfeiler-Leman algorithm is aware of k-separators if, for all graphs G, H, Spoiler wins the game $\mathrm{BP}_{\ell+1}(G, H)$ from initial position $((v_1, \dots, v_{k+1}), (w_1, \dots, w_{k+1}))$ for all vertices $v_1, \dots, v_{k+1} \in V(G)$ and $w_1, \dots, w_{k+1} \in V(H)$ such that $s_G(v_1, \dots, v_k, v_{k+1}) \neq s_H(w_1, \dots, w_k, w_{k+1})$.

The terminology chosen for this definition relates to Corollary 2.2.7 stating that the $\ell$-dimensional Weisfeiler-Leman algorithm is equivalent in its power to the $(\ell + 1)$-bijective pebble game.

Now, our approach is first to prove that, assuming the $\ell$-dimensional Weisfeiler-Leman algorithm is aware of k-separators, it identifies every graph of tree-width at most k (assuming $\ell \geq k$). This is achieved by providing a winning strategy for Spoiler in the game $\mathrm{BP}_{\ell+1}(G, H)$ where G and H are non-isomorphic graphs such that tw(G) ≤ k. Afterwards, we argue that the $\ell$-dimensional Weisfeiler-Leman algorithm is aware of k-separators, gradually improving on the value of $\ell$ by considering more and more complex strategies for Spoiler.

For the first step we require the following characterization of tree-width. Let G be a graph. For a k-element separator S ⊆ V(G) and C a connected component of G − S, we define the graph G(S, C) to be the graph induced by S ∪ C together with the complete set of edges in S.

Lemma 3.1.3 (Arnborg et al. [2]). Suppose G(S, C) has at least k + 2 vertices. Then G(S, C) has tree-width at most k if and only if there exists v ∈ C such that for every connected component A of G[C \{v}] there is a k-element separator SA ⊆ S ∪ {v} such that

1. no vertex in A is adjacent to the unique element from S \ SA, and

2. G(SA,A) has tree-width at most k.

Suppose G(S, C) has tree-width at most k. In this case let DG(S, C) denote the set of possible vertices v ∈ C that satisfy the lemma above.

Theorem 3.1.4. Suppose k ≥ 2 and let $\ell \geq k$ such that the $\ell$-dimensional Weisfeiler-Leman algorithm is aware of k-separators. Let G be a graph of tree-width at most k. Then the $\ell$-dimensional Weisfeiler-Leman algorithm identifies G.

Proof. Let G be a connected graph of tree-width k and suppose H is a second connected graph such that $G \not\cong H$. Let (T, β) be a tree decomposition of the graph G of width k. For a separator S ⊆ V(G) and an integer m ∈ N we define

$$\mathcal{C}_G(S, m) := \{C \subseteq V(G) \mid C \text{ is a connected component of } G - S \text{ of size } m\}.$$

Moreover

$$G(S, m) := G\Big[S \cup \bigcup_{C \in \mathcal{C}_G(S, m)} C\Big].$$

An ordered separator is a tuple $\bar a = (a_1, \dots, a_k)$ such that the underlying set $\{a_1, \dots, a_k\}$ is a separator. In this proof, slightly abusing notation, we do not distinguish between ordered separators and the underlying unordered separator. For two ordered separators $\bar a \in V(G)^k$ and $\bar b \in V(H)^k$ we define $m(\bar a, \bar b)$ to be the minimal number $m \geq 1$ such that $(G(\bar a, m), \bar a) \not\cong (H(\bar b, m), \bar b)$.

We now argue that Spoiler wins the game $\mathrm{BP}_{\ell+1}(G, H)$. Suppose the game is in a position $(\bar a, \bar b) \in V(G)^k \times V(H)^k$ where $\bar a \supseteq \gamma(s) \cap \gamma(t)$ for an edge $st \in E(T)$. We shall prove by induction on $m := m(\bar a, \bar b)$ that Spoiler wins the game from position $(\bar a, \bar b)$.

In each case Spoiler wishes to play another pebble. Let $f\colon V(G) \to V(H)$ be the bijection chosen by Duplicator. Since the $\ell$-dimensional Weisfeiler-Leman algorithm is aware of k-separators it can be assumed that f maps the vertex set of $G(\bar a, m)$ to the vertex set of $H(\bar b, m)$. Now let $C \in \mathcal{C}_G(\bar a, m)$ such that

$$|\{C' \in \mathcal{C}_G(\bar a, m) \mid (G(\bar a, C'), \bar a) \cong (G(\bar a, C), \bar a)\}| > |\{C' \in \mathcal{C}_H(\bar b, m) \mid (H(\bar b, C'), \bar b) \cong (G(\bar a, C), \bar a)\}|.$$

Also let

$$D := \{v \in D_G(\bar a, C') \mid C' \in \mathcal{C}_G(\bar a, m) \wedge (G(\bar a, C'), \bar a) \cong (G(\bar a, C), \bar a)\}.$$

Then there exists some $v \in D$ such that the following holds. Let $C_G \in \mathcal{C}_G(\bar a, m)$ such that $v \in C_G$ and $C_H \in \mathcal{C}_H(\bar b, m)$ such that $f(v) \in C_H$. Then

$$(G[C_G \cup \bar a], \bar a, v) \not\cong (H[C_H \cup \bar b], \bar b, f(v)). \tag{3.1}$$

Now Spoiler places pebbles on $(v, w)$ where $w = f(v)$. For the base case of the induction suppose $m = 1$. This means $C_G = \{v\}$ and $C_H = \{w\}$ and thus, Spoiler wins immediately.

So assume $m > 1$. Let $A_1, \dots, A_\ell \subseteq C_G$ be the connected components of $G[C_G \setminus \{v\}]$. Note that $|A_i| \leq m - 1$ for every $i \in [\ell]$. Also let $B_1, \dots, B_{\ell'} \subseteq C_H$ be the connected components of $H[C_H \setminus \{w\}]$. Because of Equation (3.1) there is some $A \in \{A_1, \dots, A_\ell\}$ such that

$$|\{i \in [\ell] \mid (G[A_i \cup \bar a \cup \{v\}], \bar a, v) \cong (G[A \cup \bar a \cup \{v\}], \bar a, v)\}| > |\{i \in [\ell'] \mid (H[B_i \cup \bar b \cup \{w\}], \bar b, w) \cong (G[A \cup \bar a \cup \{v\}], \bar a, v)\}|.$$

We pick such a set $A \in \{A_1, \dots, A_\ell\}$ with minimal cardinality (i.e., there is no set $A' \in \{A_1, \dots, A_\ell\}$ satisfying the above property which is strictly smaller than $A$). Now suppose $\bar a = (a_1, \dots, a_k)$ and $\bar b = (b_1, \dots, b_k)$. Pick $i \in [k]$ such that no vertex in $A$ is adjacent to $a_i$ (cf. Lemma 3.1.3). Now Spoiler removes the pair of pebbles $(a_i, b_i)$. Let $\bar a' = (a_1, \dots, a_{i-1}, a_{i+1}, \dots, a_k, v)$ and $\bar b' = (b_1, \dots, b_{i-1}, b_{i+1}, \dots, b_k, w)$. Observe that $(\bar a', \bar b')$ is the current position of the game. Now let $m' := |A| < m$. Note that $A \in \mathcal{C}_G(\bar a', m')$.

Claim 1. $(G(\bar a', m'), \bar a') \not\cong (H(\bar b', m'), \bar b')$.

Proof. To prove the claim it suffices to argue that $|\mathcal{A}| > |\mathcal{B}|$ where

$$\mathcal{A} := \{A' \in \mathcal{C}_G(\bar a', m') \mid (G(\bar a', A'), \bar a') \cong (G(\bar a', A), \bar a')\}$$

and

$$\mathcal{B} := \{B' \in \mathcal{C}_H(\bar b', m') \mid (H(\bar b', B'), \bar b') \cong (G(\bar a', A), \bar a')\}.$$

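Let $\mathcal{A}' = \{A' \in \mathcal{A} \mid A' \subseteq C_G\}$ and $\mathcal{A}'' = \mathcal{A} \setminus \mathcal{A}'$. Similarly define $\mathcal{B}' = \{B' \in \mathcal{B} \mid B' \subseteq C_H\}$ and $\mathcal{B}'' = \mathcal{B} \setminus \mathcal{B}'$. By the definition of the set $A$ it follows that $|\mathcal{A}'| > |\mathcal{B}'|$. Now define   0 [ 00 [ G := Ga¯ ∪ {v} ∪ CG(¯a, m ) ∪ Ai 00 0 0 m ≤m i∈[`]: |Ai|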

Proof. Let G, H be two graphs, $\bar v = (v_1, \dots, v_{k+1}) \in V(G)^{k+1}$ and $\bar w = (w_1, \dots, w_{k+1}) \in V(H)^{k+1}$ such that $s_G(\bar v) \neq s_H(\bar w)$. Without loss of generality assume $s_G(\bar v) > s_H(\bar w)$. Let $C_G$ be the connected component of $G - \{v_1, \dots, v_k\}$ such that $v_{k+1} \in C_G$ and similarly, let $C_H$ be the connected component of $H - \{w_1, \dots, w_k\}$ such that $w_{k+1} \in C_H$. To prove the lemma it needs to be argued that Spoiler wins the game $\mathrm{BP}_{k+3}(G, H)$ from position $(\bar v, \bar w)$.

First, Spoiler plays an additional pair of pebbles. Let $f\colon V(G) \to V(H)$ be the bijection chosen by Duplicator. Since $|C_G| > |C_H|$ there is some $v \in C_G$ such that $w := f(v) \notin C_H$. Spoiler places pebbles on $(v, w)$. Then there is a walk from $v_{k+1}$ to $v$ in $G - \{v_1, \dots, v_k\}$, but there is no walk from $w_{k+1}$ to $w$ in $H - \{w_1, \dots, w_k\}$. So Spoiler wins from the current position by Theorem 2.2.10 and Example 2.2.8.

Corollary 3.1.6 (Grohe and Mariño [66]). The (k + 2)-dimensional Weisfeiler-Leman algorithm identifies every graph of tree-width at most k.

Proof. This follows from Lemma 3.1.5 and Theorem 3.1.4.

The strategy presented in the last lemma is extremely simple and indeed, from the logical point of view, it does not even require the use of counting quantifiers (cf. Example 2.2.8 and Theorem 2.2.10). Using the ability of the 2-dimensional Weisfeiler-Leman algorithm to count the number of walks of a certain length between two given vertices, the bound on $\ell$ can be slightly improved. A very similar argument is also used in [90] in order to show that the 2-dimensional Weisfeiler-Leman algorithm can detect cut vertices.¹

¹ A vertex v ∈ V(G) is a cut vertex if the singleton set {v} is a separator of G.

Lemma 3.1.7. For k ≥ 1, the (k + 1)-dimensional Weisfeiler-Leman algorithm is aware of k-separators.

Proof. The basic approach is very similar to the proof of Lemma 3.1.5. Let G, H be two graphs, $\bar v = (v_1, \dots, v_{k+1}) \in V(G)^{k+1}$ and $\bar w = (w_1, \dots, w_{k+1}) \in V(H)^{k+1}$ such that $s_G(\bar v) \neq s_H(\bar w)$. Without loss of generality assume $s_G(\bar v) > s_H(\bar w)$. Let $C_G$ be the connected component of $G - \{v_1, \dots, v_k\}$ such that $v_{k+1} \in C_G$ and similarly, let $C_H$ be the connected component of $H - \{w_1, \dots, w_k\}$ such that $w_{k+1} \in C_H$. It needs to be proved that Spoiler wins the game $\mathrm{BP}_{k+2}(G, H)$ from position $(\bar v, \bar w)$.

First, Spoiler plays an additional pair of pebbles. Let $f\colon V(G) \to V(H)$ be the bijection chosen by Duplicator. Since $|C_G| > |C_H|$ there is some $v_{k+2} \in C_G$ such that $w_{k+2} := f(v_{k+2}) \notin C_H$. Spoiler places pebbles on $(v_{k+2}, w_{k+2})$.

Now let $G' = G - \{v_1, \dots, v_{k-1}\}$ and $H' = H - \{w_1, \dots, w_{k-1}\}$. For $v, w \in V(G')$ define $W_\ell^{G'}(v, w)$ to be the number of walks of length $\ell$ from $v$ to $w$ in $G'$ (see Example 2.2.8 for the definition of a walk). By Theorem 2.2.10 and Example 2.2.9 it suffices to prove that there are $i, j \in \{k, k+1, k+2\}$ and $\ell \leq |V(G)|$ such that $W_\ell^{G'}(v_i, v_j) \neq W_\ell^{H'}(w_i, w_j)$. Towards this end, suppose that $W_\ell^{G'}(v_k, v_i) = W_\ell^{H'}(w_k, w_i)$ for all $i \in \{k+1, k+2\}$ and $\ell \leq |V(G)|$. Since every walk from $w_{k+1}$ to $w_{k+2}$ in the graph $H'$ has to pass $w_k$ it holds that

$$W_\ell^{H'}(w_{k+1}, w_{k+2}) = \sum_{i=1}^{\ell-1} W_i^{H'}(w_{k+1}, w_k) \cdot W_{\ell-i}^{H'}(w_k, w_{k+2}).$$

Note that each walk from $w_{k+1}$ to $w_{k+2}$ uniquely decomposes into two parts, splitting the walk at vertex $w_k$, since the end-vertex of a walk may not appear multiple times. However, in $G'$ there is at least one walk from $v_{k+1}$ to $v_{k+2}$ that does not pass through $v_k$. Let $d$ be its length. Then

$$W_d^{G'}(v_{k+1}, v_{k+2}) \geq 1 + \sum_{i=1}^{d-1} W_i^{G'}(v_{k+1}, v_k) \cdot W_{d-i}^{G'}(v_k, v_{k+2}) > W_d^{H'}(w_{k+1}, w_{k+2}).$$
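The decomposition of walks at a cut vertex used in this proof can be checked on a toy example. The sketch below assumes the convention suggested by the remark above, namely that the end-vertex of a walk occurs only at its last position; the precise definition is given in Example 2.2.8, which is not reproduced here. The names `count_walks` and `walks_through`, as well as the two example graphs, are hypothetical and chosen only for illustration.

```python
def count_walks(adj, start, end, length):
    """Walks with `length` edges from start to end that visit `end` only at the last step."""
    if start == end:
        return 1 if length == 0 else 0
    if length == 0:
        return 0
    # reach[x] = number of walks of the current length from start to x avoiding `end`
    reach = {start: 1}
    for _ in range(length - 1):
        nxt = {}
        for x, cnt in reach.items():
            for y in adj[x]:
                if y != end:
                    nxt[y] = nxt.get(y, 0) + cnt
        reach = nxt
    return sum(cnt for x, cnt in reach.items() if end in adj[x])

def walks_through(adj, a, c, b, length):
    """Right-hand side of the identity: split each walk at its first visit to c."""
    return sum(
        count_walks(adj, a, c, i) * count_walks(adj, c, b, length - i)
        for i in range(1, length)
    )

# In H', the vertex "c" separates "a" from "b".
h_prime = {
    "a": {"x", "c"}, "x": {"a", "c"},
    "c": {"a", "x", "y", "b"},
    "y": {"c", "b"}, "b": {"c", "y"},
}
# G' additionally contains the edge x-y, so the walk a-x-y-b avoids "c".
g_prime = {v: set(nb) for v, nb in h_prime.items()}
g_prime["x"].add("y")
g_prime["y"].add("x")
```

In `h_prime` both sides of the identity agree for every length, while in `g_prime` the walk a-x-y-b of length 3 is not accounted for by the sum, so the walk count exceeds it by one.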

Corollary 3.1.8. The (k + 1)-dimensional Weisfeiler-Leman algorithm identifies every graph of tree-width at most k.

The strategy given in the last lemma is still fairly simple. Indeed, it is possible to further improve on the bound on $\ell$, although at the price of a much more complicated strategy.

Proposition 3.1.9 (Kiefer, N. [89]). For k ≥ 2, the k-dimensional Weisfeiler-Leman algorithm is aware of k-separators.

Since the proof of this proposition is very complicated and lengthy, I omit it in this thesis and refer to the original work [89].

Corollary 3.1.10. The k-dimensional Weisfeiler-Leman algorithm identifies every graph of tree-width at most k.

From an algorithmic point of view this result implies that the Graph Isomorphism Problem for graphs of tree-width at most k can be solved in time $O(n^{k+1} \log n)$ by Theorem 2.2.1. This slightly improves on the running time of the first polynomial time isomorphism test for graphs of bounded tree-width due to Bodlaender [27]. Moreover, the Graph Canonization Problem for graphs of tree-width k can be solved in time $O(n^{k+3} \log n)$ by Theorem 2.2.5. However, from the algorithmic perspective, both algorithms are far from optimal and both problems are fixed-parameter tractable where the parameter is the tree-width of the input graphs [104]. Actually, in Chapter 7, we present an algorithm solving the Graph Isomorphism Problem in time $2^{k\,\mathrm{polylog}(k)}\,\mathrm{poly}(n)$ where k denotes the tree-width of the input graphs.

3.2 Rank-Width

Over the past decades, it has been proved for many graph classes that their Weisfeiler-Leman dimension is finite. Besides graph classes of bounded tree-width, this also includes for example planar graphs [61], graph classes of bounded genus [62] and, more generally, graph classes that exclude a fixed graph as a minor [63] (more details are given in Section 3.3). However, most graph classes for which such results are known contain only sparse graphs. In contrast, an important collection of graph classes also containing dense graphs are classes of bounded rank-width (in particular, every complete graph has rank-width 1). The graph parameter rank-width was first defined by Oum and Seymour [130] in connection with graphs of bounded clique-width, another graph parameter which is closely related to rank-width. The first polynomial-time algorithm for the Graph Isomorphism Problem for graph classes of bounded rank-width was given by Grohe and Schweitzer [73]. However, their algorithm is rather complicated using both group-theoretic techniques and advanced results from structural graph theory [74]. In this thesis, we show that the Weisfeiler-Leman dimension of graphs of rank-width k is at most 3k + 4 which results in a simple polynomial-time isomorphism test which, maybe surprisingly, is also significantly faster than the algorithm from [73]. The results given in this section can also be found in [68].

3.2.1 Definition and Properties

We start by defining the graph parameters rank-width and also clique-width and stating some basic properties about them including their relation to tree-width.

Rank-width is another graph invariant first introduced by Oum and Seymour [130] which, similar to tree-width, measures the width of a certain style of hierarchical decomposition of a graph. Intuitively, the aim is to repeatedly split the vertex set of the graph along cuts of low complexity. For rank-width, the complexity of a cut is measured in terms of the rank of the matrix capturing the adjacencies between the two sides of the cut over the 2-element field $\mathbb{F}_2$.

Let G be a graph. Let $A_G \in \mathbb{F}_2^{n \times n}$ denote the adjacency matrix of G. For X, Y ⊆ V(G) also define the submatrix $A_G[X, Y] \in \mathbb{F}_2^{X \times Y}$ where $A_G[X, Y]_{x,y} := 1$ if and only if xy ∈ E(G). For X ⊆ V(G) the complexity of cutting the graph along X can now be measured by $\rho_G(X) := \mathrm{rk}_2(A_G[X, \overline{X}])$ where $\overline{X} := V(G) \setminus X$ and $\mathrm{rk}_2(A)$ denotes the $\mathbb{F}_2$-rank of a matrix A.

Definition 3.2.1 (Rank Decomposition and Rank-Width). A rank decomposition of G is a tuple (T, γ) consisting of a binary directed tree T and a mapping $\gamma\colon V(T) \to 2^{V(G)}$ such that

(R.1) γ(r) = V (G) where r is the root of T ,

(R.2) γ(t) = γ(s1) ∪ γ(s2) and γ(s1) ∩ γ(s2) = ∅ for all internal nodes t ∈ V (T ) with children s1 and s2, and

(R.3) |γ(t)| = 1 for all t ∈ L(T ), where L(T ) denotes the set of leaves of the tree T .

Instead of giving γ, one can equivalently also specify a bijection f : L(T ) → V (G). Observe that this completely specifies γ by Condition (R.2). The width of a rank decomposition (T, γ) is

width(T, γ) := max{ρG(γ(t)) | t ∈ V (T )}.

The rank-width of a graph G is

rw(G) := min{width(T, γ) | (T, γ) is a rank decomposition of G}.
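To make the cut function $\rho_G$ concrete, the following Python sketch computes the $\mathbb{F}_2$-rank of the adjacency matrix between a vertex set and its complement by Gaussian elimination on bitmasks. The names `gf2_rank` and `cut_rank` and the adjacency-dictionary format are assumptions of this illustration, and the sketch evaluates a single cut only; it does not compute rank-width.

```python
def gf2_rank(rows):
    """Rank over F_2 of a matrix whose rows are encoded as integer bitmasks."""
    basis = []  # reduced rows with pairwise distinct leading bits
    for row in rows:
        for b in basis:
            row = min(row, row ^ b)  # cancel the leading bit of b if it is set in row
        if row:
            basis.append(row)
    return len(basis)

def cut_rank(adj, part):
    """rho_G(X): F_2-rank of the adjacency matrix between X = part and its complement."""
    part = set(part)
    outside = [v for v in adj if v not in part]
    index = {v: i for i, v in enumerate(outside)}
    rows = [sum(1 << index[u] for u in adj[x] if u in index) for x in part]
    return gf2_rank(rows)

# Complete graph K_4 and cycle C_4 (0 - 1 - 2 - 3 - 0).
k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

In the complete graph every cut with two non-empty sides has rank 1, whereas the cut {0, 1} of the 4-cycle has rank 2 and the cut {0, 2} has rank 1.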

Another graph invariant that is closely related to rank-width is clique-width [41]. It is also a measure of a graph’s structural complexity, but unlike rank-width, it considers the complexity of an algebraic expression defining the graph. For a natural number k ∈ N a k-graph is a pair (G, lab) where G is a graph and lab: V (G) → [k] is a labeling of the vertices. In order to define the clique-width of a graph consider the following four operations for k-graphs:

1. for i ∈ [k] let $\cdot_i$ denote an isolated vertex with label i,

2. for i, j ∈ [k] with i ≠ j define the k-graph $\eta_{i,j}(G, \mathrm{lab}) := (G', \mathrm{lab})$ where $V(G') := V(G)$ and $E(G') := E(G) \cup \{vw \mid \mathrm{lab}(v) = i \wedge \mathrm{lab}(w) = j\}$,

3. for i, j ∈ [k] define $\rho_{i \to j}(G, \mathrm{lab}) := (G, \mathrm{lab}')$ where $\mathrm{lab}'(v) := j$ if $\mathrm{lab}(v) = i$ and $\mathrm{lab}'(v) := \mathrm{lab}(v)$ otherwise, and

4. for two k-graphs $(G, \mathrm{lab})$ and $(G', \mathrm{lab}')$ define $(G, \mathrm{lab}) \oplus (G', \mathrm{lab}')$ to be the disjoint union of the two k-graphs.

A k-expression t is a well-formed expression in these symbols and defines a k-graph (G, lab). In this case t is a k-expression for G. The clique-width of a graph G, denoted by cw(G), is the minimum k ∈ N such that there is a k-expression for G.

Although rank-width and clique-width seem to be quite different measures of a graph's structural complexity, they are actually closely related to each other. Indeed, both parameters are bounded in terms of the other one.

Theorem 3.2.2 (Oum, Seymour [130]). For every graph G it holds that

$$\mathrm{rw}(G) \leq \mathrm{cw}(G) \leq 2^{\mathrm{rw}(G)+1} - 1.$$

Also, there is the following connection to tree-width showing that (up to an additive constant of one) rank-width is a more general graph measure than tree-width. Theorem 3.2.3 (Oum [129]). For every graph G it holds that

rw(G) ≤ tw(G) + 1.

Note that the tree-width of a graph cannot be bounded in terms of its rank-width. For example, the complete graph $K_n$ on n vertices has rank-width $\mathrm{rw}(K_n) = 1$ and tree-width $\mathrm{tw}(K_n) = n - 1$. Also, it has clique-width $\mathrm{cw}(K_n) = 2$ (observe that the operation $\eta_{i,j}$ can only be performed for distinct i, j ∈ [k]).

Similar to tree-width, the measure rank-width is important algorithmically since many NP-hard problems can be solved in polynomial time for graph classes of bounded rank-width (or equivalently, graph classes of bounded clique-width). For example, this includes all problems definable in monadic second-order logic. Actually, all these problems can even be solved in linear time [40]. Further algorithmic results in this direction can also be found in [49]. For the Graph Isomorphism Problem there is also a polynomial time algorithm for graph classes of bounded rank-width.
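The statement that complete graphs have clique-width at most 2 can be illustrated by evaluating a suitable 2-expression. The following Python sketch models the four operations on k-graphs; the representation of a k-graph as a pair (label dictionary, edge set) and all function names are assumptions made for this example.

```python
from itertools import count

_fresh = count()  # supplies distinct vertex names

def vertex(label):
    """The k-graph consisting of one isolated vertex with the given label."""
    return ({next(_fresh): label}, set())

def eta(i, j, graph):
    """Add all edges between vertices labeled i and vertices labeled j (i != j)."""
    assert i != j
    lab, edges = graph
    new = {frozenset((v, w)) for v in lab for w in lab
           if lab[v] == i and lab[w] == j}
    return (lab, edges | new)

def rho(i, j, graph):
    """Relabel every vertex with label i to label j."""
    lab, edges = graph
    return ({v: (j if l == i else l) for v, l in lab.items()}, edges)

def disjoint_union(g, h):
    """Disjoint union of two k-graphs (vertex names are distinct by construction)."""
    return ({**g[0], **h[0]}, g[1] | h[1])

def complete_graph(n):
    """Build K_n using only the labels 1 and 2."""
    g = vertex(1)
    for _ in range(n - 1):
        # add a new vertex labeled 2, join it to all old vertices, then relabel it to 1
        g = rho(2, 1, eta(1, 2, disjoint_union(g, vertex(2))))
    return g
```

Calling `complete_graph(5)` yields a graph with 5 vertices and 10 edges in which only the labels 1 and 2 were ever used.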

Proposition 3.2.4 (Grohe, Schweitzer [73]). For every fixed k ∈ N the Graph Isomorphism Problem for graphs of rank-width at most k can be solved in polynomial time.

However, while this is a polynomial time algorithm for every fixed k, the exponent of the polynomial depends on k in a non-elementary fashion. This is of course quite unsatisfactory. On top of this, the algorithm by Grohe and Schweitzer is very complicated, using both group-theoretic techniques (which are considered in Chapter 5 of this thesis) and advanced results on graph decompositions and tangles [74]. In this thesis, we improve on this result by showing that the $\ell$-dimensional Weisfeiler-Leman algorithm identifies every graph of rank-width at most k for $\ell = O(k)$. This immediately results in a simple isomorphism test for graphs of rank-width k running in time $n^{O(k)}$. Moreover, this also results in a polynomial-time canonization algorithm for graphs of bounded rank-width by Theorem 2.2.5.

3.2.2 Split Pairs and Flip Functions

Let G be a graph of rank-width k. On an abstract level, the approach to bound the Weisfeiler-Leman dimension of graphs of bounded rank-width is similar to the corresponding argument for graphs of bounded tree-width (see Corollary 3.1.10). For a set X ⊆ V(G) such that $\rho_G(X) \leq k$ Spoiler's goal is to pebble a small set of vertices that splits off the set X. Then, playing along a rank decomposition, Spoiler continues to reduce the size of the relevant set X until eventually it is sufficiently small for Spoiler to win the game.

For tree-width, there is a natural way to split the graph into multiple independent parts by pebbling separators of the graph (i.e., the adhesion sets of a tree decomposition). However, for graphs of bounded rank-width, it is far less clear how the graph can be split in a meaningful way. In particular, since there may be many edges going from $X$ to $\overline{X}$, one cannot simply remove a few vertices in order to separate $X$ from $\overline{X}$. In order to solve this problem, we introduce the notion of split pairs and flip functions. Intuitively, the split pairs take the role of a separator compared to graphs of bounded tree-width whereas the flip functions are used to make independent parts visible when pebbling a split pair.

Let G be a graph and X ⊆ V(G). Two vertices v, w ∈ V(G) are X-equivalent, denoted $v \sim_X w$, if they have the same neighbors in X, i.e., N(v) ∩ X = N(w) ∩ X. Also, for v ∈ V(G), define the vector $\mathrm{vec}_X(v) := (a_{v,w})_{w \in X} \in \mathbb{F}_2^X$ where $a_{v,w} = 1$ if and only if vw ∈ E(G). Observe that $v \sim_X w$ if and only if $\mathrm{vec}_X(v) = \mathrm{vec}_X(w)$. Moreover, for S ⊆ V(G) let $\mathrm{vec}_X(S) := \{\mathrm{vec}_X(v) \mid v \in S\}$. In order to analyze the split of the vertex set into $X$ and $\overline{X}$ we are typically interested in the vectors from the sets $\mathrm{vec}_{\overline{X}}(X)$ and $\mathrm{vec}_X(\overline{X})$. Observe that these sets of vectors correspond to the rows and columns of the matrix $A_G[X, \overline{X}]$.

Observation 3.2.5. Let X ⊆ Y ⊆ V (G) and suppose T ⊆ S ⊆ V (G) such that vecX (S) is linearly independent. Then vecY (T ) is linearly independent.

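For a set of vectors $S \subseteq \mathbb{F}_2^n$ we denote by $\langle S \rangle$ the linear space spanned by S. A set $B \subseteq \mathbb{F}_2^n$ is a linear basis for $\langle S \rangle$ if B is linearly independent and $\langle B \rangle = \langle S \rangle$.

Definition 3.2.6 (Split Pairs). Let G be a graph and X ⊆ V(G). A pair (A, B) is a split pair for X if

1. $A \subseteq X$ and $B \subseteq \overline{X}$,

2. $\mathrm{vec}_{\overline{X}}(A)$ forms a linear basis for $\langle \mathrm{vec}_{\overline{X}}(X) \rangle$, and

3. $\mathrm{vec}_X(B)$ forms a linear basis for $\langle \mathrm{vec}_X(\overline{X}) \rangle$.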
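One way to obtain a split pair is to greedily pick vertices whose neighborhood vectors are linearly independent over $\mathbb{F}_2$. The Python sketch below follows this idea; it is an illustration under the assumption that graphs are given as adjacency dictionaries, and the helper names `neighborhood_mask`, `greedy_basis` and `split_pair` are hypothetical.

```python
def neighborhood_mask(adj, v, columns):
    """Bitmask for vec_columns(v): bit i is set iff v is adjacent to columns[i]."""
    return sum(1 << i for i, u in enumerate(columns) if u in adj[v])

def greedy_basis(adj, candidates, columns):
    """Vertices among `candidates` whose vectors over `columns` form a basis of their span."""
    basis, chosen = [], []
    for v in candidates:
        row = neighborhood_mask(adj, v, columns)
        for b in basis:
            row = min(row, row ^ b)  # reduce against the vectors found so far
        if row:  # v contributes a new direction
            basis.append(row)
            chosen.append(v)
    return chosen

def split_pair(adj, part):
    """Return (A, B) with A inside `part` and B outside, as in the definition above."""
    inside = [v for v in adj if v in part]
    outside = [v for v in adj if v not in part]
    a = greedy_basis(adj, inside, outside)  # vectors indexed by the complement of X
    b = greedy_basis(adj, outside, inside)  # vectors indexed by X
    return a, b

k4 = {v: {u for u in range(4) if u != v} for v in range(4)}
c4 = {0: {1, 3}, 1: {0, 2}, 2: {1, 3}, 3: {0, 2}}
```

For the complete graph on four vertices and X = {0, 1} this returns one vertex on each side, matching a cut of rank 1; for the 4-cycle and the same X both sides contribute two vertices.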

Note that $|A| = \rho_G(X)$ and $|B| = \rho_G(X)$ since $\mathrm{vec}_{\overline{X}}(X)$ is exactly the set of rows of $A_G[X, \overline{X}]$ and $\mathrm{vec}_X(\overline{X})$ is the set of columns of this matrix. Also observe that if (A, B) is a split pair for $X$ then (B, A) is a split pair for $\overline{X}$. As a special case the pair (∅, ∅) is defined to be a split pair for X = V(G).

An ordered split pair for X is a pair $(\bar a, \bar b) = ((a_1, \dots, a_q), (b_1, \dots, b_p))$ such that the corresponding sets form a split pair, i.e., $(\{a_1, \dots, a_q\}, \{b_1, \dots, b_p\})$ is a split pair for X.

Lemma 3.2.7. Let G be a graph, X ⊆ V(G) and suppose (A, B) is a split pair for X. Then

$$v \sim_X w \iff v \sim_A w$$

for all $v, w \in \overline{X}$. Similarly

$$v \sim_{\overline{X}} w \iff v \sim_B w$$

for all $v, w \in X$.

Proof. Let $v, w \in \overline{X}$. The forward direction of the equivalence directly follows from the fact that A ⊆ X. So suppose that $v \sim_A w$ and assume $A = \{a_1, \dots, a_q\}$. Then, for all i ∈ [q], it holds that $va_i \in E(G)$ if and only if $wa_i \in E(G)$. Thus

$$(\mathrm{vec}_{\overline{X}}(a_i))_v = (\mathrm{vec}_{\overline{X}}(a_i))_w,$$

that is, the v-entry of the vector $\mathrm{vec}_{\overline{X}}(a_i)$ coincides with its w-entry. Since $\mathrm{vec}_{\overline{X}}(A)$ forms a linear basis for $\langle \mathrm{vec}_{\overline{X}}(X) \rangle$, it follows that

$$(\mathrm{vec}_{\overline{X}}(v'))_v = (\mathrm{vec}_{\overline{X}}(v'))_w$$

for all $v' \in X$. But this means N(v) ∩ X = N(w) ∩ X and therefore, $v \sim_X w$. The second statement is proved analogously.

We shall use the last lemma in the following way. In order to split a graph G along a given set X, Spoiler may play pebbles on all vertices of an ordered split pair $(\bar a, \bar b)$ for X. Now consider the coloring obtained from applying the Color Refinement algorithm to the graph G after individualizing all vertices from $\bar a$ and $\bar b$, i.e., the coloring $\chi^1_{\mathrm{WL}}[G, \bar a, \bar b]$.

Corollary 3.2.8. Let G be a graph, X ⊆ V(G) and suppose $(\bar a, \bar b)$ is an ordered split pair for X. Then $v \sim_X w$ for all $v, w \in \overline{X}$ with $\chi^1_{\mathrm{WL}}[G, \bar a, \bar b](v) = \chi^1_{\mathrm{WL}}[G, \bar a, \bar b](w)$. Similarly, $v \sim_{\overline{X}} w$ for all $v, w \in X$ with $\chi^1_{\mathrm{WL}}[G, \bar a, \bar b](v) = \chi^1_{\mathrm{WL}}[G, \bar a, \bar b](w)$.

The main observation is that the colored graph $(G, \chi^1_{\mathrm{WL}}[G, \bar a, \bar b])$ consists of multiple "independent parts" each of which is either completely contained in $X$ or completely contained in $\overline{X}$. In order to make these parts visible we consider the concept of a flip function.

Definition 3.2.9 (Flip Functions and Flipped Graphs). Let G = (V, E, χ) be a vertex-colored graph where $\chi\colon V \to C$ and C is some finite set of colors. A flip function for G is a mapping $f\colon C \times C \to \{0, 1\}$ such that $f(c, c') = f(c', c)$ for all $c, c' \in C$. Moreover, for a graph G = (V, E, χ) and a flip function f define the flipped graph $G^f := (V, E_f, \chi)$ where

$$E_f := \{vw \mid vw \in E \wedge f(\chi(v), \chi(w)) = 0\} \cup \{vw \mid v \neq w \wedge vw \notin E \wedge f(\chi(v), \chi(w)) = 1\}.$$
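The effect of a flip function is easy to experiment with. In the following Python sketch a flip function is encoded by the set of unordered color pairs it maps to 1, which makes the symmetry condition automatic; this encoding and the names `flipped_graph` and `components` are assumptions of the illustration.

```python
def flipped_graph(adj, coloring, flip):
    """Adjacency of G^f; `flip` holds the frozensets {c, c'} with f(c, c') = 1."""
    flipped = {v: set() for v in adj}
    for v in adj:
        for w in adj:
            if v == w:
                continue
            is_edge = w in adj[v]
            is_flipped = frozenset((coloring[v], coloring[w])) in flip
            # keep unflipped edges, and turn flipped non-edges into edges
            if is_edge != is_flipped:
                flipped[v].add(w)
    return flipped

def components(adj):
    """Vertex sets of the connected components of a graph."""
    seen, result = set(), []
    for s in adj:
        if s in seen:
            continue
        comp, stack = {s}, [s]
        while stack:
            v = stack.pop()
            for u in adj[v]:
                if u not in comp:
                    comp.add(u)
                    stack.append(u)
        seen |= comp
        result.append(comp)
    return result

# Complete bipartite graph K_{2,2} with sides colored "r" and "b".
k22 = {0: {2, 3}, 1: {2, 3}, 2: {0, 1}, 3: {0, 1}}
coloring = {0: "r", 1: "r", 2: "b", 3: "b"}
flip = {frozenset(("r", "b"))}
```

Flipping the pair of colors ("r", "b") removes every edge of the complete bipartite graph, so the flipped graph falls apart into four singleton components; applying the same flip a second time restores the original graph.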

For a colored graph G and a flip function f we denote by $\mathrm{Comp}(G, f) \subseteq 2^{V(G)}$ the set of vertex sets of the connected components of $G^f$. Observe that Comp(G, f) partitions the set V(G).

The next lemma forms the first of two main building blocks for describing a winning strategy for Spoiler in a recursive fashion along a given rank decomposition of a graph G.

Figure 3.1: Visualization of the sets $P$, $\overline{P}$, $Q$ and $\overline{Q}$ from the proof of Lemma 3.2.10.

Lemma 3.2.10. Let G be a graph, X ⊆ V(G) and suppose $(\bar a, \bar b)$ is an ordered split pair for X. Then there is a flip function f for the colored graph $G' := (G, \chi^1_{\mathrm{WL}}[G, \bar a, \bar b])$ such that for every $C \in \mathrm{Comp}(G', f)$ it holds that $C \subseteq X$ or $C \subseteq \overline{X}$.

Proof. Let $\chi^* := \chi^1_{\mathrm{WL}}[G, \bar a, \bar b]$. The flip function f is defined in such a way that $f(c, c') = 1$ if there are $v \in X$ and $w \in \overline{X}$ such that vw ∈ E(G) and $\{\chi^*(v), \chi^*(w)\} = \{c, c'\}$ (and $f(c, c') = 0$ otherwise). We need to argue that there are no $v \in X$ and $w \in \overline{X}$ such that vw is an edge in the flipped graph $G^f$. Suppose towards a contradiction this statement does not hold, that is, there are $v \in X$ and $w \in \overline{X}$ such that $vw \in E(G^f)$. Let $c := \chi^*(v)$ and $c' := \chi^*(w)$. Then $vw \notin E(G)$ (if vw ∈ E(G) then $f(c, c') = 1$ and thus, $vw \notin E(G^f)$) and therefore, $f(c, c') = 1$. This means there are $v' \in X$ and $w' \in \overline{X}$ such that $v'w' \in E(G)$ and $\{\chi^*(v'), \chi^*(w')\} = \{c, c'\}$. We distinguish two cases.

Case $\chi^*(v') = c$ and $\chi^*(w') = c'$: Then $v \sim_{\overline{X}} v'$ and $w \sim_X w'$ by Corollary 3.2.8 which implies

$$vw \in E(G) \iff vw' \in E(G) \iff v'w' \in E(G).$$

This is a contradiction.

Case $\chi^*(v') = c'$ and $\chi^*(w') = c$: Let $P = (\chi^*)^{-1}(c) \cap X$, $\overline{P} = (\chi^*)^{-1}(c) \cap \overline{X}$, $Q = (\chi^*)^{-1}(c') \cap X$ and $\overline{Q} = (\chi^*)^{-1}(c') \cap \overline{X}$. So $v \in P$, $v' \in Q$, $w \in \overline{Q}$ and $w' \in \overline{P}$ (see Figure 3.1).

Claim 1. Let $y \in P$ and $z \in \overline{Q}$. Then $yz \notin E(G)$.

Proof. We have $v \sim_{\overline{X}} y$ and $w \sim_X z$ by Corollary 3.2.8. Hence,

$$vw \in E(G) \iff vz \in E(G) \iff yz \in E(G). \quad\lrcorner$$

Claim 2. Let $y \in Q$ and $z \in \overline{P}$. Then $yz \in E(G)$.

Proof. We have $v' \sim_{\overline{X}} y$ and $w' \sim_X z$ by Corollary 3.2.8. Hence,

$$v'w' \in E(G) \iff v'z \in E(G) \iff yz \in E(G). \quad\lrcorner$$

Now $|N(v) \cap Q| = |N(v) \cap (Q \cup \overline{Q})| = |N(w') \cap (Q \cup \overline{Q})| \geq |Q|$ by Claim 1 and 2. This implies $Q \subseteq N(v)$ and in particular, $v \in N(v')$. Also, $\overline{P} \subseteq N(v')$ by Claim 2. Thus $|N(v') \cap (P \cup \overline{P})| \geq |\overline{P}| + 1$. Since $\chi^*(v') = \chi^*(w) = c'$ we conclude that $|N(w) \cap (P \cup \overline{P})| \geq |\overline{P}| + 1$. But on the other hand, $|N(w) \cap (P \cup \overline{P})| = |N(w) \cap \overline{P}| \leq |\overline{P}|$ by Claim 1. This is again a contradiction.

In order to be able to treat the connected components of the flipped graph independently we also need to argue that applying a flip function to two graphs neither changes the isomorphism problem nor the outcome of the Weisfeiler-Leman algorithm.

Lemma 3.2.11. Let $(G, \chi_G)$, $(H, \chi_H)$ be two colored graphs and let f be a flip function for G and H. Also let $\varphi\colon V(G) \to V(H)$ be a bijection. Then $\varphi\colon G \cong H$ if and only if $\varphi\colon G^f \cong H^f$.

Proof. Trivial.

Lemma 3.2.12. Let $G = (V_G, E_G, \chi_G)$, $H = (V_H, E_H, \chi_H)$ be two colored graphs and let f be a flip function for G and H. Also let $(\bar v, \bar w) \in V_G^\ell \times V_H^\ell$ be a position in the k-bijective pebble game $\mathrm{BP}_k(G, H)$ for $\ell \leq k$. Then Spoiler wins from the position $(\bar v, \bar w)$ in the game $\mathrm{BP}_k(G, H)$ if and only if Spoiler wins from $(\bar v, \bar w)$ in $\mathrm{BP}_k(G^f, H^f)$.

Proof. A position $(\bar v, \bar w)$ in the pebble game $\mathrm{BP}_k(G, H)$ is a winning position for Spoiler if and only if it is a winning position for Spoiler in the game $\mathrm{BP}_k(G^f, H^f)$.

Recall that two colorings $\chi, \chi'\colon V \to C$ are equivalent, denoted $\chi \equiv \chi'$, if the partition induced by the color classes is the same for both colorings.

Corollary 3.2.13. Let G = (V, E, χ) be a colored graph and let f be a flip function for G. Then $\chi^k_{\mathrm{WL}}[G] \equiv \chi^k_{\mathrm{WL}}[G^f]$.

Proof. This follows from Theorem 2.2.6 and Lemma 3.2.12.

For $\bar v = (v_1, \dots, v_k) \in V^k$ and $C \subseteq V$ we define the tuple $\bar v \cap C = (v_i)_{i \in I}$ where $I = \{i \in [k] \mid v_i \in C\}$. Also, for a second tuple $\bar w = (w_1, \dots, w_\ell) \in V^\ell$, we write $\bar v \subseteq \bar w$ if $\{v_1, \dots, v_k\} \subseteq \{w_1, \dots, w_\ell\}$.

Corollary 3.2.14. Let $G = (V_G, E_G, \chi_G)$, $H = (V_H, E_H, \chi_H)$ be two colored graphs and let f be a flip function for G and H. Let $\bar v \in V_G^k$ and $\bar w \in V_H^k$. Let $C_G$ be a connected component of $G^f$ such that $\chi_G(x) \neq \chi_G(y)$ for all $x \in C_G$, $y \in V_G \setminus C_G$, and similarly let $C_H$ be a connected component of $H^f$ such that $\chi_H(x) \neq \chi_H(y)$ for all $x \in C_H$, $y \in V_H \setminus C_H$. Suppose that

$$(G[C_G], \chi^1_{\mathrm{WL}}[G, \bar v]) \not\cong (H[C_H], \chi^1_{\mathrm{WL}}[H, \bar w])$$

(where both colorings are restricted to the vertex set of the induced subgraphs). Let $\bar v' = \bar v \cap C_G$ and $\bar w' = \bar w \cap C_H$. Then

$$(G[C_G], \chi^1_{\mathrm{WL}}[G, \bar v']) \not\cong (H[C_H], \chi^1_{\mathrm{WL}}[H, \bar w'])$$

or $(G, \bar v) \not\simeq_1 (H, \bar w)$.

Proof. Suppose $\bar v = (v_1, \dots, v_k)$ and $\bar w = (w_1, \dots, w_k)$. Let $I_G = \{i \in [k] \mid v_i \in C_G\}$ and $I_H = \{i \in [k] \mid w_i \in C_H\}$. Also assume $(G, \bar v) \simeq_1 (H, \bar w)$. Then $(G^f, \bar v) \simeq_1 (H^f, \bar w)$ by Lemma 3.2.12 and Corollary 2.2.7 and thus, $I_G = I_H$. Now suppose

$$\varphi\colon (G[C_G], \chi^1_{\mathrm{WL}}[G, \bar v']) \cong (H[C_H], \chi^1_{\mathrm{WL}}[H, \bar w']).$$

Since $I_G = I_H$ it follows that

$$\varphi\colon (G[C_G], \bar v) \cong (H[C_H], \bar w)$$

and, by Lemma 3.2.11,

$$\varphi\colon (G^f[C_G], \bar v) \cong (H^f[C_H], \bar w).$$

Now a simple inductive argument implies

$$\varphi\colon (G^f[C_G], \chi^1_{\mathrm{WL}}[G, \bar v]) \cong (H^f[C_H], \chi^1_{\mathrm{WL}}[H, \bar w])$$

since, in each iteration, the Color Refinement algorithm only takes colors of neighbors into account. Applying Lemma 3.2.11 once again gives the desired statement.

3.2.3 A Recursive Strategy for Spoiler

Recall that the goal of this section is to prove that the Weisfeiler-Leman dimension of graphs of rank-width at most k is bounded by a function of k. More precisely, given two non-isomorphic graphs G and H, where G has rank-width at most k, we give a winning strategy for Spoiler in the game $\mathrm{BP}_\ell(G, H)$ for $\ell = 3k + 5$.

Spoiler's strategy in the game is to play along a rank decomposition (T, γ) for the graph G. At a specific node t ∈ V(T) of the rank decomposition, Spoiler plays an ordered split pair $(\bar a, \bar b)$ for the set γ(t) and identifies a component C (with respect to a suitable flip function) that is different from the corresponding component (specified by the bijection played by Duplicator) in the second graph. In order to distinguish these components, Spoiler continues to play along the rank decomposition in a recursive fashion going down the tree.

The main problem that remains to be solved to realize this strategy is to ensure that Spoiler can remove the pebbles from an ordered split pair of t once Spoiler has pebbled ordered split pairs of the children of t. This problem is already partly solved by Corollary 3.2.14 stating that pebbles outside the components we are interested in can be removed without a problem. In order to ensure that no other pebbles need to be removed we introduce the notion of nice (triples of) split pairs.

Recall that for sets $X, X_1, X_2$ we write $X = X_1 \uplus X_2$ to denote that X is the disjoint union of $X_1$ and $X_2$, that is, $X = X_1 \cup X_2$ and $X_1 \cap X_2 = \emptyset$.

Definition 3.2.15 (Nice Triples of Split Pairs). Let G be a graph and $X, X_1, X_2 \subseteq V(G)$ such that $X = X_1 \uplus X_2$. Let (A, B) be a split pair for X and let $(A_i, B_i)$ be split pairs for $X_i$, i ∈ {1, 2}. The triple of split pairs (A, B), $(A_1, B_1)$ and $(A_2, B_2)$ is nice if

(N.1) $A \cap X_i \subseteq A_i$,

(N.2) $B_i \cap \overline{X} \subseteq B$, and

(N.3) $B_i \cap X_{3-i} \subseteq A_{3-i}$

for both i ∈ {1, 2}.

Naturally, a triple of ordered split pairs is nice if the underlying unordered triple of split pairs is nice.

Figure 3.2: Visualization for the proof of Lemma 3.2.16.

Lemma 3.2.16. Let G be a graph and $X, X_1, X_2 \subseteq V(G)$ such that $X = X_1 \uplus X_2$. Let (A, B) be a split pair for X. Then there are split pairs $(A_i, B_i)$ for $X_i$, i ∈ {1, 2}, such that the triple (A, B), $(A_1, B_1)$ and $(A_2, B_2)$ is nice.

For the proof recall the definition of split pairs (see Definition 3.2.6). A visualization of the situation is also given in Figure 3.2.

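Proof. First define the sets $A_i$ for both i ∈ {1, 2}. Since $X_i \subseteq X$ the set $\mathrm{vec}_{X_i}(A \cap X_i)$ is linearly independent by Observation 3.2.5. Hence, there is a set $A \cap X_i \subseteq A_i \subseteq X_i$ such that $\mathrm{vec}_{X_i}(A_i)$ is a linear basis for $\mathrm{vec}_{X_i}(X_i)$. So Property (N.1) is satisfied.

It remains to define the sets $B_i$ for both i ∈ {1, 2}. Without loss of generality consider the case i = 1. The set $\mathrm{vec}_{X_2}(A_2)$ spans every element in the set $\mathrm{vec}_{X_2}(X_2) \subseteq \mathbb{F}_2^{X_1 \cup \overline{X}}$. Hence, $\mathrm{vec}_{\overline{X_1}}(A_2)$ spans every element in the set $\mathrm{vec}_{\overline{X_1}}(X_2) \subseteq \mathbb{F}_2^{X_1}$.

Moreover, the set $\mathrm{vec}_{\overline{X}}(B)$ spans every element in the set $\mathrm{vec}_{\overline{X}}(\overline{X}) \subseteq \mathbb{F}_2^{X_1 \cup X_2}$. So $\mathrm{vec}_{\overline{X_1}}(B)$ spans every element in the set $\mathrm{vec}_{\overline{X_1}}(\overline{X}) \subseteq \mathbb{F}_2^{X_1}$.

Together this means $\mathrm{vec}_{\overline{X_1}}(B \cup A_2)$ spans every element in the set $\mathrm{vec}_{\overline{X_1}}(\overline{X_1}) \subseteq \mathbb{F}_2^{X_1}$. So there exists a set $B_1 \subseteq B \cup A_2$ such that $\mathrm{vec}_{\overline{X_1}}(B_1)$ is linearly independent and spans every element in the set $\mathrm{vec}_{\overline{X_1}}(\overline{X_1})$. In particular, Properties (N.2) and (N.3) are satisfied for i = 1.

Finally, we shall also need the following simple observation.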

Observation 3.2.17. Let G, H be two non-isomorphic graphs and let σ: V(G) → V(H) be any bijection. Then there is some v ∈ V(G) such that $G[A] \not\cong H[B]$ where A is the vertex-set of the connected component of G such that v ∈ A and B is the vertex-set of the connected component of H such that σ(v) ∈ B.

Theorem 3.2.18. The (3k + 4)-dimensional Weisfeiler-Leman algorithm identifies every graph of rank-width at most k.

Proof. Let $G = (V_G, E_G, \chi_G)$ and $H = (V_H, E_H, \chi_H)$ be two colored graphs such that rw(G) ≤ k and $G \not\cong H$. Also let (T, γ) be a rank decomposition of G of width k. We prove that Spoiler wins the bijective ℓ-pebble game played over the graphs G and H where ℓ = 3k + 5. Together with Corollary 2.2.7 this implies the statement of the theorem. To be more precise, it is first argued that Spoiler has a winning strategy that requires ℓ = 6k + 5 many pebbles. Afterwards it is explained how to realize this strategy using only 3k + 5 many pebbles making use of properties of nice split pairs.

Let t ∈ V(T) be a node of the rank decomposition. A tuple $(\bar a, \bar b)$ is called an ordered split pair for t if $(\bar a, \bar b)$ is an ordered split pair for γ(t). We describe Spoiler's winning strategy in an inductive fashion. Throughout the play we assume that Spoiler preserves the following invariant at positions $((\bar a, \bar b, v), (\bar a', \bar b', v'))$:

(I.1) There is a node t ∈ V(T) such that $(\bar a, \bar b)$ is an ordered split pair for t.

(I.2) v ∈ γ(t).

(I.3) Let f be the flip function obtained from Lemma 3.2.10 with respect to X = γ(t). Let $C \in \mathrm{Comp}((G, \chi^1_{\mathrm{WL}}[G, \bar a, \bar b]), f)$ such that v ∈ C. Similarly let $C' \in \mathrm{Comp}((H, \chi^1_{\mathrm{WL}}[H, \bar a', \bar b']), f)$ such that v′ ∈ C′. Then

$(G[C], \chi^1_{\mathrm{WL}}[G, \bar a, \bar b]) \not\cong (H[C'], \chi^1_{\mathrm{WL}}[H, \bar a', \bar b'])$

(where, as before, the colorings are restricted to the vertex set of the respective graph).

Note that initially it is easy for Spoiler to reach such a position for the root node r of T. Indeed, (∅, ∅) is a split pair for γ(r) = V(G) and the flip function f obtained from Lemma 3.2.10 always evaluates to zero (i.e., no edges are flipped). So Spoiler simply may choose a pair of vertices (v, v′) satisfying Property (I.3) using Observation 3.2.17. Also observe that in a position as described above the number of pebbles is at most 2k + 1. We now prove by induction on |γ(t)| that Spoiler wins from such a position.
In the base step |γ(t)| = 1. This means |C| = 1 and Spoiler easily wins using two additional pebbles. Indeed, if |C′| = 1 then Spoiler wins immediately. Otherwise |C′| > 1. In this case Spoiler wins (using two additional pebbles) since the sets C and C′ can be recognized by the Color Refinement algorithm because one of the vertices in each set is individualized (cf. Corollary 3.2.13).

For the inductive step assume |γ(t)| > 1 and let $t_1, t_2$ be the children of t in the rooted tree T. Let X = γ(t), $X_1 = \gamma(t_1)$ and $X_2 = \gamma(t_2)$. Note that $X = X_1 \uplus X_2$. By Lemma 3.2.16 there are ordered split pairs $(\bar a_i, \bar b_i)$ for $X_i$ such that the triple $(\bar a, \bar b)$, $(\bar a_1, \bar b_1)$ and $(\bar a_2, \bar b_2)$ is nice. On an intuitive level, the central advantage of pebbling nice triples of ordered split pairs is that, for i ∈ {1, 2}, Spoiler can remove the pebbles $(\bar a, \bar b)$ and $(\bar a_{3-i}, \bar b_{3-i})$ without unpebbling any element of $X_i$. Also let $f_i$ be the flip function obtained from Lemma 3.2.10 with respect to the ordered split pair $(\bar a_i, \bar b_i)$ and the set $X_i$.

Now Spoiler plays pebbles on $(\bar a_1, \bar b_1, \bar a_2, \bar b_2)$ and let $(\bar a_1', \bar b_1', \bar a_2', \bar b_2')$ be Duplicator's answer. Let $\bar\alpha = (\bar a, \bar b, \bar a_1, \bar b_1, \bar a_2, \bar b_2, v)$ and $\bar\alpha' = (\bar a', \bar b', \bar a_1', \bar b_1', \bar a_2', \bar b_2', v')$ be lists of all vertices pebbled at this point in the two graphs G and H. In the next step, Spoiler wishes to play another pebble. Let σ: V(G) → V(H) be the bijection chosen by Duplicator. Without loss of generality we can assume that

(a) $\sigma(\bar\alpha) = \bar\alpha'$, and

(b) σ(C) = C′

(if (a) is violated Spoiler can win immediately, if (b) is violated Spoiler also wins easily using two additional pebbles since the Color Refinement algorithm recognizes C and C′). Additionally, without loss of generality suppose that $v \in X_1$ (otherwise we swap the roles of $X_1$ and $X_2$).

Let $G' = (G[C], \chi^1_{\mathrm{WL}}[G, \bar\alpha])$ and $H' = (H[C'], \chi^1_{\mathrm{WL}}[H, \bar\alpha'])$. Observe that $G' \not\cong H'$ and σ induces a bijection from V(G′) = C to V(H′) = C′.

First consider the flip function $f_1$. By Observation 3.2.17 and Lemma 3.2.11 there is some w ∈ C such that $G'[M] \not\cong H'[M']$ where $M \in \mathrm{Comp}(G', f_1)$ such that w ∈ M and $M' \in \mathrm{Comp}(H', f_1)$ such that σ(w) ∈ M′. Observe that, formally, it is not possible to apply the flip function $f_1$ to the graph G′ since the colorings do not match. However, $\chi^1_{\mathrm{WL}}[G, \bar\alpha] \preceq \chi^1_{\mathrm{WL}}[G, \bar a_1, \bar b_1]$ and thus, the flip function $f_1$ naturally translates to a flip function for G′ resulting in the same flipped graph. Note that M ⊆ C ⊆ X. Also note that $M \subseteq X_1$ or $M \cap X_1 = \emptyset$ by Lemma 3.2.10.

Case 1: $M \subseteq X_1$.
Let $C_1 \in \mathrm{Comp}((G, \chi^1_{\mathrm{WL}}[G, \bar a_1, \bar b_1]), f_1)$ be the unique set such that $M = C_1 \cap C$. Similarly let $C_1' \in \mathrm{Comp}((H, \chi^1_{\mathrm{WL}}[H, \bar a_1', \bar b_1']), f_1)$ such that $M' = C_1' \cap C'$ be the corresponding component in the second graph H. Note that $\chi^1_{\mathrm{WL}}[G, \bar\alpha](u) \neq \chi^1_{\mathrm{WL}}[G, \bar\alpha](u')$ for all u ∈ C and u′ ∈ V(G) \ C. This is clear for the graph $G^f$ since v ∈ C and C forms a connected component in $G^f$ and thus, it also holds for G by Corollary 3.2.13. Hence, it follows that

$(G[C_1], \chi^1_{\mathrm{WL}}[G, \bar\alpha]) \not\cong (H[C_1'], \chi^1_{\mathrm{WL}}[H, \bar\alpha'])$.

Now Spoiler plays the next pebble as follows: if $v \in C_1$ and $v' \in C_1'$ then he plays z = v and z′ = v′, otherwise Spoiler plays z = w and z′ = σ(w). Clearly,

$(G[C_1], \chi^1_{\mathrm{WL}}[G, \bar\alpha, z]) \not\cong (H[C_1'], \chi^1_{\mathrm{WL}}[H, \bar\alpha', z'])$.

Now consider again the flip function $f_1$. In $(G, \chi^1_{\mathrm{WL}}[G, \bar a_1, \bar b_1])^{f_1}$ the set $C_1$ forms a connected component and similarly, in $(H, \chi^1_{\mathrm{WL}}[H, \bar a_1', \bar b_1'])^{f_1}$ the set $C_1'$ forms a connected component. By Corollary 3.2.14 Spoiler can remove every pebble occupying vertices outside $C_1$ (resp. $C_1'$) while maintaining the fact that the corresponding subgraphs are non-isomorphic. Also, there is clearly no need to pebble any vertex multiple times. More formally, since $\bar\alpha \cap C_1 \subseteq (\bar a_1, \bar b_1, z)$, it holds that

$(G[C_1], \chi^1_{\mathrm{WL}}[G, \bar a_1, \bar b_1, z]) \not\cong (H[C_1'], \chi^1_{\mathrm{WL}}[H, \bar a_1', \bar b_1', z'])$

or Spoiler wins the game using two additional pebbles by Corollary 3.2.14. Hence, Spoiler's next move is to remove all pebbles $(\bar a, \bar b, \bar a_2, \bar b_2, v)$ and $(\bar a', \bar b', \bar a_2', \bar b_2', v')$.

But now the invariant holds for the node $t_1$, i.e., (I.1), (I.2) and (I.3) are satisfied for the node $t_1$. Hence, by the induction hypothesis, Spoiler wins the game from the current position.

Case 2: $M \cap X_1 = \emptyset$, i.e., $M \subseteq X_2$.
This case is slightly more complicated since the set M is defined with respect to the flip function $f_1$, but is contained in the set $X_2$ which is split from the rest of the graph using the flip function $f_2$. In this case Spoiler first plays the next pair of pebbles on the vertices w and w′ = σ(w). Note that $\chi^1_{\mathrm{WL}}[G, \bar\alpha, w](u) \neq \chi^1_{\mathrm{WL}}[G, \bar\alpha, w](u')$ for all u ∈ M and u′ ∈ V(G) \ M using Corollary 3.2.13. Now Spoiler plays another pebble. Let σ′: V(G) → V(H) be the bijection chosen by Duplicator. Without loss of generality suppose that

(a) $\sigma'(\bar\alpha) = \bar\alpha'$,

(b) σ′(w) = w′,

(c) σ′(M) = M′

(if one of the conditions is violated Spoiler wins the game using similar arguments as before). Let $G'' = (G[M], \chi^1_{\mathrm{WL}}[G, \bar\alpha, w])$ and $H'' = (H[M'], \chi^1_{\mathrm{WL}}[H, \bar\alpha', w'])$. Observe that $G'' \not\cong H''$ and σ′ induces a bijection from V(G″) = M to V(H″) = M′. Consider the flip function $f_2$. By Observation 3.2.17 and Lemma 3.2.11 there is some z ∈ M such that $G''[N] \not\cong H''[N']$ where $N \in \mathrm{Comp}(G'', f_2)$ such that z ∈ N and $N' \in \mathrm{Comp}(H'', f_2)$ such that σ′(z) ∈ N′.

Observe that $N \subseteq M \subseteq X_2$. Let $C_2 \in \mathrm{Comp}((G, \chi^1_{\mathrm{WL}}[G, \bar a_2, \bar b_2]), f_2)$ such that $N = C_2 \cap M$ and let $C_2' \in \mathrm{Comp}((H, \chi^1_{\mathrm{WL}}[H, \bar a_2', \bar b_2']), f_2)$ such that $N' = C_2' \cap M'$. Then

$(G[C_2], \chi^1_{\mathrm{WL}}[G, \bar\alpha, w]) \not\cong (H[C_2'], \chi^1_{\mathrm{WL}}[H, \bar\alpha', w'])$.

Now Spoiler plays the next pebble as follows: if $w \in C_2$ and $w' \in C_2'$ then he plays x = w and x′ = w′, otherwise Spoiler plays x = z and x′ = σ′(z). Clearly,

$(G[C_2], \chi^1_{\mathrm{WL}}[G, \bar\alpha, w, x]) \not\cong (H[C_2'], \chi^1_{\mathrm{WL}}[H, \bar\alpha', w', x'])$.

Now the argument is similar to the previous case. Consider again the flip function $f_2$. In $(G, \chi^1_{\mathrm{WL}}[G, \bar a_2, \bar b_2])^{f_2}$ the set $C_2$ forms a connected component and similarly, in $(H, \chi^1_{\mathrm{WL}}[H, \bar a_2', \bar b_2'])^{f_2}$ the set $C_2'$ forms a connected component. By Corollary 3.2.14 Spoiler can remove every pebble occupying vertices outside $C_2$ (resp. $C_2'$) while maintaining the fact that the corresponding subgraphs are non-isomorphic. Also, there is clearly no need to pebble any vertex multiple times. More formally, since $(\bar\alpha, w) \cap C_2 \subseteq (\bar a_2, \bar b_2, x)$ (recall that $v \in X_1$ and therefore $v \notin C_2$) it holds that

$(G[C_2], \chi^1_{\mathrm{WL}}[G, \bar a_2, \bar b_2, x]) \not\cong (H[C_2'], \chi^1_{\mathrm{WL}}[H, \bar a_2', \bar b_2', x'])$

or Spoiler wins the game using two additional pebbles by Corollary 3.2.14. Hence, Spoiler's next move is to remove all pebbles $(\bar a, \bar b, \bar a_1, \bar b_1, v, w)$ and $(\bar a', \bar b', \bar a_1', \bar b_1', v', w')$.

But now the invariant holds for the node $t_2$, i.e., (I.1), (I.2) and (I.3) are satisfied for the node $t_2$. Hence, by the induction hypothesis, Spoiler wins the game from the current position.

Overall, by the induction principle, this results in a winning strategy for Spoiler in the pebble game played over the graphs G and H.

It remains to analyze the number of pebbles required to implement this strategy. Looking at Spoiler's strategy, it is not difficult to see that it requires at most 6k + 5 many pebbles. More precisely, Spoiler needs 6k pebbles to pebble the three ordered split pairs $(\bar a, \bar b)$ and $(\bar a_i, \bar b_i)$ for i ∈ {1, 2}. The base step requires three additional pebbles. In the inductive step, five additional pebbles suffice, three for pebbling v, w and x and two pebbles to simulate the Color Refinement algorithm in case the bijections chosen by Duplicator do not match up. However, taking a closer look, some vertices are always pebbled multiple times due to the nice ordered split pairs. More precisely, from Condition (N.1) it follows that $\bar a \subseteq \bar a_1 \cup \bar a_2$. Also, Conditions (N.2) and (N.3) imply that $\bar b_1 \cup \bar b_2 \subseteq \bar b \cup \bar a_1 \cup \bar a_2$. So overall, the split pairs pebble at most 3k different vertices. Since there is no need to pebble any vertex multiple times, the strategy described above can actually be implemented using only 3k + 5 many pebbles.

The next two corollaries state the main algorithmic consequences of Theorem 3.2.18 for the isomorphism and canonization problem for graphs of bounded rank-width.

Corollary 3.2.19. The Graph Isomorphism Problem for graphs of rank-width at most k can be solved in time $O(n^{3k+5} \log n)$.

Proof. This follows from Theorem 2.2.1 and 3.2.18.

Corollary 3.2.20. There is an algorithm canonizing graphs of rank-width at most k in time $O(n^{3k+7} \log n)$.

Proof. This follows from Theorem 2.2.5 and 3.2.18.

Finally, observe the same results also hold for graphs of clique-width at most k by Theorem 3.2.2.

3.3 Further Results

Recall that the goal of this chapter is to identify graph classes whose Weisfeiler-Leman dimension is finite. Up to this point we have proved this for all graph classes of bounded tree-width and, more generally, for all graph classes of bounded rank-width. I finish this chapter by stating further important results in this direction.

One of the most well-studied graph classes is the class of planar graphs. A graph G is planar if it can be drawn in the plane without edge crossings. While the first proof that the Weisfeiler-Leman dimension of planar graphs is finite was given by Grohe [61] yielding an upper bound of 14 [134], only recently, Kiefer, Ponomarenko and Schweitzer presented an improved upper bound that is almost optimal.

Proposition 3.3.1 (Kiefer, Ponomarenko, Schweitzer [90]). The 3-dimensional Weisfeiler-Leman algorithm identifies every planar graph.

Indeed, it is well known that the Color Refinement algorithm fails to identify every planar graph which means that the Weisfeiler-Leman dimension of the class of planar graphs is either two or three.

The last result is generalized by Grohe and Kiefer to graphs of bounded genus. The genus of a graph G is the smallest number g such that G is embeddable on a surface of Euler genus at most g. Note that planar graphs have Euler genus 0.

Proposition 3.3.2 (Grohe, Kiefer [65]). The (4g + 3)-dimensional Weisfeiler-Leman algorithm identifies every graph of Euler genus at most g.

Let G be a graph. A graph H is a minor of G if H can be obtained from G by deleting vertices, deleting edges, and contracting edges. A graph class C excludes H as a minor if no graph G ∈ C has H as a minor. A graph class C is closed under minors if for every G ∈ C and every minor H of G it holds that H ∈ C. Observe that every graph class that is closed under minors and does not contain every graph excludes some fixed graph as a minor. For example, this includes planar graphs and more generally graph classes of bounded genus. Moreover, this also includes every graph class of bounded tree-width.

Proposition 3.3.3 (Grohe [63]). Let C be a graph class that excludes a fixed graph as a minor. Then there exists some k ∈ ℕ such that the k-dimensional Weisfeiler-Leman algorithm identifies every graph G ∈ C.

In particular, every graph class that excludes a fixed graph as a minor admits a polynomial time graph isomorphism test. The last proposition provides a large collection of graph classes that have finite Weisfeiler-Leman dimension. However, every graph class that excludes a fixed graph as a minor only contains graphs that have a linear number of edges.

A collection of dense classes that have finite Weisfeiler-Leman dimension are classes of bounded rank-width (see Theorem 3.2.18). Besides this result, there are only few examples of graph classes also containing dense graphs that have finite Weisfeiler-Leman dimension. One such example is the class of interval graphs, that is, intersection graphs of intervals on the real line. More formally, a graph G is an interval graph if for every v ∈ V(G) there is an interval $I_v = [i, j] := \{i, \dots, j\} \subseteq \mathbb{N}$ such that vw ∈ E(G) if and only if $I_v \cap I_w \neq \emptyset$.

Proposition 3.3.4 (Evdokimov, Ponomarenko, Tinhofer [51, 50]). The 2-dimensional Weisfeiler-Leman algorithm identifies every interval graph.

Chapter 4

Lower Bounds

In the previous chapter several positive results have been presented bounding the Weisfeiler-Leman dimension of certain graph classes. In this chapter, these results are complemented by analyzing the limits of purely combinatorial approaches to the Graph Isomorphism Problem. Towards this end, we first analyze the Weisfeiler-Leman algorithm and give lower bounds on the Weisfeiler-Leman dimension for graphs of bounded tree-width and rank-width that are only a small constant factor away from the upper bounds presented before. Both lower bounds are based on a well-known construction of Cai, Fürer and Immerman [31] showing that the Weisfeiler-Leman algorithm fails to decide the Graph Isomorphism Problem unless its dimension is linear in the number of vertices of the input graphs. Moreover, we also present exponential lower bounds on the worst-case complexity of algorithms within the individualization-refinement framework.

4.1 Weisfeiler-Leman Algorithm

We start by presenting lower bounds on the Weisfeiler-Leman dimension. More precisely, we continue the analysis of the Weisfeiler-Leman dimension of graphs of bounded tree-width and rank-width which started in the previous chapter by presenting meaningful upper bounds. Towards this end, we start by reviewing the celebrated result of Cai, Fürer and Immerman stating that the k-dimensional Weisfeiler-Leman algorithm fails to decide isomorphism of n-vertex graphs unless k = Ω(n).

Theorem 4.1.1 (Cai, Fürer, Immerman [31]). For every natural number k ≥ 1 there are non-isomorphic 3-regular graphs $G_k$ and $H_k$ such that $|V(G_k)| = |V(H_k)| = O(k)$ and $G_k \simeq_k H_k$.

Since the tree-width of every n-vertex graph is upper bounded by n this result already implies that the Weisfeiler-Leman dimension of the class of graphs of tree-width at most k is in Ω(k). Although this result is optimal up to a constant factor it does not give us a meaningful bound on the constants involved. Indeed, the tree-width of a graph often is much smaller than the number of vertices of a graph which indicates that more meaningful bounds might be possible. Towards this end, we first review the Cai-Fürer-Immerman construction which forms the basis of Theorem 4.1.1.

The Cai-Fürer-Immerman Gadget. For a non-empty finite set S we define the CFI gadget $X_S$ to be the following graph. For each w ∈ S there are vertices a(w) and b(w) and for every A ⊆ S with |A| even there is a vertex $m_A$. For every A ⊆ S with |A| even there are edges $\{a(w), m_A\} \in E(X_S)$ for all w ∈ A and $\{b(w), m_A\} \in E(X_S)$ for all w ∈ S \ A. As an example the graph $X_3 := X_{[3]}$ is depicted in Figure 4.1. The graph is colored so that {a(w), b(w)} forms a color class for each w and so that $\{m_A \mid A \subseteq S \text{ and } |A| \text{ even}\}$ forms a color class. Let X ⊆ S and $\gamma \in \mathrm{Aut}(X_S)$. We say that γ swaps exactly the pairs of X if γ(a(w)) = b(w) for w ∈ X and γ(a(w)) = a(w) for w ∈ S \ X.

Figure 4.1: Cai-Fürer-Immerman gadget $X_3$.
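The gadget definition translates directly into code. The following Python sketch builds the gadget for a small set S and tests one explicit candidate map, namely the one that swaps a(w) and b(w) for w ∈ X and sends m_A to m_{A △ X}. The vertex encoding `("a", w)`, `("b", w)`, `("m", A)` and all function names are illustrative assumptions. The sketch only checks this particular map and is not a proof of the uniqueness statement.

```python
from itertools import combinations

def even_subsets(S):
    """All subsets of S of even cardinality, as frozensets."""
    S = sorted(S)
    return [frozenset(c) for r in range(0, len(S) + 1, 2)
            for c in combinations(S, r)]

def cfi_gadget(S):
    """Return (vertices, edges) of the gadget X_S; edges are frozensets."""
    S = frozenset(S)
    vertices = {("a", w) for w in S} | {("b", w) for w in S}
    edges = set()
    for A in even_subsets(S):
        vertices.add(("m", A))
        for w in S:
            side = "a" if w in A else "b"   # a(w) if w in A, else b(w)
            edges.add(frozenset({(side, w), ("m", A)}))
    return vertices, edges

def swap(X):
    """Candidate map: swap a(w), b(w) for w in X; send m_A to m_{A xor X}."""
    X = frozenset(X)
    flip = {"a": "b", "b": "a"}
    def gamma(u):
        kind, x = u
        if kind == "m":
            return ("m", x ^ X)
        return (flip[kind], x) if x in X else u
    return gamma

def is_automorphism(gamma, vertices, edges):
    """True if gamma maps the vertex set and the edge set onto themselves."""
    image = {gamma(u) for u in vertices}
    mapped = {frozenset(gamma(u) for u in e) for e in edges}
    return image == vertices and mapped == edges
```

For S = {1, 2, 3} the gadget has 2·3 + 4 = 10 vertices and 4·3 = 12 edges. The candidate map for X = {1, 2} is an automorphism, while for X = {1} it sends m_A to a set of odd size and therefore fails.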

Lemma 4.1.2 ([31]). Let X ⊆ S. Then there is an automorphism $\gamma \in \mathrm{Aut}(X_S)$ swapping exactly the pairs of X if and only if |X| is even. Additionally, if such an automorphism exists, it is unique.

The Cai-Fürer-Immerman Graphs. Let G be a connected graph of minimum degree two, i.e., deg(v) ≥ 2 for all v ∈ V(G). For T ⊆ E(G) we define the graph $\mathrm{CFI}_T(G)$ to be the graph obtained from G in the following way. Each v ∈ V(G) is replaced by a gadget $X_{E(v)}$ where $E(v) := \{(v, w) \mid vw \in E(G)\}$ denotes the set of (directed) edges incident to v. Additionally, the following edges are added between the gadgets. For each vw ∈ E(G) \ T there are edges from a(v, w) to a(w, v) and from b(v, w) to b(w, v). Also, for every vw ∈ T there are edges from a(v, w) to b(w, v) and from b(v, w) to a(w, v).
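The construction can be made concrete with a short Python sketch. It builds the graph for a base graph given as an adjacency dictionary and a set T of twisted base edges. The vertex encoding and the function name `cfi_graph` are illustrative assumptions, and the gadget vertex m_A additionally carries the base vertex v so that the gadgets stay disjoint.

```python
from itertools import combinations

def cfi_graph(adj, twisted=()):
    """Build CFI_T(G) for a base graph G given as {vertex: set_of_neighbours}.

    `twisted` is the set T of base edges, each given as a pair {v, w}.
    Returns (vertices, edges) with edges as frozensets of two vertices.
    """
    twisted = {frozenset(e) for e in twisted}
    vertices, edges = set(), set()
    for v, nbrs in adj.items():
        E_v = [(v, w) for w in sorted(nbrs)]      # directed edges at v
        for e in E_v:
            vertices |= {("a", e), ("b", e)}
        for r in range(0, len(E_v) + 1, 2):       # even subsets A of E(v)
            for A in map(frozenset, combinations(E_v, r)):
                m = ("m", v, A)
                vertices.add(m)
                for e in E_v:
                    side = "a" if e in A else "b"
                    edges.add(frozenset({(side, e), m}))
    for v, nbrs in adj.items():                   # edges between gadgets
        for w in nbrs:
            # untwisted: a-a and b-b; twisted: a-b and b-a
            x, y = "ba" if frozenset({v, w}) in twisted else "ab"
            edges.add(frozenset({("a", (v, w)), (x, (w, v))}))
            edges.add(frozenset({("b", (v, w)), (y, (w, v))}))
    return vertices, edges
```

For the complete graph on four vertices as base graph, each gadget has 2·3 + 4 = 10 vertices, so the result has 40 vertices and 4·12 + 2·6 = 60 edges, and every vertex has degree 3. Twisting a single edge keeps the vertex set and the edge count but changes the edge set.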

Lemma 4.1.3 ([31]). Let G be a connected graph of minimum degree two and S, T ⊆ E(G). Then $\mathrm{CFI}_S(G) \cong \mathrm{CFI}_T(G)$ if and only if |S| ≡ |T| mod 2.

Hence, applying the above construction to a specific graph G yields a pair of non-isomorphic graphs $\mathrm{CFI}(G) := \mathrm{CFI}_\emptyset(G)$ and $\widetilde{\mathrm{CFI}}(G) := \mathrm{CFI}_{\{e\}}(G)$ for some e ∈ E(G). It is this pair of non-isomorphic graphs which cannot be distinguished by the k-dimensional Weisfeiler-Leman algorithm for a suitable choice of the base graph G. More precisely, in [31] the authors prove that if G has no separator S of size k + 1 such that every component of G − S has at most |V|/2 vertices, then $\mathrm{CFI}(G) \simeq_k \widetilde{\mathrm{CFI}}(G)$. The existence of 3-regular graphs of this type with a linear number of vertices follows from the existence of 3-regular expander graphs (see, e.g., [1]). In combination, this proves Theorem 4.1.1.

The above already provides a useful sufficient condition for the base graph G in order to fulfill $\mathrm{CFI}(G) \simeq_k \widetilde{\mathrm{CFI}}(G)$. However, a more rigorous analysis reveals that it actually suffices for the graph G to have tree-width at least k + 1.

Theorem 4.1.4 (Dawar, Richerby [43]). Let G be a connected graph such that tw(G) ≥ k + 1 and deg(v) ≥ 2 for all v ∈ V(G). Then $\mathrm{CFI}(G) \simeq_k \widetilde{\mathrm{CFI}}(G)$.

This theorem immediately implies that the Weisfeiler-Leman dimension of the class of graphs of tree-width at most k is strictly greater than $\frac{k}{10} - 1$. Indeed, for a graph G of tree-width k and maximum degree d it is easy to see that $\mathrm{tw}(\mathrm{CFI}(G)) \le \mathrm{tw}(G) \cdot (2d + 2^{d-1})$ by replacing every vertex v ∈ V(G) in a tree decomposition of G by the vertices from the gadget $X_{E(v)}$ (see also [43, Lemma 5]). Since there exist connected 3-regular graphs of arbitrary tree-width this implies the stated lower bound.

However, this lower bound is still quite far from the upper bound derived in the previous chapter. In the following an improved bound is presented by providing a better analysis of the tree-width of CFI(G) for certain base graphs G. Indeed, based on the last theorem, it suffices to find graphs G of tree-width k + 1 such that one can find a good upper bound on the tree-width of $\mathrm{CFI}(G)$ and $\widetilde{\mathrm{CFI}}(G)$. A natural and well-known candidate for a graph of tree-width k is the k × k grid.

For k ≥ 2 let $G_{k,k}$ denote the k × k grid. Moreover let $G^+_{k,k}$ be the k × k grid where each edge is subdivided twice. Formally,

$V(G_{k,k}) := [k] \times [k]$ and
$E(G_{k,k}) := \{(i, j)(i', j') \mid (i = i' \wedge |j - j'| = 1) \vee (j = j' \wedge |i - i'| = 1)\}$.

Moreover

$V(G^+_{k,k}) := V(G_{k,k}) \cup \{(v, w) \mid vw \in E(G_{k,k})\}$ and
$E(G^+_{k,k}) := \{v(v, w) \mid v \in V(G_{k,k}), vw \in E(G_{k,k})\} \cup \{(v, w)(w, v) \mid vw \in E(G_{k,k})\}$.

It is well-known that the tree-width of a k × k grid is $\mathrm{tw}(G_{k,k}) = k$. Also, using essentially the same tree decomposition, it holds that $\mathrm{tw}(G^+_{k,k}) = k$. However, for the aim of bounding the tree-width of the graphs $\mathrm{CFI}(G_{k,k})$ and $\widetilde{\mathrm{CFI}}(G_{k,k})$ the first step is to construct a tree decomposition for $G^+_{k,k}$ satisfying some additional properties. The main intuition is that each subdivision vertex (i.e., each vertex added for subdividing an edge) is replaced by two vertices in the CFI-construction. On the other hand, the original vertices of the grid are typically replaced by eight vertices (assuming the vertex has degree 4). Hence, the goal is to find a tree decomposition of $G^+_{k,k}$ where the large bags only contain subdivision vertices. This way, when building a tree decomposition for CFI(G) in the natural way from a tree decomposition for G, the size of the largest bag only increases by a factor of two.
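As a small illustration of the two definitions, the following Python sketch constructs the grid and its twice-subdivided version as explicit vertex and edge sets. The encoding (grid vertices as pairs of integers, subdivision vertices as pairs of grid vertices) follows the formal definition; the function names are illustrative.

```python
def grid(k):
    """Vertices and edges of the k x k grid G_{k,k} on [k] x [k]."""
    V = {(i, j) for i in range(1, k + 1) for j in range(1, k + 1)}
    E = {frozenset({(i, j), (i2, j2)})
         for (i, j) in V
         for (i2, j2) in ((i + 1, j), (i, j + 1))
         if (i2, j2) in V}
    return V, E

def subdivided_grid(k):
    """G^+_{k,k}: each grid edge vw becomes the path v, (v,w), (w,v), w."""
    V, E = grid(k)
    sub = {(v, w) for e in E for v in e for w in e if v != w}
    edges = {frozenset({v, (v, w)}) for (v, w) in sub}
    edges |= {frozenset({(v, w), (w, v)}) for (v, w) in sub}
    return V | sub, edges
```

The grid has k² vertices and 2k(k − 1) edges, so the subdivided grid has k² + 4k(k − 1) vertices and 6k(k − 1) edges; for k = 3 this gives 33 vertices and 36 edges.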

Lemma 4.1.5. Let k ≥ 2. Then there is a tree decomposition (T, β) of $G^+_{k,k}$ of width k + 2 such that

1. $|\beta(t) \cap V(G_{k,k})| \le 1$ for every t ∈ V(T), and

2. if $|\beta(t) \cap V(G_{k,k})| = 1$ then $\beta(t) = E(v) \cup \{v\}$ for some $v \in V(G_{k,k})$ where $E(v) = \{(v, w) \mid vw \in E(G_{k,k})\}$. In this case t is a leaf of T and $\beta(s) \cap V(G_{k,k}) = \emptyset$ for the unique s ∈ V(T) with st ∈ E(T).

Figure 4.2: Visualization of the sets $A_{i,j}$, $B_{i,j}$ and $C_{i,j}$ constructed in the proof of Lemma 4.1.5.

Proof. In order to describe the bags of the tree decomposition several sets $A_{i,j}, B_{i,j}, C_{i,j} \subseteq V(G^+_{k,k})$ for i, j ∈ [k] are defined first (see also Figure 4.2). Let

$A_{i,j} := \{((i', j), (i', j + 1)) \mid 1 \le i' \le i\} \cup \{((i', j), (i', j - 1)) \mid i \le i' \le k\} \cup \{((i, j), (i + 1, j)), ((i, j), (i - 1, j))\}$,

$B_{i,j} := \{((i', j), (i', j + 1)) \mid 1 \le i' \le i\} \cup \{((i', j), (i', j - 1)) \mid i < i' \le k\} \cup \{((i, j), (i + 1, j)), ((i + 1, j), (i, j))\}$ and

$C_{i,j} := \{((i', j), (i', j - 1)) \mid 1 \le i' \le i\} \cup \{((i', j - 1), (i', j)) \mid i \le i' \le k\}$.

(Formally, the sets defined above may also contain elements outside of $V(G^+_{k,k})$ if some index is not contained in the set [k]. In this case, the corresponding element is simply not part of the set.) Now define

$V(T) := \{t^A_{i,j}, t^B_{i,j}, t^C_{i,j}, t^D_{i,j} \mid i, j \in [k]\}$.

Also set

$\beta(t^A_{i,j}) := A_{i,j}$,
$\beta(t^B_{i,j}) := B_{i,j}$,
$\beta(t^C_{i,j}) := C_{i,j}$,
$\beta(t^D_{i,j}) := E(i, j) \cup \{(i, j)\}$.

Observe that each bag contains at most k + 3 many elements. It remains to define the edges of the tree T. The following edges are added to the set E(T):

• $t^C_{i,j} t^C_{i+1,j}$ for all i ∈ [k − 1], j ∈ [k],

• $t^C_{k,j} t^A_{1,j}$ for all j ∈ [k],

• $t^A_{i,j} t^B_{i,j}$ for all i, j ∈ [k],

• $t^A_{i,j} t^D_{i,j}$ for all i, j ∈ [k],

• $t^B_{i,j} t^A_{i+1,j}$ for all i ∈ [k − 1], j ∈ [k], and

• $t^B_{k,j} t^C_{1,j+1}$ for all j ∈ [k − 1].

It can be easily verified that (T, β) defines a tree decomposition of $G^+_{k,k}$ with the desired properties.
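The verification left to the reader can also be carried out mechanically for small k. The following Python sketch builds the bags and tree edges as read from the proof above and checks the tree-decomposition axioms together with the width bound and Property 1. The node encoding `("A", i, j)` etc. and all function names are illustrative assumptions; the check is a sanity test for small k, not a proof.

```python
from itertools import product

def subdivided_grid(k):
    """Return (grid vertices, subdivision vertices, edges) of G^+_{k,k}."""
    grid = {(i, j) for i, j in product(range(1, k + 1), repeat=2)}
    sub = {(v, w) for v in grid for w in grid
           if abs(v[0] - w[0]) + abs(v[1] - w[1]) == 1}
    edges = {frozenset({v, (v, w)}) for (v, w) in sub}
    edges |= {frozenset({(v, w), (w, v)}) for (v, w) in sub}
    return grid, sub, edges

def decomposition(k):
    """Bags and tree edges following the proof of Lemma 4.1.5."""
    _, sub, _ = subdivided_grid(k)
    R = range(1, k + 1)
    bags, tree = {}, set()
    for i, j in product(R, R):
        right = {((r, j), (r, j + 1)) for r in range(1, i + 1)}
        A = (right | {((r, j), (r, j - 1)) for r in range(i, k + 1)}
             | {((i, j), (i + 1, j)), ((i, j), (i - 1, j))})
        B = (right | {((r, j), (r, j - 1)) for r in range(i + 1, k + 1)}
             | {((i, j), (i + 1, j)), ((i + 1, j), (i, j))})
        C = ({((r, j), (r, j - 1)) for r in range(1, i + 1)}
             | {((r, j - 1), (r, j)) for r in range(i, k + 1)})
        # Elements with an index outside [k] are dropped, as in the proof.
        for name, bag in zip("ABC", (A, B, C)):
            bags[name, i, j] = bag & sub
        bags["D", i, j] = {x for x in sub if x[0] == (i, j)} | {(i, j)}
        tree |= {(("A", i, j), ("B", i, j)), (("A", i, j), ("D", i, j))}
        if i < k:
            tree |= {(("C", i, j), ("C", i + 1, j)),
                     (("B", i, j), ("A", i + 1, j))}
    for j in R:
        tree.add((("C", k, j), ("A", 1, j)))
        if j < k:
            tree.add((("B", k, j), ("C", 1, j + 1)))
    return bags, tree

def is_tree_decomposition(vertices, edges, bags, tree):
    """Check: T is a tree, edges are covered, occurrences are connected."""
    nbr = {t: set() for t in bags}
    for s, t in tree:
        nbr[s].add(t)
        nbr[t].add(s)

    def connected(nodes):
        nodes = set(nodes)
        if not nodes:
            return False                    # vertex in no bag at all
        seen, stack = set(), [next(iter(nodes))]
        while stack:
            t = stack.pop()
            if t not in seen:
                seen.add(t)
                stack.extend(nbr[t] & nodes)
        return seen == nodes

    is_tree = len(tree) == len(bags) - 1 and connected(bags)
    covered = all(any(e <= bag for bag in bags.values()) for e in edges)
    coherent = all(connected(t for t in bags if v in bags[t])
                   for v in vertices)
    return is_tree and covered and coherent
```

For k = 3 the largest bag is $A_{2,2}$ with six elements, which matches the stated width k + 2 = 5.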

Lemma 4.1.6. For k ≥ 2 it holds that $\mathrm{tw}(\mathrm{CFI}(G_{k,k})) \le 2k + 5$ and $\mathrm{tw}(\widetilde{\mathrm{CFI}}(G_{k,k})) \le 2k + 5$.

Proof. Fix k ≥ 2 and let (T, β) be the tree decomposition described in Lemma 4.1.5 for the graph $G^+_{k,k}$. Now a tree decomposition (T′, β′) for the graphs $\mathrm{CFI}(G_{k,k})$ and $\widetilde{\mathrm{CFI}}(G_{k,k})$ can be obtained as follows. For each t ∈ V(T) such that $\beta(t) \cap V(G_{k,k}) = \emptyset$ it also holds that t ∈ V(T′) and

$\beta'(t) = \{a(v, w), b(v, w) \mid (v, w) \in \beta(t)\}$.

Note that |β′(t)| = 2 · |β(t)|. Also, for $t_1 t_2 \in E(T)$ where $\beta(t_i) \cap V(G_{k,k}) = \emptyset$ there is an edge $t_1 t_2 \in E(T')$.

Otherwise $|\beta(t) \cap V(G_{k,k})| = 1$ and $\beta(t) = E(v) \cup \{v\}$ for some $v \in V(G_{k,k})$. Also t is a leaf of T and $\beta(s) \cap V(G_{k,k}) = \emptyset$ for the unique s ∈ V(T) with st ∈ E(T). For every A ⊆ E(v) with even cardinality |A| there is a node $t_A \in V(T')$. We define

$\beta'(t_A) = \{m_A\} \cup \{a(v, w), b(v, w) \mid vw \in E(G_{k,k})\}$.

Note that $|\beta'(t_A)| \le 9$ since $\deg_{G_{k,k}}(v) \le 4$ for every $v \in V(G_{k,k})$. Also, there are edges $t_A s \in E(T')$ for every A ⊆ E(v) with even cardinality |A|. It is easy to check that (T′, β′) is a tree decomposition for the graphs $\mathrm{CFI}(G_{k,k})$ and $\widetilde{\mathrm{CFI}}(G_{k,k})$. Also,

$\mathrm{width}(T', \beta') \le \max\{9, 2(\mathrm{width}(T, \beta) + 1)\} - 1 \le \max\{9, 2(k + 3)\} - 1 = 2k + 5$.

Theorem 4.1.7. For every k ≥ 2 there are non-isomorphic graphs $G_k$ and $H_k$ of tree-width at most 2k + 7 such that $G_k \simeq_k H_k$.

Proof. Let $G_k = \mathrm{CFI}(G_{k+1,k+1})$ and $H_k = \widetilde{\mathrm{CFI}}(G_{k+1,k+1})$. Then the statement follows from Theorem 4.1.4 and Lemma 4.1.6.

In combination with Theorem 3.2.3 this result also implies a similar statement for graphs of bounded rank-width.

Corollary 4.1.8. For every k ≥ 2 there are non-isomorphic graphs $G_k$ and $H_k$ of rank-width at most 2k + 8 such that $G_k \simeq_k H_k$.

                     Weisfeiler-Leman dimension
graph class          lower bound    upper bound
trees                1              1
planar graphs        2              3
tree-width k         k/2 − 3        k
genus g              Ω(g)           4g + 3
excluded minor H     Ω(|V(H)|)      f(H)
interval graphs      2              2
clique-width k       k/2 − 6        3k + 4
rank-width k         k/2 − 4        3k + 4

Table 4.1: Upper and lower bounds on the Weisfeiler-Leman dimension of certain graph classes.
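The tree-width entry in the lower-bound column can be read off from Theorem 4.1.7 by a short calculation. The following is a sketch of that arithmetic; the symbols w (for the tree-width parameter) and $\dim_{\mathrm{WL}}$ (for the Weisfeiler-Leman dimension of the class) are introduced here only for illustration, and w ≥ 11 is assumed so that the theorem applies with some k ≥ 2.

```latex
% Choose the largest k with 2k + 7 <= w.
k := \left\lfloor \tfrac{w - 7}{2} \right\rfloor \;\ge\; \tfrac{w - 8}{2},
\qquad 2k + 7 \le w .
% By Theorem 4.1.7 there are non-isomorphic G_k, H_k of tree-width at most w
% with G_k \simeq_k H_k, so dimension k does not suffice for this class.
\dim_{\mathrm{WL}}(\text{tree-width} \le w)
  \;\ge\; k + 1
  \;\ge\; \tfrac{w - 8}{2} + 1
  \;=\; \tfrac{w}{2} - 3 .
```

The same calculation with the bounds 2k + 8 and 2k + 13 from the corresponding corollaries for rank-width and clique-width gives at least w/2 − 3.5 and w/2 − 6, respectively, which is consistent with the table.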

Another graph measure that is briefly mentioned in the previous chapter is clique-width. Since the rank-width of a graph is bounded by its clique-width (see Theorem 3.2.2) all upper bounds stated in the previous chapter for rank-width immediately translate to graphs of bounded clique-width. Of course this raises the question whether a lower bound similar to the ones given above can also be obtained for clique-width. First it can be observed that such a result cannot be obtained directly. Indeed, while the clique-width of a graph is bounded in terms of its tree-width, the clique-width may be exponentially larger than the tree-width of a graph [39]. However, an analysis of the above graphs reveals that similar bounds can still be proven with respect to clique-width.

Proposition 4.1.9. For k ≥ 2 it holds that $\mathrm{cw}(\mathrm{CFI}(G_{k,k})) \le 2k + 11$ and $\mathrm{cw}(\widetilde{\mathrm{CFI}}(G_{k,k})) \le 2k + 11$.

Proof Idea. The strategy is to build t-expressions for the graphs $\mathrm{CFI}(G_{k,k})$ and $\widetilde{\mathrm{CFI}}(G_{k,k})$ from "left to right" similar to the tree decompositions constructed before. All vertices on the "border" of the current step in the construction get distinct colors assigned whereas all remaining vertices are assigned the same color. The number of vertices on the "border" is at most 2k + 2 + 8 which means that in total 2k + 11 colors suffice.

Corollary 4.1.10. For every k ≥ 2 there are non-isomorphic graphs $G_k$ and $H_k$ of clique-width at most 2k + 13 such that $G_k \simeq_k H_k$.

An overview of the upper and lower bounds discussed in this thesis is again given in Table 4.1. Observe that the lower bounds not stated above either directly follow from Theorem 4.1.1 or they follow from the fact that the 6-cycle and the disjoint union of two triangles cannot be distinguished by the Color Refinement algorithm. Of course, as a natural open problem, it would be desirable to further close the gaps between upper and lower bounds for any of these graph classes.
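The statement about the 6-cycle and two disjoint triangles is easy to reproduce. The following Python sketch implements a plain version of Color Refinement on uncolored graphs and compares the resulting color histograms. The function names and the trick of refining both graphs as one disjoint union (so that color names are comparable) are choices made for this illustration, not taken from the thesis.

```python
from collections import Counter

def color_refinement(adj):
    """Stable coloring of Color Refinement on a graph {v: list_of_neighbours}."""
    color = {v: 0 for v in adj}
    while True:
        # New color = old color plus sorted multiset of neighbour colors.
        sig = {v: (color[v], tuple(sorted(color[u] for u in adj[v])))
               for v in adj}
        ids = {s: n for n, s in enumerate(sorted(set(sig.values())))}
        new = {v: ids[sig[v]] for v in adj}
        if len(set(new.values())) == len(set(color.values())):
            return new                      # partition is stable
        color = new

def distinguishes(adj_g, adj_h):
    """True if Color Refinement tells the two graphs apart."""
    union = {("G", v): [("G", u) for u in nb] for v, nb in adj_g.items()}
    union.update({("H", v): [("H", u) for u in nb] for v, nb in adj_h.items()})
    color = color_refinement(union)

    def hist(tag):
        return Counter(c for (t, _), c in color.items() if t == tag)

    return hist("G") != hist("H")

cycle6 = {i: [(i - 1) % 6, (i + 1) % 6] for i in range(6)}
triangles = {i: [j for j in range(3 * (i // 3), 3 * (i // 3) + 3) if j != i]
             for i in range(6)}
path3 = {0: [1], 1: [0, 2], 2: [1]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
```

Since both `cycle6` and `triangles` are 2-regular on six vertices, every vertex keeps the same color and `distinguishes(cycle6, triangles)` is false, whereas `distinguishes(path3, triangle)` is true because the degrees differ.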

4.2 The I/R-Method in Theory

Another combinatorial approach to graph isomorphism testing that works extremely well in practice (see, e.g., [113]) is provided by the individualization-refinement paradigm (see Section 2.3). The goal of this section is to provide a theoretical analysis of this paradigm proving that algorithms in this framework have exponential worst-case complexity. The first analysis of algorithms in this framework was given by Miyazaki [118] in 1995 showing that the then current version of Nauty [112] has exponential worst-case complexity. However, for his analysis, Miyazaki exploited specific implementation details, for example regarding the choice of the cell selector implemented in Nauty. Indeed, as Miyazaki also argues, the graphs constructed to prove his lower bound can be canonized in polynomial time using the I/R framework.

Compared to the results of the last section, a main obstacle for proving lower bounds within the I/R paradigm is the possibility of the algorithms to prune the search tree using automorphisms of the input graphs. Indeed, the Cai-Fürer-Immerman graphs considered above have an exponential number of automorphisms allowing I/R algorithms to prune large parts of the search tree. As a result, I/R algorithms perform reasonably well on Cai-Fürer-Immerman graphs (see [113, 118]). In order to circumvent this problem the proofs of this section are based on a construction of Gurevich and Shelah [75] yielding rigid graphs (i.e., graphs without non-trivial automorphisms) with similar properties. The results presented in this section are also given in [124].

4.2.1 A Framework for a Lower Bound

The goal is to make a comprehensive statement about the worst-case complexity of individualization-refinement algorithms. In particular, the goal is to provide lower bounds that do not depend on the specific choices of the cell selector, the refinement operator, and the node invariant implemented in an I/R algorithm (see Section 2.3). However, there is an intrinsic limitation here. A complete node invariant that distinguishes any two non-isomorphic graphs would yield a polynomial-size search tree. Similarly, a refinement operator that refines every coloring into the orbit partition under the automorphism group of the graph also yields polynomial-size search trees. However, it is not difficult to show that being able to compute either of these functions is at least as hard as the Graph Isomorphism Problem itself. Of course it is nonsensical to allow that an individualization-refinement algorithm uses a subroutine that already solves the Graph Isomorphism Problem. Thus, it becomes apparent that we need to restrict the power of the operators involved in building an I/R algorithm. One possible way to achieve such a restriction is by using the Weisfeiler-Leman algorithm. From a theoretical perspective, the Weisfeiler-Leman algorithm provides a powerful tool which already identifies many different types of graphs (see Chapter 3), while at the same time it fails to decide the Graph Isomorphism Problem in general (see Theorem 4.1.1). Moreover, from a practical point of view, all operators used in practical implementations (e.g., Nauty/Traces [112, 113], Bliss [85, 86], Conauto [105], etc.) are based on the Weisfeiler-Leman algorithm (actually, in most cases, they are based on the Color Refinement algorithm). Hence, any lower bound on the worst-case complexity of I/R algorithms based on operators using the Weisfeiler-Leman algorithm in particular implies the same bound for all the state-of-the-art tools used in practice.
This makes the Weisfeiler-Leman algorithm a natural choice for limiting the power of the cell selector, the refinement operator, and the node invariant implemented in an I/R algorithm. To formalize the restriction on the operators we introduce the notion of k-realizability. Let $(G, \chi_G)$, $(H, \chi_H)$ be two colored graphs. A cell selector sel is k-realizable if $\mathrm{sel}(G, \chi_G) = \mathrm{sel}(H, \chi_H)$ whenever $(G, \chi_G) \simeq_k (H, \chi_H)$. Also let $\bar v \in V(G)^{\ell}$ and $\bar w \in V(H)^{\ell}$. A node invariant inv is k-realizable if $\mathrm{inv}(G, \chi_G, \bar v) = \mathrm{inv}(H, \chi_H, \bar w)$ whenever $(G, \chi_G, \bar v) \simeq_k (H, \chi_H, \bar w)$. Intuitively, this means that whenever the k-dimensional Weisfeiler-Leman algorithm cannot distinguish between the graphs associated with two nodes of the refinement tree, then the cell selector and the node invariant have to behave in the same way on both nodes. Finally, a refinement operator ref is k-realizable¹ if, for all colored graphs $(G, \chi_G)$ and $(H, \chi_H)$, and all $v \in V(G)$, $w \in V(H)$, it holds that

\[
\chi^{k}_{\mathrm{WL}}[G, \chi_G](v, \dots, v) = \chi^{k}_{\mathrm{WL}}[H, \chi_H](w, \dots, w)
\;\Rightarrow\;
\chi^{k}_{\mathrm{WL}}[G, \mathrm{ref}(G, \chi_G)](v, \dots, v) = \chi^{k}_{\mathrm{WL}}[H, \mathrm{ref}(H, \chi_H)](w, \dots, w).
\]

Observe that, in particular, this implies that $(G, \mathrm{ref}(G, \chi_G)) \simeq_k (H, \mathrm{ref}(H, \chi_H))$ if $(G, \chi_G) \simeq_k (H, \chi_H)$. For the remainder of this section we restrict our attention to I/R algorithms implementing k-realizable operators for some fixed number k. Actually, we only use the k-realizability of the operators through the following lemma.

Lemma 4.2.1. Suppose $k \in \mathbb{N}$ and let sel be a k-realizable cell selector, inv a k-realizable node invariant and ref a k-realizable refinement operator. Furthermore, let $(G, \chi_G)$ be a colored graph and suppose $\bar v \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$. Let $m = |\bar v|$. Then $\bar w \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$ for every $\bar w \in V(G)^m$ such that $(G, \chi_G, \bar v) \simeq_k (G, \chi_G, \bar w)$.

Proof. Suppose $\bar v = (v_1, \dots, v_m)$ and $\bar w = (w_1, \dots, w_m)$. The statement is proved by induction on $m \in \mathbb{N}$. For $m = 0$ the statement trivially holds since $\bar v = \bar w = \varepsilon$. So suppose $m > 0$. Let $\bar v' := (v_1, \dots, v_{m-1})$ be the tuple obtained from $\bar v$ by deleting the last entry and similarly define $\bar w' := (w_1, \dots, w_{m-1})$. Clearly, $(G, \chi_G, \bar v') \simeq_k (G, \chi_G, \bar w')$ and $\bar v' \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$. So by the induction hypothesis it follows that $\bar w' \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$. Let $\chi_1 := \mathrm{ref}(G, \chi_G, \bar v')$ and $\chi_2 := \mathrm{ref}(G, \chi_G, \bar w')$. Then $(G, \chi_1) \simeq_k (G, \chi_2)$ since ref is k-realizable. This implies that $\mathrm{sel}(G, \chi_1) = \mathrm{sel}(G, \chi_2) =: i$ because sel is k-realizable. Note that $v_m \in \chi_1^{-1}(i)$. Since $(G, \chi_G, \bar v) \simeq_k (G, \chi_G, \bar w)$ it holds that $\chi^{k}_{\mathrm{WL}}[G, \chi_G, \bar v'](v_m, \dots, v_m) = \chi^{k}_{\mathrm{WL}}[G, \chi_G, \bar w'](w_m, \dots, w_m)$. Hence, $i = \chi_1(v_m) = \chi_2(w_m)$ because ref is k-realizable. Thus, $\bar w \in V(T^{\mathrm{ref},\mathrm{sel}}[G, \chi_G])$. Furthermore, $\mathrm{inv}(G, \chi_G, \bar v) = \mathrm{inv}(G, \chi_G, \bar w)$, which implies that $\bar w \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$.

In order to provide lower bounds on the worst-case complexity of I/R algorithms using k-realizable operators we construct rigid colored graphs $(G, \chi_G)$ whose search tree $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G]$ is exponentially large. Since the graphs are rigid, no automorphism pruning can speed up the algorithm (see Proposition 2.3.1). In the light of the above lemma, to prove that the tree $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G]$ is exponentially large, it suffices to find a node $\bar v \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G])$ with an exponential number of equivalent tuples. To argue the existence of these equivalent tuples we roughly proceed in two steps. First, we show that the depth of the search tree is linear, i.e., to obtain a discrete partition one has to individualize a linear number of vertices. This means there is a node $\bar v$ in the search tree $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G, \chi_G]$ such that $|\bar v|$ is linear in the number of vertices of G. Then, in a second step, we show that if $|\bar v|$ is sufficiently large, there are exponentially many equivalent tuples. To find such equivalent tuples we prove a limitation of the effect of the k-dimensional Weisfeiler-Leman algorithm after individualizing the vertices from $\bar v$. Intuitively, we identify a subgraph containing $\bar v$ which encapsulates the effect of the Weisfeiler-Leman algorithm. This subgraph has an exponential number of automorphisms, which enables us to find the desired number of equivalent tuples.

4.2.2 The Multipede Construction

The proof of the lower bound is based on the construction of graphs R(G) defined for a bipartite base graph G. This construction is a combination of the Cai-Fürer-Immerman construction and a related construction of Gurevich and Shelah of multipedes [75] yielding rigid structures with similar properties.

¹ The definition provided in the original work [124] is not sufficient to prove exponential lower bounds on the size of the search tree of an I/R algorithm. To resolve this error a slightly stronger notion of k-realizability is required.

Figure 4.3: The figure depicts a base graph G on the top (vertex classes W = {w1, ..., w6} and V = {v1, v2, v3}) and the corresponding multipede graph R(G) on the bottom (with the pairs a(w1), b(w1), ..., a(w6), b(w6)).

Let G = (V, W, E) be a bipartite graph where deg(v) ≥ 2 for all v ∈ V. The multipede graph R(G) is defined as follows. Each vertex w ∈ W is replaced by two vertices a(w) and b(w). Also, each v ∈ V is replaced by the CFI gadget $X_{N(v)}$ (see Section 4.1). The middle vertices of the CFI gadget are denoted by $m_A(v)$ for A ⊆ N(v) with |A| even. More formally,

\[
V(R(G)) := \{a(w), b(w) \mid w \in W\} \cup \{m_A(v) \mid v \in V, A \subseteq N(v), |A| \text{ even}\}
\]
and
\[
E(R(G)) := \{a(w)m_A(v) \mid w \in A\} \cup \{b(w)m_A(v) \mid w \in N(v) \setminus A\}.
\]
For each $w \in W$ denote $F(w) := \{a(w), b(w)\}$² and for $X \subseteq W$ define $F(X) := \bigcup_{w \in X} F(w)$. Also, for $v \in V$ denote $M(v) := \{m_A(v) \mid A \subseteq N(v), |A| \text{ even}\}$ and for $Y \subseteq V$ define $M(Y) := \bigcup_{v \in Y} M(v)$. The vertices of the graph R(G) are colored in such a way that F(w) forms a color class for all $w \in W$ and, moreover, M(v) forms a color class for all $v \in V$. Formally, the coloring $\chi_{R(G)} \colon V(R(G)) \to C$ of the graph R(G) can be defined as
\[
\chi_{R(G)}(u) :=
\begin{cases}
w & \text{if } u \in F(w) \\
v & \text{if } u \in M(v).
\end{cases}
\]
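The definition above can be made concrete with a short Python sketch. The function name `multipede`, the tuple encoding of the vertices, and the neighborhood dictionary `N` are illustrative assumptions and not notation from the text; the sketch only mirrors the vertex set, edge set, and coloring defined above.

```python
from itertools import combinations

def multipede(V, W, N):
    """Sketch of the colored multipede graph R(G) for a bipartite base graph.

    V, W : lists of (hashable, sortable) vertex names of the two sides of G.
    N    : dict mapping each v in V to the collection of its neighbors in W.
    Returns (vertices, edges, color); the encoding of vertices as tuples
    ("a", w), ("b", w), ("m", v, A) is a choice made for this illustration.
    """
    vertices, edges, color = [], set(), {}
    for w in W:
        for foot in (("a", w), ("b", w)):
            vertices.append(foot)
            color[foot] = ("F", w)          # F(w) = {a(w), b(w)} is one color class
    for v in V:
        nbrs = sorted(N[v])
        for size in range(0, len(nbrs) + 1, 2):      # only subsets A of even size
            for A in combinations(nbrs, size):
                m = ("m", v, frozenset(A))
                vertices.append(m)
                color[m] = ("M", v)         # M(v) is one color class
                for w in nbrs:
                    # m_A(v) is adjacent to a(w) if w is in A, and to b(w) otherwise
                    foot = ("a", w) if w in A else ("b", w)
                    edges.add(frozenset((foot, m)))
    return vertices, edges, color
```

For a base graph in which every v ∈ V has degree r, the sketch produces 2|W| + 2^(r-1)|V| vertices, since a set of size r has 2^(r-1) subsets of even size.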

An example of this construction is shown in Figure 4.3. For $I \subseteq W$ we further define the graph $R^I(G)$ similar to R(G) but refine the coloring so that for each $w \in I$ both $\{a(w)\}$ and $\{b(w)\}$ form a color class. Hence, $R(G) = R^{\emptyset}(G)$.

Note that the multipede construction is closely related to the Cai-Fürer-Immerman construction discussed in the previous section. Indeed, let H be an arbitrary graph and let G be the bipartite graph obtained from H by subdividing each edge exactly once (i.e., each edge $e \in E(H)$ is replaced by a new vertex that is connected to the two endpoints of the edge e). Then, the only difference between CFI(H) and R(G) is that in CFI(H) there are two pairs of vertices $(a(v, w), b(v, w))$ and $(a(w, v), b(w, v))$ whereas in R(G) there is only one pair $(a(vw), b(vw))$. However, this does not change any of the relevant properties of the graph CFI(H). In this way, the multipede construction can be seen as a generalization of the Cai-Fürer-Immerman construction.

² This notation is inspired by the work of Gurevich and Shelah [75] where a(w) and b(w) are called the feet of w.

Recall that, in order for the I/R algorithm to be unable to exploit automorphisms of the input graphs, the graph R(G) is supposed to be rigid. We start by identifying properties of G that correspond to R(G) having few automorphisms.

Definition 4.2.2 (Odd Graphs, Gurevich and Shelah [75]). Let G = (V, W, E) be a bipartite graph. We say G is odd if for every $\emptyset \neq X \subseteq W$ there exists some $v \in V$ such that $|N(v) \cap X|$ is odd.

Lemma 4.2.3. Let G = (V, W, E) be an odd bipartite graph. Then R(G) is rigid.

Proof. Let $\gamma \in \mathrm{Aut}(R(G))$ be an automorphism of the graph R(G). Due to the coloring of the vertices, the permutation $\gamma$ maps every set F(w) for $w \in W$ and every set M(v) for $v \in V$ to itself. Consider the set $X := \{w \in W \mid \gamma(a(w)) = b(w)\}$. Suppose towards a contradiction that $X \neq \emptyset$. Since G is odd there is some $v \in V$ such that $|N(v) \cap X|$ is odd. Then $\gamma$ restricts to an automorphism of the gadget $X_{N(v)}$ swapping an odd number of the outer pairs. This contradicts the properties of CFI gadgets (cf. Lemma 4.1.2). So $X = \emptyset$ and thus $\gamma(a(w)) = a(w)$ for all $w \in W$. From this it easily follows that $\gamma$ is the identity mapping (cf. Lemma 4.1.2).

Remark 4.2.4. As indicated above, for each Cai-Fürer-Immerman graph there is a corresponding multipede graph R(G) for a base graph G = (V, W, E) where deg(w) = 2 for every $w \in W$. Observe that such a graph G cannot be odd. Indeed, viewing the set W as the edge set of a graph H defined on vertex set V, the edge set $X \subseteq W$ of an arbitrary cycle provides a witness for G not being odd. Hence, the generalization from Cai-Fürer-Immerman graphs to multipede graphs in particular allows for the construction of rigid graphs.

Actually, in our proof, we consider graphs G such that R(G) is not rigid, but only has few automorphisms. In order to turn R(G) into a rigid graph we individualize a small set of vertices by considering the graph $R^I(G)$ for a small set I. It turns out that the number of automorphisms of R(G) and the number of vertices that need to be individualized can be computed from the rank of the adjacency matrix of G.

Recall that for a graph G we denote by $A_G \in \mathbb{F}_2^{n \times n}$ its adjacency matrix. For a bipartite graph G = (V, W, E) let $A_G^{*} := A_G[V, W] \in \mathbb{F}_2^{V \times W}$ be the submatrix with rows from V and columns from W. Also recall that $\mathrm{rk}_2(A)$ denotes the $\mathbb{F}_2$-rank of a matrix A. Finally, we denote by $A^T$ the transpose of a matrix A.

Lemma 4.2.5. Let G be a bipartite graph. Then $|\mathrm{Aut}(R(G))| = 2^{|W| - \mathrm{rk}_2(A_G^{*})}$.

Proof. Suppose G = (V, W, E). For a matrix $A \in \mathbb{F}_2^{m \times n}$ denote $\mathrm{Sol}(A) := \{x \in \mathbb{F}_2^{n} \mid Ax = 0\}$. To show the lemma it suffices to argue that $|\mathrm{Aut}(R(G))| = |\mathrm{Sol}(A_G^{*})|$.

For $\gamma \in \mathrm{Aut}(R(G))$ we define the vector $x_\gamma \in \mathbb{F}_2^{W}$ by setting $(x_\gamma)_w = 1$ if and only if $\gamma(a(w)) = b(w)$. Observe that the mapping $\gamma \mapsto x_\gamma$ is injective. Furthermore, $A_G^{*} x_\gamma = 0$ since for each $v \in V$ the automorphism $\gamma$ swaps an even number of neighbors of v.

For the backward direction let $x \in \mathrm{Sol}(A_G^{*})$. Then, for each $v \in V$, the set $\{w \in N(v) \mid x_w = 1\}$ has even cardinality. Thus, by the properties of the CFI gadgets (cf. Lemma 4.1.2), there is a unique automorphism $\gamma \in \mathrm{Aut}(R(G))$ that swaps exactly those pairs $(a(w), b(w))$ for which $x_w = 1$.

In particular, the arguments show that a bipartite graph G = (V, W, E) is odd if and only if $\mathrm{rk}_2(A_G^{*}) = |W|$.
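The rank criterion can be checked on small instances with the following Python sketch. It is only an illustration: the helper names `rank_f2`, `base_matrix_rows`, and `is_odd_bruteforce` are hypothetical, rows of the matrix are encoded as integer bitmasks, and the brute-force oddness test enumerates all non-empty subsets of W, so it is only usable for very small graphs.

```python
from itertools import combinations

def rank_f2(rows):
    """Rank over F_2 of a matrix whose rows are given as integer bitmasks."""
    basis = []
    for row in rows:
        for b in basis:
            # row ^ b is smaller than row exactly if row has b's leading bit set
            row = min(row, row ^ b)
        if row:
            basis.append(row)
    return len(basis)

def base_matrix_rows(V, W, N):
    """Rows of the V x W biadjacency matrix of G, encoded as bitmasks over W."""
    index = {w: i for i, w in enumerate(W)}
    return [sum(1 << index[w] for w in N[v]) for v in V]

def is_odd_bruteforce(V, W, N):
    """Definition 4.2.2 checked directly (exponential in |W|)."""
    for size in range(1, len(W) + 1):
        for X in combinations(W, size):
            Xs = set(X)
            if all(len(set(N[v]) & Xs) % 2 == 0 for v in V):
                return False     # X witnesses that G is not odd
    return True

# A base graph that is odd: the three rows are linearly independent over F_2.
V1, W1 = ["v1", "v2", "v3"], ["w1", "w2", "w3"]
N1 = {"v1": {"w1", "w2"}, "v2": {"w2", "w3"}, "v3": {"w1", "w2", "w3"}}

# The subdivided triangle (a CFI-type base graph, cf. Remark 4.2.4): not odd.
V2, W2 = ["1", "2", "3"], ["e12", "e23", "e13"]
N2 = {"1": {"e12", "e13"}, "2": {"e12", "e23"}, "3": {"e23", "e13"}}

def log2_aut(V, W, N):
    """log_2 of |Aut(R(G))| according to Lemma 4.2.5."""
    return len(W) - rank_f2(base_matrix_rows(V, W, N))
```

On the second example the cycle formed by all three edges is the witness from Remark 4.2.4, and the rank formula gives exactly one non-trivial automorphism of R(G).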

Corollary 4.2.6. Let G = (V, W, E) be an odd bipartite graph. Then there is some $V' \subseteq V$ with $|V'| \leq |W|$ such that the induced subgraph $G[V' \cup W]$ is odd.

Lemma 4.2.7. Let G = (V, W, E) be a bipartite graph. Then there is $I \subseteq W$ with $|I| \leq |W| - \mathrm{rk}_2(A_G^{*})$ such that $R^I(G)$ is rigid.

Proof. Let $B = \{e_w \in \mathbb{F}_2^{W} \mid w \in W\}$ be the standard basis for $\mathbb{F}_2^{W}$ (that is, $(e_w)_u = 1$ if and only if $w = u$). Furthermore, for $v \in V$, let $(A_G^{*})_v$ be the v-th row of $A_G^{*}$ and let $B_I \subseteq B$ be a minimal subset of B such that $B_I \cup \{((A_G^{*})_v)^T \mid v \in V\}$ spans the entire space $\mathbb{F}_2^{W}$. Finally, let $I = \{w \in W \mid e_w \in B_I\}$. Clearly, $|I| \leq |W| - \mathrm{rk}_2(A_G^{*})$.

It remains to argue that $R^I(G)$ is rigid. Let $\gamma \in \mathrm{Aut}(R^I(G))$ and let $x_\gamma \in \mathbb{F}_2^{W}$ be the vector obtained by setting $(x_\gamma)_w = 1$ if and only if $\gamma(a(w)) = b(w)$. Then $(e_w)^T x_\gamma = 0$ for all $w \in I$. Furthermore, $(A_G^{*})_v x_\gamma = 0$ for all $v \in V$ by the same argument as in the proof of Lemma 4.2.5. Since $B_I \cup \{((A_G^{*})_v)^T \mid v \in V\}$ spans the entire space $\mathbb{F}_2^{W}$ it follows by standard linear algebra arguments that $x_\gamma = 0$. Thus $\gamma$ is the identity mapping.
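The choice of the set I in the proof of Lemma 4.2.7 can be sketched as a greedy basis completion over F_2. The function name `rigidifying_set` and the bitmask encoding are assumptions made for this illustration; the sketch first inserts the rows of the biadjacency matrix into an XOR basis and then adds exactly those standard basis vectors that are not yet in the span.

```python
def rigidifying_set(V, W, N):
    """Greedy sketch of a set I as in Lemma 4.2.7.

    V, W : lists of vertex names; N maps each v in V to its neighbors in W.
    Returns a list I of vertices from W such that the rows of the biadjacency
    matrix together with the unit vectors e_w (w in I) span F_2^W.
    """
    index = {w: i for i, w in enumerate(W)}
    basis = []

    def try_add(vec):
        # Reduce vec against the current basis; keep it if it is independent.
        for b in basis:
            vec = min(vec, vec ^ b)
        if vec:
            basis.append(vec)
            return True
        return False

    for v in V:
        try_add(sum(1 << index[w] for w in N[v]))
    # Every unit vector that is added raises the rank by one, hence
    # |I| = |W| - rk_2 of the biadjacency matrix.
    return [w for w in W if try_add(1 << index[w])]

# Subdivided triangle: rank 2, so a single individualized pair suffices.
V_tri, W_tri = ["1", "2", "3"], ["e12", "e23", "e13"]
N_tri = {"1": {"e12", "e13"}, "2": {"e12", "e23"}, "3": {"e23", "e13"}}
I_tri = rigidifying_set(V_tri, W_tri, N_tri)
```

Which particular vertices end up in I depends on the order of W; only the size of I is determined by the rank.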

4.2.3 The Weisfeiler-Leman Refinement and Closure Operators

Recall that, following the strategy outlined below Lemma 4.2.1, the goal is to find many tuples of vertices $\bar v \in V(R(G))^m$ such that individualizing these vertices results in equivalent graphs (with respect to the k-dimensional Weisfeiler-Leman algorithm). Towards this end, we define a closure operator that bounds the effect of the Weisfeiler-Leman refinement after individualizing the vertices from $\bar v$.

Definition 4.2.8 (d-Closure). Let $d \in \mathbb{N}$ and let G = (V, W, E) be a bipartite graph. For $X \subseteq W$ define the d-attractor of X as
\[
\mathrm{attr}_d(X) = X \cup \bigcup_{v \in V \colon |N(v) \setminus X| \leq d} N(v).
\]
A set $X \subseteq W$ is d-closed if $X = \mathrm{attr}_d(X)$. The d-closure of X is the unique minimal superset which is d-closed, that is
\[
\mathrm{cl}^{d}_{G}(X) = \bigcap_{X' \supseteq X,\; X' \text{ is } d\text{-closed}} X'.
\]
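Since the d-attractor is monotone, the d-closure can be computed by iterating it until nothing changes. The following Python sketch does exactly this; the function name `d_closure` and the dictionary `N` of neighborhoods are illustrative assumptions.

```python
def d_closure(X, V, N, d):
    """Sketch: compute the d-closure of X by iterating the d-attractor.

    X : iterable of vertices from W.
    V : iterable of vertices of the other side of the bipartite graph.
    N : dict mapping each v in V to the collection of its neighbors in W.
    """
    closed = set(X)
    changed = True
    while changed:
        changed = False
        for v in V:
            outside = set(N[v]) - closed
            # v attracts its whole neighborhood once at most d neighbors are missing
            if outside and len(outside) <= d:
                closed |= outside
                changed = True
    return closed

# Small example: N(v1) = {1, 2, 3}, N(v2) = {3, 4, 5, 6}.
V_ex = ["v1", "v2"]
N_ex = {"v1": {1, 2, 3}, "v2": {3, 4, 5, 6}}
```

In the example, the 1-closure of {1, 2} only absorbs the vertex 3, whereas the 3-closure of the same set absorbs N(v2) as well.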

As observed in [75] the 1-closure describes the information the 1-dimensional Weisfeiler-Leman captures.

Lemma 4.2.9 (Gurevich, Shelah [75]). Let G = (V, W, E) be a bipartite graph and suppose $I \subseteq W$. Then
\[
\chi^{1}_{\mathrm{WL}}[R^I(G)](a(w)) \neq \chi^{1}_{\mathrm{WL}}[R^I(G)](b(w)) \;\Leftrightarrow\; w \in \mathrm{cl}^{1}_{G}(I)
\]
for all $w \in W$.

Let G = (V, W, E) be a bipartite graph. Slightly abusing notation, for a set $X \subseteq W$ define $N^{-1}(X) := \{v \in V \mid N(v) \subseteq X\}$. For $X \subseteq W$ define $R(G)[[X]] := R(G)[F(X) \cup M(N^{-1}(X))]$. The last lemma can be used to show that for a 1-closed set $X \subseteq W$ and a sequence of vertices $\bar x = (x_1, \dots, x_m) \in F(X)^m$, for every automorphism $\varphi \in \mathrm{Aut}(R(G)[[X]])$ it holds that $(R(G), \bar x) \simeq_1 (R(G), \varphi(\bar x))$. The 1-closure thus gives us a method to find tuples that cannot be distinguished by the 1-dimensional Weisfeiler-Leman algorithm. However, we require such a statement also for higher dimensions. Obtaining a similar statement characterizing the effect of the k-dimensional Weisfeiler-Leman algorithm seems to be much more intricate, and it is easy to see that the d-closure does not achieve this. However, under some additional assumptions, it still allows us to bound the effect of the k-dimensional Weisfeiler-Leman algorithm, which is sufficient for our purposes.

Lemma 4.2.10. Let $k, d \in \mathbb{N}$ and suppose $d \geq k$. Let G = (V, W, E) be a bipartite graph and $X = \{w_1, \dots, w_m\} \subseteq W$ be a d-closed set. Furthermore suppose that
\[
\Big| N(v) \cap \Big( \bigcup_{i \in [k]} N(v_i) \Big) \Big| \leq d - k
\]
for all distinct $v, v_1, \dots, v_k \in V$. Let $\bar x = (x_1, \dots, x_m)$ be a sequence of vertices with $x_i \in F(w_i)$ and let $\varphi \in \mathrm{Aut}(R(G)[[X]])$. Then $(R(G), \bar x) \simeq_k (R(G), \varphi(\bar x))$.

Proof. By Corollary 2.2.7 it suffices to prove that Duplicator has a winning strategy in the bijective (k + 1)-pebble game $\mathrm{BP}_{k+1}((R(G), \bar x), (R(G), \varphi(\bar x)))$ played on $(R(G), \bar x)$ and $(R(G), \varphi(\bar x))$. Towards this end we say a vertex $v \in V$ (respectively $w \in W$) is pebbled if there exists $a \in M(v)$ (respectively $a \in F(w)$) which is pebbled. Furthermore we say that a vertex $w \in W$ is fixed if there is some pebbled $v \in N(w)$ (note that $v \in V$ in this case). For a tuple $\bar a \in V(R(G))^{\leq k}$ of length at most k of pebbled vertices let
\[
\mathrm{cl}(\bar a) = F(X) \cup M(N^{-1}(X)) \cup \bigcup \{M(v) \mid v \in V \colon v \text{ is pebbled}\} \cup \bigcup \{F(w) \mid w \in W \colon w \text{ is pebbled or fixed}\}.
\]
Now during the play Duplicator preserves the following invariant for positions $(\bar a, \bar b) \in V(R(G))^{\ell} \times V(R(G))^{\ell}$ where $\ell \leq k + 1$.

(I) There is an isomorphism $\alpha \colon R(G)[\mathrm{cl}(\bar a)] \cong R(G)[\mathrm{cl}(\bar b)]$ such that $\alpha(\bar x) = \varphi(\bar x)$ and $\alpha(\bar a) = \bar b$.

Observe that $\alpha$ extends $\varphi$, that is, $\alpha(u) = \varphi(u)$ for all $u \in V(R(G)[[X]])$. Initially, the invariant holds for $\ell = 0$ by choosing $\alpha = \varphi$. So suppose the current position of the game is $(\bar a, \bar b) \in V(R(G))^{\ell} \times V(R(G))^{\ell}$ where $\ell \leq k + 1$ such that the invariant (I) is satisfied. If Spoiler decides to remove a pair of pebbles, the invariant is clearly preserved by simply restricting the mapping $\alpha$ accordingly. So assume Spoiler wishes to play a new pair of pebbles. In this case $\ell \leq k$.

Claim 1. For every unpebbled $v \in V$ with $N(v) \not\subseteq X$ there is some $w \in N(v) \setminus X$ which is neither pebbled nor fixed.

Proof. Consider the set $N(v) \setminus X$. Since X is d-closed it holds that $|N(v) \setminus X| \geq d + 1$. By the assumption of the lemma there are at most $d - k$ elements in N(v) that are fixed. Thus, $N(v) \setminus X$ contains at least $k + 1$ elements which are not fixed. Furthermore, there are at most k vertices in N(v) that are pebbled. Thus there is at least one element that is neither pebbled nor fixed. ⌟

For each unpebbled $v \in V$ with $N(v) \not\subseteq X$ choose a vertex $w_v \in N(v) \setminus X$ that is neither pebbled nor fixed. Furthermore let $T = \{w \in W \mid F(w) \subseteq \mathrm{cl}(\bar a) \wedge \alpha(a(w)) = b(w)\}$. For every $a \in M(V)$ define
\[
B_a =
\begin{cases}
A \,\triangle\, (T \cap N(v)) & \text{if } |T \cap N(v)| \text{ even} \\
A \,\triangle\, (T \cap N(v)) \,\triangle\, \{w_v\} & \text{otherwise}
\end{cases}
\]
where $v \in V$ is the vertex with $a \in M(v)$, $A \subseteq N(v)$ is the set with $m_A(v) = a$, and $\triangle$ denotes the symmetric difference. Now Duplicator plays the bijection
\[
f \colon V(R(G)) \to V(R(G)) \colon a \mapsto
\begin{cases}
\alpha(a) & \text{if } a \in \mathrm{cl}(\bar a) \\
a & \text{if } a \in F(W) \setminus \mathrm{cl}(\bar a) \\
m_{B_a}(v) & \text{if } a \in M(V) \setminus \mathrm{cl}(\bar a).
\end{cases}
\]
Let $a \in V(R(G))$ be the vertex chosen by Spoiler.

Claim 2. There is an isomorphism $\alpha' \colon R(G)[\mathrm{cl}(\bar a, a)] \cong R(G)[\mathrm{cl}(\bar b, f(a))]$ such that $\alpha'(\bar x) = \varphi(\bar x)$ and $\alpha'(\bar a, a) = (\bar b, f(a))$.

Proof. For every $u \in \mathrm{cl}(\bar a)$ define $\alpha'(u) := \alpha(u)$. Consider the following distinction into three cases.

Case $a \in \mathrm{cl}(\bar a)$: In this case the claim immediately follows from Condition (I) since $f(a) = \alpha(a)$.

Case $a \in F(W) \setminus \mathrm{cl}(\bar a)$: Let $w \in W$ such that $a \in F(w)$. Then $\mathrm{cl}(\bar a, a) = \mathrm{cl}(\bar a) \cup \{a(w), b(w)\}$. In order to complete the definition of the function $\alpha'$ set $\alpha'(a(w)) = a(w)$ and $\alpha'(b(w)) = b(w)$. Since $F(w) \cap \mathrm{cl}(\bar a) = \emptyset$ it follows that $\mathrm{cl}(\bar a) \cap M(v) = \emptyset$ for all $v \in N(w)$ (otherwise w would be fixed). Hence, $\alpha' \colon R(G)[\mathrm{cl}(\bar a, a)] \cong R(G)[\mathrm{cl}(\bar b, f(a))]$. The other two conditions are clearly satisfied.

Case $a \in M(V) \setminus \mathrm{cl}(\bar a)$: Let $v \in V$ such that $a \in M(v)$. Then
\[
\mathrm{cl}(\bar a, a) = \mathrm{cl}(\bar a) \cup M(v) \cup \bigcup_{w \in N(v)} F(w).
\]
In order to complete the definition of $\alpha'$ set

(i) $\alpha'(a(w)) = a(w)$ and $\alpha'(b(w)) = b(w)$ for all $w \in N(v)$ such that $w \neq w_v$ and $F(w) \cap \mathrm{cl}(\bar a) = \emptyset$,

(ii) $\alpha'(m_A(v)) = m_B(v)$ where
\[
B =
\begin{cases}
A \,\triangle\, (T \cap N(v)) & \text{if } |T \cap N(v)| \text{ even} \\
A \,\triangle\, (T \cap N(v)) \,\triangle\, \{w_v\} & \text{otherwise,}
\end{cases}
\]
and

(iii) $\alpha'(a(w_v)) = a(w_v)$ and $\alpha'(b(w_v)) = b(w_v)$ if $|T \cap N(v)|$ is even, and $\alpha'(a(w_v)) = b(w_v)$ and $\alpha'(b(w_v)) = a(w_v)$ otherwise.

Note that $\alpha'$ is defined consistently because $w_v$ is neither pebbled nor fixed and therefore $F(w_v) \cap \mathrm{cl}(\bar a) = \emptyset$. It is easy to verify that $\alpha' \colon R(G)[\mathrm{cl}(\bar a, a)] \cong R(G)[\mathrm{cl}(\bar b, f(a))]$. The other two conditions are clearly satisfied. ⌟

This completes the proof since Duplicator does not lose the game in a position that satisfies (I).

4.2.4 Meager Graphs

Searching for graphs where applying the last lemma gives the desired results, we generalize the notion of an $\ell$-meager graph from [75].

Definition 4.2.11 (Meager Graphs). Let G = (V, W, E) be a bipartite graph and let $0 < \alpha < 1$. The graph G is $(\ell, \alpha)$-meager if for every $\emptyset \neq X \subseteq W$ with $|X| \leq \ell$ it holds that
\[
|N^{-1}(X)| < \alpha |X|.
\]
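For very small base graphs, Definition 4.2.11 can be checked by exhaustive enumeration, which may help to build intuition for the parameters. The sketch below is a brute-force illustration only (its running time is exponential in ℓ); the function name `is_meager` and the argument names are assumptions and not part of the text.

```python
import math
from itertools import combinations

def is_meager(V, W, N, ell, alpha):
    """Brute-force check whether G = (V, W, E) is (ell, alpha)-meager.

    N maps each v in V to the collection of its neighbors in W.
    """
    max_size = min(math.floor(ell), len(W))
    for size in range(1, max_size + 1):
        for X in combinations(W, size):
            Xs = set(X)
            # N^{-1}(X): vertices of V whose whole neighborhood lies inside X
            inside = sum(1 for v in V if set(N[v]) <= Xs)
            if inside >= alpha * size:
                return False
    return True

# Small example: N(v1) = {1, 2, 3}, N(v2) = {3, 4, 5}.
V_m, W_m = ["v1", "v2"], [1, 2, 3, 4, 5]
N_m = {"v1": {1, 2, 3}, "v2": {3, 4, 5}}
```

In the example, the set X = {1, 2, 3} contains the neighborhood of exactly one vertex, so the graph is (3, 0.5)-meager but not (3, 0.3)-meager.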

Meager graphs have two properties that are advantageous. The first property is that for sufficiently small $X \subseteq W$ the graph $R(G)[[X]]$ has many automorphisms. In combination with Lemma 4.2.10 this translates into finding many equivalent tuples as desired.

Lemma 4.2.12. Let G = (V, W, E) be $(\ell, \alpha)$-meager and $X \subseteq W$ with $|X| \leq \ell$. Then
\[
|\mathrm{Aut}(R(G)[[X]])| \geq 2^{(1-\alpha)|X|}.
\]

Proof. By Lemma 4.2.5 for $X \subseteq W$ with $|X| \leq \ell$ we have that
\[
|\mathrm{Aut}(R(G)[[X]])| \geq 2^{|X| - |N^{-1}(X)|} \geq 2^{(1-\alpha)|X|}.
\]

The second advantageous property is that in a meager graph the d-closure of a set X is larger than X itself by only a constant factor.

Lemma 4.2.13. Suppose $d \in \mathbb{N}$ and $d\alpha < 1$. Let G = (V, W, E) be $(\ell, \alpha)$-meager and suppose $\emptyset \neq X \subseteq W$ with $|X| \leq \ell(1 - d\alpha) - d + 1$. Then $|\mathrm{cl}^{d}_{G}(X)| < \frac{1}{1 - d\alpha} |X|$.

Proof. Let $X_0 \subsetneq \dots \subsetneq X_m$ be a sequence of sets such that $X_0 = X$ and $X_{i+1} = X_i \cup N(v_i)$ for some $v_i \in V$ with $|N(v_i) \setminus X_i| \leq d$ and such that $X_m$ is d-closed. Clearly, for every $i \in [m]$ it holds that $|N^{-1}(X_i)| \geq i$. Suppose that $m \geq \frac{\alpha}{1 - d\alpha} |X|$ and set $j = \lceil \frac{\alpha}{1 - d\alpha} |X| \rceil$. Then
\[
|X_j| \leq |X| + dj
\leq \lfloor \ell(1 - d\alpha) \rfloor - d + 1 + d \Big\lceil \frac{\alpha}{1 - d\alpha}\, \ell (1 - d\alpha) \Big\rceil
\leq \lfloor \ell(1 - d\alpha) \rfloor - d + 1 + d \lceil \ell\alpha \rceil
\leq \ell + \lfloor -\ell d\alpha \rfloor - d + 1 + \lceil \ell d\alpha \rceil + d - 1 = \ell.
\]
Hence the meagerness is applicable to $X_j$. This means $j \leq |N^{-1}(X_j)| < \alpha |X_j| \leq \alpha(|X| + dj)$, implying $j < \lceil \frac{\alpha}{1 - d\alpha} |X| \rceil$. But this contradicts the definition of j. So $m < \frac{\alpha}{1 - d\alpha} |X|$ and thus $|\mathrm{cl}^{d}_{G}(X)| = |X_m| \leq |X| + dm < \frac{1}{1 - d\alpha} |X|$.

We now concern ourselves with the existence of meager graphs. However, we require several additional properties. Indeed, in the light of Lemma 4.2.10 certain neighborhoods should be almost disjoint. Also, the graph R(G) is supposed to have only few automorphisms, which by Lemma 4.2.7 translates into the matrix $A_G^{*}$ having large rank.

Theorem 4.2.14. There exists $r_0 \in \mathbb{N}$ such that for every $r \in \mathbb{N}$ with $r \geq r_0$ and every sufficiently large $n \in \mathbb{N}$ there is an $\left(\frac{n}{10r}, \frac{3}{r}\right)$-meager graph G = (V, W, E) with

(I) $|V| = |W| = n$,

(II) $\deg(v) = r$ for all $v \in V$,

(III) $|N(v_1) \cap N(v_2)| < 3$ for all distinct $v_1, v_2 \in V$, and

(IV) $\mathrm{rk}_2(A_G^{*}) \geq (1 - 2^{-r}) n$.

The proof of the theorem is based on the fact that bipartite expander graphs are meager.

Definition 4.2.15. Let G = (V, W, E) be a bipartite graph with $|V| \geq |W|$. We call G a $(\gamma, \beta)$-expander if for every $Y \subseteq V$ with $|Y| \leq \gamma |V|$ it holds that $|N(Y)| \geq \beta |Y|$.

A typical method to obtain bipartite expanders is by considering the following random process. Let r ∈ N be a fixed number such that r ≥ 3. Given (disjoint) vertex sets V and W with |W| ≥ 4r and n := |V| = |W| one obtains a bipartite graph G = (V, W, E) by choosing independently and uniformly at random, for every v ∈ V, a set of r distinct neighbors in W. I refer to [146, Section 4] and [121, Chapter 5.3] for background on expander graphs, including variants of the following lemma.

Lemma 4.2.16. For r sufficiently large it holds that
\[
\Pr\left( G \text{ is a } \left(\tfrac{1}{10r}, \tfrac{r}{2}\right)\text{-expander} \right) \geq \frac{8}{9}.
\]

A complete proof of this lemma is given in [124]. For the proof of Theorem 4.2.14 it remains to argue that the graphs obtained from the random process described above also satisfy Conditions (III) and (IV) with high probability.

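Lemma 4.2.17. It holds that
\[
\lim_{n \to \infty} \Pr\left( \exists v_1, v_2 \in V \colon v_1 \neq v_2 \wedge |N(v_1) \cap N(v_2)| \geq 3 \right) = 0.
\]

Proof. Let $r \geq 3$ be a fixed constant. Let G = (V, W, E) be a bipartite graph obtained from the random process described above and let $n := |V| = |W|$. Let $p_n$ denote the probability that there are distinct $v_1, v_2 \in V$ such that $|N(v_1) \cap N(v_2)| \geq 3$. Since r is a fixed number there exists a constant $c_1 > 0$ such that
\[
\binom{n}{r-3} \leq c_1 \cdot n^{-3} \cdot \binom{n}{r}
\]
for all natural numbers $n \in \mathbb{N}$. With this, the probabilities $p_n$ can be estimated by
\begin{align*}
p_n &\leq |V|^2 \sum_{s=0}^{r-3} \frac{\binom{|W|-r}{s} \binom{r}{r-s}}{\binom{|W|}{r}} \\
&\leq n^2 \sum_{s=0}^{r-3} \frac{\binom{n}{r-3} \binom{r}{r-s}}{\binom{n}{r}} && (|W| \geq 4r) \\
&\leq n^2 \sum_{s=0}^{r-3} \frac{c_1 n^{-3} \binom{n}{r} \binom{r}{r-s}}{\binom{n}{r}} \\
&= \frac{c_1}{n} \cdot \sum_{s=0}^{r-3} \binom{r}{r-s} \\
&= \frac{c_2}{n}
\end{align*}
for some constant $c_2 > 0$.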

Theorem 4.2.18 (cf. [32, Theorem 1.1]). For $n \geq k$ let $S_{n,k} = \{v \in \mathbb{F}_2^{n} \mid |\{i \in [n] \mid v_i = 1\}| = k\}$. Furthermore let $A \in \mathbb{F}_2^{n \times n}$ be a random matrix where the rows are drawn uniformly and independently from $S_{n,k}$. There is a $K \in \mathbb{N}$ such that for every fixed $k \geq K$ it holds that
\[
\lim_{n \to \infty} \Pr\left( \mathrm{rk}_2(A) \geq \left(1 - 2^{-k}\right) n \right) = 1.
\]

Proof of Theorem 4.2.14. Let $r \geq 3$ be a sufficiently large number and let G = (V, W, E) be a random bipartite graph as described above with $|W| \geq 4r$ and $n = |V| = |W|$. By Lemma 4.2.17 and Theorem 4.2.18, Conditions (III) and (IV) are satisfied with a probability that is arbitrarily close to 1 for large enough n. So it remains to show that G is $\left(\frac{n}{10r}, \frac{3}{r}\right)$-meager with a positive probability. By Lemma 4.2.16, with probability at least 8/9, the graph G is a $\left(\frac{1}{10r}, \frac{r}{2}\right)$-expander. Suppose this is indeed the case. Now assume towards a contradiction that there is a set $\emptyset \neq X \subseteq W$ with $|X| \leq \frac{n}{10r}$ for which $|N^{-1}(X)| \geq \frac{3}{r} \cdot |X|$. Consider a set $Y \subseteq N^{-1}(X)$ with $|Y| = \lceil \frac{3}{r} \cdot |X| \rceil \leq |X| \leq \frac{n}{10r}$. But then $|X| \geq |N(Y)| \geq \frac{r}{2} \cdot |Y| \geq \frac{r}{2} \cdot \frac{3}{r} \cdot |X| > |X|$, which is a contradiction.
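The random process underlying this proof is easy to simulate. The following Python sketch samples a base graph and reports the largest common neighborhood of two distinct vertices of V, so that Condition (III) can be inspected empirically on a sample. The function names and the `seed` parameter are assumptions for this illustration; the sketch does not verify meagerness or the rank condition.

```python
import random
from itertools import combinations

def random_base_graph(n, r, seed=None):
    """Sample G = (V, W, E) with |V| = |W| = n by choosing, for every v in V,
    a uniformly random set of r distinct neighbors in W."""
    rng = random.Random(seed)
    V = [("v", i) for i in range(n)]
    W = [("w", j) for j in range(n)]
    N = {v: frozenset(rng.sample(W, r)) for v in V}
    return V, W, N

def max_common_neighbors(V, N):
    """Largest |N(v1) ∩ N(v2)| over all distinct v1, v2 in V.
    Condition (III) holds for the sampled graph iff this value is below 3."""
    return max(len(N[u] & N[v]) for u, v in combinations(V, 2))

V_s, W_s, N_s = random_base_graph(40, 3, seed=1)
overlap = max_common_neighbors(V_s, N_s)
```

Lemma 4.2.17 only states that large overlaps become unlikely as n grows; for a fixed small n a particular sample may of course still violate Condition (III).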

4.2.5 Lower Bounds for I/R-Algorithms

Recall that the goal is to find graphs where the search tree of an individualization-refinement algorithm is large. Applying the multipede construction to the meager graphs constructed above yields such examples. Based on Lemma 4.2.1 this can be proved by finding many equivalent tuples within the multipede graphs.

Lemma 4.2.19. Let $k, d \in \mathbb{N}$ and suppose $d \geq k$ and $d\alpha < 1$. Let G = (V, W, E) be $(\ell, \alpha)$-meager and let $X = \{w_1, \dots, w_m\} \subseteq W$ be a subset of cardinality $m \leq (1 - d\alpha)\ell - d + 1$. Furthermore suppose
\[
\Big| N(v) \cap \Big( \bigcup_{i \in [k]} N(v_i) \Big) \Big| \leq d - k
\]
for all distinct $v, v_1, \dots, v_k \in V$. Let $\bar x = (x_1, \dots, x_m)$ be a sequence with $x_i \in F(w_i)$. Then
\[
|\{\bar y \in F(W)^m \mid (R(G), \bar x) \simeq_k (R(G), \bar y)\}| \geq 2^{\frac{1 - \alpha(d+1)}{1 - d\alpha} m}.
\]

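Proof. Let $\widehat{X} = \mathrm{cl}^{d}_{G}(X)$ be the d-closure of X. By Lemma 4.2.13 it holds that $|\widehat{X}| < \frac{1}{1 - d\alpha} |X| \leq \ell$. Suppose $\widehat{X} = X \cup \{u_1, \dots, u_s\}$ and let $\bar z = (x_1, \dots, x_m, z_1, \dots, z_s)$ be an extension of $\bar x$ with $z_i \in F(u_i)$. By Lemmas 4.2.10 and 4.2.12 we conclude that
\[
|\{\bar y \in F(W)^{m+s} \mid (R(G), \bar z) \simeq_k (R(G), \bar y)\}| \geq |\mathrm{Aut}(R(G)[[\widehat{X}]])| \geq 2^{(1-\alpha)(m+s)}.
\]
Let $A = \{\bar y \in F(W)^{m+s} \mid (R(G), \bar z) \simeq_k (R(G), \bar y)\}$ and for $\bar a \in A$ let $\pi_m(\bar a)$ be the projection onto the first m components. Clearly, for $\bar a, \bar b \in A$, it holds that $(R(G), \pi_m(\bar a)) \simeq_k (R(G), \pi_m(\bar b))$. So
\[
|\{\bar y \in F(W)^{m} \mid (R(G), \bar x) \simeq_k (R(G), \bar y)\}| \geq |A| \cdot 2^{-s} \geq 2^{(1-\alpha)(m+s) - s} = 2^{(1-\alpha)m - \alpha s}.
\]
Since $s \leq \frac{1}{1 - d\alpha} m - m = \frac{d\alpha m}{1 - d\alpha}$ we conclude that
\[
2^{(1-\alpha)m - \alpha s} \geq 2^{(1-\alpha)m - \frac{d\alpha^2}{1 - d\alpha} m} = 2^{\frac{1 - \alpha(d+1)}{1 - d\alpha} m}.
\]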

Theorem 4.2.20. Let $k \in \mathbb{N}$, $\ell \geq \max\{9r, 45s, 18k\}$ and $\alpha \leq \frac{1}{10k}$. Suppose G = (V, W, E) is a bipartite graph with $n := |V| = |W|$ such that

(A) G is $(\ell, \alpha)$-meager,

(B) $\deg(v) = r$ for all $v \in V$,

(C) $|N(v_1) \cap N(v_2)| < 3$ for all distinct $v_1, v_2 \in V$ and

(D) $n - \mathrm{rk}_2(A_G^{*}) \leq s$.

Then there is a subset $I \subseteq W$ with $|I| \leq s$ such that

1. $R^I(G)$ is rigid and

2. for every k-realizable cell selector sel, every k-realizable node invariant inv and every k-realizable refinement operator ref it holds that
\[
|T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)]| \geq 2^{\frac{1}{36} \cdot \ell}. \tag{4.1}
\]

Proof. Set $d = 3k$. By Lemma 4.2.7 there exists a set $I \subseteq W$ of size $|I| \leq s$ such that $R^I(G)$ is rigid. Suppose $I = \{w_1, \dots, w_s\}$.

Claim 1. For distinct vertices $v, v_1, \dots, v_k \in V$ it holds that
\[
\Big| N(v) \cap \Big( \bigcup_{i \in [k]} N(v_i) \Big) \Big| \leq d - k.
\]

Proof. It holds that
\[
\Big| N(v) \cap \Big( \bigcup_{i \in [k]} N(v_i) \Big) \Big| = \Big| \bigcup_{i \in [k]} (N(v) \cap N(v_i)) \Big| \leq \sum_{i \in [k]} |N(v) \cap N(v_i)| \leq 2k \leq d - k
\]
using Property (C). ⌟

For the remainder of the proof fix an arbitrary linear order on the vertex set W. For a vertex $m_A(v) \in V(R(G))$ define $\Pi(m_A(v))$ to be the sequence $(x_1, \dots, x_r)$ of the vertices in $N(m_A(v))$ ordered according to the linear order of W (observe that for each $w \in N(v)$ either a(w) or b(w) occurs in the sequence). Also, for $w \in W$ define $\Pi(a(w)) = (a(w))$ and $\Pi(b(w)) = (b(w))$. For a sequence $\bar x = (x_1, \dots, x_t) \in V(R(G))^t$ define $\widehat{\Pi}(\bar x)$ as the concatenation of the sequences $\Pi(x_1), \dots, \Pi(x_t)$. Moreover, let $\Pi(\bar x)$ be the subsequence of $\widehat{\Pi}(\bar x)$ in which all duplicates are removed starting from the right. (More precisely, for a sequence $\bar y = (y_1, \dots, y_t)$ inductively define the sequence $\rho(\bar y)$ by setting $\rho(\varepsilon) = \varepsilon$ and letting $\rho(y_1, \dots, y_t) := \rho(y_1, \dots, y_{t-1})$ if $y_t = y_i$ for some $i < t$, and $\rho(y_1, \dots, y_t)$ is equal to the concatenation of $\rho(y_1, \dots, y_{t-1})$ and $y_t$ otherwise. Then $\Pi(\bar x) = \rho(\widehat{\Pi}(\bar x))$.)

Claim 2. For every sequence $\bar x$ of vertices of R(G) it holds that
\begin{align*}
|\{\bar y \mid (R^I(G), \bar y) \simeq_k (R^I(G), \bar x)\}| &\geq |\{\bar z \mid (R^I(G), \bar z) \simeq_k (R^I(G), \Pi(\bar x))\}| \\
&\geq 2^{-s} \cdot |\{\bar z \mid (R(G), \bar z) \simeq_k (R(G), \Pi(\bar x))\}|.
\end{align*}

Proof. First observe that the second inequality holds since there are only $2^s$ color-preserving permutations of F(I).

For the first inequality suppose $(R^I(G), \bar z) \simeq_k (R^I(G), \Pi(\bar x))$ for some sequence $\bar z$. Then there is a lift $\bar y = (y_1, \dots, y_t)$ such that $\Pi(\bar y) = \bar z$ and $y_i$ and $x_i$ have the same color for all $i \in [t]$. For this lift it holds that $(R^I(G), \bar y) \simeq_k (R^I(G), \bar x)$. Since lifts of distinct sequences must be distinct, the first inequality of the claim follows. ⌟

Now define
\[
t := \lfloor (1 - d\alpha)\ell - d + 1 \rfloor \geq \left(1 - \frac{3k}{10k}\right)\ell - d + 1 \geq \frac{1}{2}\ell - d + 1 \geq \frac{1}{3}\ell
\]
and let $c = \frac{1 - \alpha(d+1)}{1 - d\alpha} = \frac{1 - \alpha(3k+1)}{1 - 3k\alpha} \geq \frac{1}{2}$. Then, for every vertex sequence $\bar x$ such that $|\Pi(\bar x)| \leq t - s$, it holds that
\[
|\{\bar z \mid (R^I(G), \bar z) \simeq_k (R^I(G), \Pi(\bar x))\}| \geq 2^{c|\Pi(\bar x)|/2 - s} \geq 2^{|\Pi(\bar x)|/4 - s} \tag{4.2}
\]
by Lemma 4.2.19 and Claim 2. (Here the extra 1/2 in the exponent comes from the fact that in Lemma 4.2.19 for each $w_i$ only one $x_i \in F(w_i)$ can be chosen but here $\Pi(\bar x)$ may contain both vertices from $F(w_i)$.)

Claim 3. There is a sequence $\bar y \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)])$ such that $t - s - r < |\Pi(\bar y)| \leq t - s$.

Proof. Let $\bar x = (x_1, \dots, x_m)$ be a leaf of $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)]$. Observe that $\pi_i(\bar x) := (x_1, \dots, x_i) \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)])$ for every $i \in [m]$. Now define $t_i := |\Pi(\pi_i(\bar x))|$ for $i \in [m]$. Note that $t_{i+1} \leq t_i + r$, $i \in [m-1]$, since $|\Pi(x_i)| \leq r$. It thus suffices to show that $|\Pi(\bar x)| > t - s - r$.

Suppose towards a contradiction that $|\Pi(\bar x)| \leq t - s - r$. We argue that $\mathrm{ref}(R^I(G), \bar x)$ is not discrete. This implies $\bar x$ is not a leaf of $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)]$, giving a contradiction.

Let $\bar x'$ be a sequence of which $\bar x$ is a prefix that satisfies $4s < |\Pi(\bar x')| \leq t - s$ (possibly $\bar x' = \bar x$). Now it suffices to show that $\mathrm{ref}(R^I(G), \bar x')$ is not discrete. Indeed, by Equation (4.2), it holds that $|\{\bar z \mid (R^I(G), \bar z) \simeq_k (R^I(G), \Pi(\bar x'))\}| \geq 2^{|\Pi(\bar x')|/4 - s} > 2^{s - s} = 1$. But this implies that $\mathrm{ref}(R^I(G), \bar x')$ is not discrete. ⌟

Now let $\bar y \in V(T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[R^I(G)])$ be the sequence from Claim 3. Then
\[
|\{\bar z \mid (R^I(G), \bar z) \simeq_k (R^I(G), \Pi(\bar y))\}| \geq 2^{c|\Pi(\bar y)|/2 - s} \geq 2^{|\Pi(\bar y)|/4 - s} \geq 2^{t/4 - r/4 - 5s/4} \geq 2^{\ell/12 - \ell/36 - \ell/36} \geq 2^{\ell/36}
\]
using Equation (4.2). By Claim 2 this means $|\{\bar z \mid (R^I(G), \bar z) \simeq_k (R^I(G), \bar y)\}| \geq 2^{\ell/36}$. So the statement of the theorem follows from Lemma 4.2.1.

In combination with Theorem 4.2.14, which guarantees the existence of the graphs required by the last theorem, this gives the main technical result of this section implying exponential worst-case complexity for all I/R algorithms based on k-realizable operators.

Theorem 4.2.21. For every constant $k \in \mathbb{N}$ there is a family of rigid colored graphs $(G_n)_{n \in \mathbb{N}}$ with $|V(G_n)| \leq n$ such that for every k-realizable cell selector sel, every k-realizable refinement operator ref, and every k-realizable node invariant inv it holds that
\[
|T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G_n]| \in 2^{\Omega(n)}. \tag{4.3}
\]

Together with Proposition 2.3.1 this implies exponential lower bounds on the running time of individualization-refinement algorithms performing canonization. Moreover, one can also obtain a similar lower bound for isomorphism tests based on individualization-refinement. Consider two sequences $\bar v_1, \bar v_2$ of vertices such that the subtrees of $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G_n]$ rooted at $\bar v_1$ and $\bar v_2$ have exponential size. Then, by individualizing the sequences, one obtains two graphs $(G_n, \bar v_1)$ and $(G_n, \bar v_2)$ such that for the isomorphism test based on comparing leaves, both search trees of $T^{\mathrm{ref},\mathrm{sel}}_{\mathrm{inv}}[G_n, \bar v_1, G_n, \bar v_2]$ are exponentially large. By a proposition analogous to Proposition 2.3.1 for input pairs this implies exponential running time since at least one of the trees has to be completely traversed by the algorithm and no automorphism pruning is possible.

Remark 4.2.22. While the graphs constructed for Theorem 4.2.21 are colored graphs, it is not difficult to turn them into uncolored graphs while preserving the exponential size of the search tree. Indeed, let $(G, \chi_G)$ be a colored graph of minimum degree 1 (i.e., there are no isolated vertices) and let $t = |\mathrm{im}(\chi_G)|$ be the number of colors. Now consider the uncolored graph $G^{*}$ which is constructed from the disjoint union of G and a path of length t + 1. To encode the colors, the i-th vertex of the path is connected to all vertices colored by the i-th color. Finally, four additional vertices are added. The first vertex is connected to all but the last vertex of the path whereas the other vertices are used to uniquely mark the first vertex by attaching a single vertex and a path of length 1. Note that in the resulting (uncolored) graph $G^{*}$, there are three vertices of degree 1. Applying the Color Refinement algorithm to the graph $G^{*}$, every newly added vertex forms a singleton color class and the partition induced by the color classes on V(G) is the same as in the original graph. In particular, if G is rigid, then $G^{*}$ is also rigid. This also implies that each search tree of $G^{*}$ corresponds to a search tree of G of at least the same size.

4.3 The I/R-Method in Practice

The individualization-refinement paradigm is implemented in the currently fastest software packages tackling the Graph Isomorphism Problem in practice. In particular, this includes the packages Nauty/Traces [113, 112], Bliss [85, 86], Conauto [105] and Saucy [35]. Having provided exponential lower bounds for all these algorithms in the previous section, a natural question to ask is whether the multipede graphs can also be utilized to construct instances that are practically difficult. At first glance it may seem that this question is already answered by the results from the last section, which guarantee exponential running times of all the software packages on certain multipede graphs. However, this is not the case. Indeed, the theoretical analysis provided in the last section completely ignores the constant factors involved in the construction as well as in the analysis.

Example 4.3.1. Consider a bipartite graph G = (V, W, E) such that n := |V| = |W| and r := deg(v) for all v ∈ V. Then |V(R(G))| = 2n + 2^{r−1}·n = (2 + 2^{r−1})·n. By Theorem 4.2.14 one may assume that G is (ℓ, α)-meager where ℓ = n/(10r) and α = 3/r. Furthermore, n − rk_2(A_G) ≤ 2^{−r}·n. Assuming Theorem 4.2.20 is applicable it yields a set I ⊆ W such that, for every k-realizable cell selector sel, every k-realizable node invariant inv and every k-realizable refinement operator ref it holds that

|T_inv^{ref,sel}[R^I(G)]| ≥ 2^{(1/36)·ℓ} = 2^{(1/(360r))·n} = 2^{(1/(360r·(2+2^{r−1})))·|V(R(G))|}.

Since c_r := 1/(360r·(2 + 2^{r−1})) is a fixed constant it is ignored for the theoretical analysis of I/R algorithms given in the previous section. However, in a practical setting, this constant is of course significant. In order to obtain a first estimate of the constant c_r consider the condition 45 · 2^{−r}·n ≤ (1/(10r))·n (see Theorem 4.2.20) which translates to r ≥ 9. This implies

c_r ≤ 1/(360 · 9 · 258) = 1/835920,

which means that, even if R(G) has one million vertices, the lower bound provided by Theorem 4.2.21 cannot even guarantee that the search tree has four nodes. This is of course completely unsatisfactory from a practical point of view.

These calculations show that the practical consequences of Theorem 4.2.21 can essentially be ignored. But of course, one can still expect that in practice the multipede graphs may result in difficult instances for isomorphism testing.
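The arithmetic of Example 4.3.1 can be replayed with a few lines of Python. This is only a sketch of the calculation; the function names are illustrative and not part of the thesis, and the formula for the constant is taken directly from the example.

```python
from fractions import Fraction


def blow_up_constant(r: int) -> Fraction:
    """c_r = 1 / (360 r (2 + 2^(r-1))) as in Example 4.3.1."""
    return Fraction(1, 360 * r * (2 + 2 ** (r - 1)))


def guaranteed_search_tree_size(num_vertices: int, r: int) -> float:
    """Lower bound 2^(c_r * |V(R(G))|) on the number of search tree nodes."""
    return 2.0 ** float(blow_up_constant(r) * num_vertices)


if __name__ == "__main__":
    # Degree r = 9 is the smallest degree considered in the example.
    print(blow_up_constant(9))
    print(guaranteed_search_tree_size(10**6, 9))
```

For one million vertices the exponent is only slightly larger than 1, which is why the guaranteed search tree size stays below four.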

Figure 4.4: The graph G8

In order to explore this question, let us start by identifying requirements for the graphs G that may lead to R(G) being difficult for practical I/R algorithms. The last example already indicates that one essential ingredient is to keep the degree r of the base graph G as small as possible. Indeed, the degree r has an exponential impact on the blow-up factor from G to R(G). With this in mind it seems plausible to restrict to graphs G where r = 3. However, this restriction raises another problem concerning the number of automorphisms of the graph R(G). Indeed, following the random construction presented in the last section, the graph R(G) may have many automorphisms. Actually, it would be optimal for the graph G to be odd (see Definition 4.2.2 and Lemma 4.2.3). In the following a simple randomized construction is presented that achieves both these requirements. Also, applying the multipede construction, this leads to the practically most difficult graphs available today. As this thesis primarily studies the Graph Isomorphism Problem from a theoretical perspective I do not give many details in this section and rather refer to the original work [123].

For a graph G = (V, E) and a random permutation σ : E → E define the bipartite graph B(G, σ) = (VB, WB, EB) where VB := V × {0, 1}, WB := E, and

EB := {(v, 0)e | v ∈ V, e ∈ E, v ∈ e} ∪ {(v, 1)e | v ∈ V, e ∈ E, v ∈ σ(e)}.

Also, for n ≥ 2 define Gn = ([2n], En) to be a cycle of length 2n with diagonals added (see Figure 4.4 for an example). Formally,

En := {{i, i + 1} | i ∈ [2n − 1]} ∪ {{2n, 1}} ∪ {{i, i + n} | i ∈ [n]}.
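The two constructions can be sketched in Python as follows. This is a minimal illustration under stated assumptions: vertices are numbered 1, . . . , 2n, the cycle is closed by the edge between 2n and 1, edges are identified with their index in a list, and all function names are hypothetical.

```python
import random


def cycle_with_diagonals(n):
    """Edge list of G_n: a cycle on vertices 1..2n plus the n diagonals {i, i+n}."""
    if n < 2:
        raise ValueError("G_n is only defined for n >= 2")
    cycle = [frozenset({i, i + 1}) for i in range(1, 2 * n)]
    cycle.append(frozenset({2 * n, 1}))  # assumed wrap-around edge
    diagonals = [frozenset({i, i + n}) for i in range(1, n + 1)]
    return cycle + diagonals


def bipartite_graph(n, sigma):
    """B(G_n, sigma): (v, 0) is joined to e if v in e, (v, 1) to e if v in sigma(e).

    sigma is a list with sigma[k] = index of the image of the k-th edge.
    """
    edges = cycle_with_diagonals(n)
    left = [(v, b) for v in range(1, 2 * n + 1) for b in (0, 1)]
    right = list(range(len(edges)))
    adjacency = set()
    for k, e in enumerate(edges):
        for v in e:
            adjacency.add(((v, 0), k))
        for v in edges[sigma[k]]:
            adjacency.add(((v, 1), k))
    return left, right, adjacency


def random_instance(n, seed=0):
    """Draw sigma uniformly at random (seeded for reproducibility)."""
    rng = random.Random(seed)
    sigma = list(range(3 * n))
    rng.shuffle(sigma)
    return bipartite_graph(n, sigma)
```

Since G_n is 3-regular with 2n vertices and 3n edges, every vertex on the left-hand side of the resulting bipartite graph has degree r = 3.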

The central result of this section is that the graphs B(Gn, σ) for randomly chosen permutations σ : E(Gn) → E(Gn) give, by applying the multipede construction, extremely difficult instances for all practical I/R algorithms with high probability. A key property is that, with high probability, B(Gn, σ) is odd (see Definition 4.2.2).

Proposition 4.3.2 (N., Schweitzer [123]). For n ≥ let qn be the probability that the graph B(Gn, σ) is odd where the probability ranges over all permutations σ : E(Gn) → E(Gn). Then

lim_{n→∞} qn = 1.

In combination with Lemma 4.2.3 this implies the graphs R(B(Gn, σ)) are rigid with high probability. Actually, this statement can be strengthened by ignoring the coloring of the vertices of R(B(Gn, σ)).

Proposition 4.3.3 (N., Schweitzer [123]). For n ≥ let pn be the probability that the uncolored graph R(B(Gn, σ)) is rigid where the probability ranges over all permutations σ : E(Gn) → E(Gn). Then

lim_{n→∞} pn = 1.

[Plots omitted: running time in seconds (with a timeout line) plotted against the number of vertices for Bliss, Traces, Nauty, Saucy and Conauto.]

Figure 4.5: Performance of various algorithms on the multipede graphs R(B(Gn, σ)) (top) and R(B∗(Gn, σ)) (bottom) for random permutations σ. The plots are given in linear (left) and logarithmic scale (right).

Note that omitting the vertex-coloring only makes the isomorphism and canonization problem more difficult for the practical solvers. Hence, it is natural to consider only uncolored graphs in this context.

In a practical scenario it is imperative to keep the constants involved in all constructions as small as possible. One possible way to reduce the number of vertices in the constructions presented above has already been presented in the previous section. Given a 3-regular graph G with 2n vertices and a permutation σ : E(G) → E(G) let B(G, σ) = (VB, WB, EB). Then |VB| = 4n and |WB| = 3n. Now suppose B(G, σ) is odd. Then, by Corollary 4.2.6, there is a set V′B ⊆ VB such that |V′B| ≤ 3n and B(G, σ)[V′B ∪ WB] is odd. Let B∗(G, σ) := B(G, σ)[V′B ∪ WB].

In order to analyze the performance of the practical solvers, [123] performs experiments for the constructed graphs, some of which are depicted in Figure 4.5. The experiments were each performed on a single node of a compute cluster with 2.00 GHz Intel Xeon X5675 processors. The time limit is set to three hours (i.e., 10800 seconds) and the memory limit to 4 GB. Every single instance is processed once, but multiple instances are generated for each possible number of vertices with the same construction. The following isomorphism solvers are evaluated: Bliss version 0.72, Nauty/Traces version 25r9, Saucy version 3.0, and Conauto version 2.03.

The experiments show that, already for roughly 1500 vertices, there are graphs for which isomorphism testing cannot be done efficiently using the standard I/R paradigm. A more detailed experimental evaluation of these constructions (including a comparison to other constructions) can be found in [123].

Chapter 5

Group Theory

In the last chapter we have seen the limits of combinatorial approaches to the Graph Isomorphism Problem. Both the Weisfeiler-Leman algorithm and the I/R paradigm fail to solve the Graph Isomorphism Problem in subexponential time (under some moderate assumptions for the I/R paradigm). In this chapter we consider another approach to tackle this problem which is based on group-theoretic techniques. This approach was first utilized by Babai in 1979 [8] in order to solve the isomorphism problem for graphs of bounded color class size and later generalized by Luks, resulting in a polynomial-time isomorphism test for graphs of bounded degree [106]. In particular, Luks's algorithm highlighted the power of the group-theoretic machinery which has been exploited ever since. In his recent breakthrough paper showing that the Graph Isomorphism Problem can be solved in quasipolynomial time [11], Babai extended the scope of the group-theoretic techniques even further, introducing algorithmic methods in order to handle certain types of groups that are not manageable using Luks's techniques. In combination with various new insights on the power of combinatorial approaches like the Weisfeiler-Leman algorithm, this allowed the design of an isomorphism test with quasipolynomial complexity.

In this chapter, I introduce the group-theoretic machinery developed in the context of the Graph Isomorphism Problem. In the next two chapters, these techniques are utilized and further extended to give improved algorithms for isomorphism tests for bounded degree and bounded tree-width graphs.

5.1 Permutation Groups

First, the necessary background and notation on permutation groups is presented. For general background on group theory I refer to [136] whereas [45] serves as a reference for permutation groups.

5.1.1 Basics

A group is a pair (Γ, ·) where Γ is a set and · : Γ × Γ → Γ is a binary operation such that

(G.1) γ1 · (γ2 · γ3) = (γ1 · γ2) · γ3 for all γ1, γ2, γ3 ∈ Γ,

(G.2) there is an identity element id ∈ Γ such that id · γ = γ · id = γ for all γ ∈ Γ, and

(G.3) for all γ ∈ Γ there exists an inverse γ^{−1} ∈ Γ such that γ · γ^{−1} = γ^{−1} · γ = id.


As is standard in group theory we shall not distinguish between a group and its ground set and simply write Γ for both objects. Also, we shall write γ1γ2 instead of γ1 · γ2. The cardinality of a group, |Γ|, is the cardinality of its ground set. In this thesis we shall only be interested in finite groups, i.e., groups with a finite ground set.

A subset ∆ ⊆ Γ is a subgroup of Γ, denoted ∆ ≤ Γ, if ∆ is closed under the multiplication operation and inverses. Let ∆ ≤ Γ. A (right) coset of ∆ in Γ is a set ∆γ := {δγ | δ ∈ ∆} where γ ∈ Γ. The index of ∆ in Γ is the number of cosets, denoted by |Γ : ∆| := |{∆γ | γ ∈ Γ}|. A transversal for ∆ in Γ is a set {γ1 = id, γ2, . . . , γt} of representatives for all cosets, where t = |Γ : ∆|. By Lagrange's Theorem the cosets form an equipartition of the group Γ and |Γ| = |Γ : ∆| · |∆|.

A subgroup N ≤ Γ is normal, denoted N ⊴ Γ, if γN = Nγ for all γ ∈ Γ. Let N ⊴ Γ. Then there is a group structure on the cosets of N in Γ which results in the factor group Γ/N with ground set {Nγ | γ ∈ Γ} and group operation (Nγ1) · (Nγ2) := Nγ1γ2. A group Γ is simple if it has no non-trivial normal subgroup, i.e., the only normal subgroups of Γ are the trivial group {id} and Γ itself. A group Γ is Abelian if γδ = δγ for all γ, δ ∈ Γ.

Let Γ and ∆ be two groups. A homomorphism from Γ to ∆ is a mapping ϕ: Γ → ∆ such that ϕ(γ1γ2) = ϕ(γ1)ϕ(γ2) for all γ1, γ2 ∈ Γ. For γ ∈ Γ we denote by γ^ϕ := ϕ(γ) the image of γ under the homomorphism ϕ. Also, for Γ′ ≤ Γ we denote by (Γ′)^ϕ ≤ ∆ the image of Γ′ under ϕ. The kernel of the homomorphism is ker(ϕ) := {γ ∈ Γ | ϕ(γ) = id}. The kernel is a normal subgroup of Γ, i.e., ker(ϕ) ⊴ Γ. An epimorphism is a surjective homomorphism and a monomorphism is an injective homomorphism. A bijective homomorphism is called an isomorphism. Two groups Γ and ∆ are isomorphic if there is an isomorphism from Γ to ∆.

Let Γ be a group. A set S ⊆ Γ is a generating set for Γ if each γ ∈ Γ can be written as γ = s1 . . . sk for some elements si ∈ S ∪ S^{−1}, i ∈ [k], where S^{−1} := {s^{−1} | s ∈ S}. For a homomorphism ϕ: Γ → ∆ and a generating set S for Γ it suffices to specify the images ϕ(s), s ∈ S, in order to completely determine the homomorphism ϕ.

From the perspective of group theory this thesis is mostly concerned with permutation groups where group elements are permutations and the group operation is concatenation. The symmetric group on a finite set Ω is denoted Sym(Ω). A permutation group acting on a set Ω is a subgroup Γ ≤ Sym(Ω) of the symmetric group. The size of the permutation domain Ω is called the degree of Γ and, throughout this thesis, is denoted by n := |Ω|. If Ω = [n] then we also write Sn instead of Sym(Ω). For γ ∈ Γ and α ∈ Ω the image of α under the permutation γ is denoted by α^γ := γ(α). The set α^Γ := {α^γ | γ ∈ Γ} is the orbit of α (under the group Γ). A group Γ is transitive if α^Γ = Ω for some (and therefore every) α ∈ Ω. For a Γ-invariant set A ⊆ Ω (i.e., A is a union of orbits of Γ) we denote by Γ[A] ≤ Sym(A) the restriction of Γ to the set A. This is also referred to as the induced action of Γ on A. For γ ∈ Γ we also denote by γ[A] the restriction of γ to the set A. Hence, Γ[A] = {γ[A] | γ ∈ Γ}.

For α ∈ Ω the group Γ_α := {γ ∈ Γ | α^γ = α} ≤ Γ is the stabilizer of α in Γ. A permutation group Γ ≤ Sym(Ω) is semi-regular if Γ_α = {id} for every α ∈ Ω. If additionally Γ is transitive the group is called regular. For A ⊆ Ω and γ ∈ Γ define A^γ := {α^γ | α ∈ A}. The pointwise stabilizer of A is the subgroup Γ_(A) := {γ ∈ Γ | ∀α ∈ A: α^γ = α}. The setwise stabilizer of A is the subgroup Γ_A := {γ ∈ Γ | A^γ = A}. Observe that Γ_(A) ≤ Γ_A.

Let Γ ≤ Sym(Ω) be a transitive group. A block of Γ is a nonempty subset B ⊆ Ω such that B^γ = B or B^γ ∩ B = ∅ for all γ ∈ Γ. The trivial blocks are Ω and the singletons {α} for α ∈ Ω. The group Γ is called primitive if there are no non-trivial blocks. If Γ is not primitive it is called imprimitive. If B ⊆ Ω is a block of Γ then ℬ = {B^γ | γ ∈ Γ} forms a block system of Γ. Note that ℬ is an equipartition of Ω. Also recall that for any partition ℬ of Ω and S ⊆ Ω we denote by ℬ[S] := {B ∩ S | B ∈ ℬ : B ∩ S ≠ ∅} the induced subpartition of ℬ. The group Γ_(ℬ) = {γ ∈ Γ | ∀B ∈ ℬ: B^γ = B} denotes the subgroup stabilizing each block B ∈ ℬ setwise. The group Γ_(ℬ) is a normal subgroup of Γ. The natural action of Γ on the block system ℬ is denoted by Γ[ℬ] ≤ Sym(ℬ). More generally, if A is a set of objects on which Γ acts naturally, we denote by Γ[A] ≤ Sym(A) the action of Γ on the set A. A block system ℬ is minimal if there is no non-trivial block system ℬ′ such that ℬ ≺ ℬ′. A block system ℬ is minimal if and only if Γ[ℬ] is primitive.

Let Γ ≤ Sym(Ω) and ∆ ≤ Sym(Ω′) be two permutation groups. The groups Γ and ∆ are permutationally equivalent if there is a bijection σ : Ω → Ω′ such that ∆ = {σ^{−1}γσ | γ ∈ Γ}. In this case σ is a permutational isomorphism from Γ to ∆. A permutational automorphism of a permutation group Γ is a permutational isomorphism from Γ to itself.

Finally, two group operations need to be defined that build specific types of groups out of a set of smaller groups. The simplest way to build a group from two given groups is by taking the direct product.

Definition 5.1.1 (Direct Product). Let Γ1 ≤ Sym(Ω1) and Γ2 ≤ Sym(Ω2) be two permutation groups. Then Γ1 × Γ2 := {(γ1, γ2) | γ1 ∈ Γ1, γ2 ∈ Γ2} forms a group where multiplication is defined componentwise as

(γ1, γ2) · (γ1′, γ2′) = (γ1γ1′, γ2γ2′).

There are two natural permutation actions for this group. First, the group acts on the set Ω1 ⊎ Ω2 via

α^{(γ1,γ2)} = α^{γ1} if α ∈ Ω1, and α^{(γ1,γ2)} = α^{γ2} if α ∈ Ω2.

Also, the group Γ1 × Γ2 acts on the set Ω1 × Ω2 via

(α1, α2)^{(γ1,γ2)} = (α1^{γ1}, α2^{γ2}).

The second important operation for this thesis is the wreath product of two given groups.

Definition 5.1.2 (Wreath Product). For Γ ≤ Sym(Ω) and ∆ ≤ Sym(M) the wreath product Γ ≀ ∆ is the group with elements ((γi)_{i∈M}, δ) where γi ∈ Γ for all i ∈ M and δ ∈ ∆. The multiplication is defined as

((γi)_{i∈M}, δ) · ((γi′)_{i∈M}, δ′) = ((γ_{(δ′)^{−1}(i)} γi′)_{i∈M}, δδ′).

In the standard action (imprimitive action), the wreath product Γ ≀ ∆ acts on the set Ω × M via

(α, i)^{((γj)_{j∈M}, δ)} = (α^{γ_{δ(i)}}, i^δ).

A second important action of the wreath product is the product action (primitive action), where Γ ≀ ∆ acts on the set Ω^M via

((αi)_{i∈M})^{((γi)_{i∈M}, δ)} = ((α_{δ^{−1}(i)})^{γ_{δ^{−1}(i)}})_{i∈M}.
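The multiplication rule of the wreath product and its standard (imprimitive) action fit together in the sense that acting by a product equals acting by the two factors one after the other. The following sketch checks this on random elements of Sym(Ω) ≀ Sym(M) for small Ω and M. It assumes that permutations are stored as tuples p with p[a] the image of a, and that a product γδ means "apply γ first, then δ"; all names are illustrative.

```python
import itertools
import random


def compose(g, h):
    """Product gh in the right-action convention: apply g first, then h."""
    return tuple(h[a] for a in g)


def inverse(g):
    inv = [0] * len(g)
    for a, b in enumerate(g):
        inv[b] = a
    return tuple(inv)


def wreath_multiply(a, b):
    """((g_i), d) * ((h_i), e) = ((g_{e^{-1}(i)} h_i)_i, d e)."""
    (gammas, delta), (gammas2, delta2) = a, b
    delta2_inv = inverse(delta2)
    components = tuple(
        compose(gammas[delta2_inv[i]], gammas2[i]) for i in range(len(gammas))
    )
    return components, compose(delta, delta2)


def act(point, a):
    """Standard action: (alpha, i) -> (alpha^{g_{d(i)}}, i^d)."""
    (alpha, i), (gammas, delta) = point, a
    j = delta[i]
    return gammas[j][alpha], j


def random_element(rng, omega_size, m_size):
    def perm(k):
        p = list(range(k))
        rng.shuffle(p)
        return tuple(p)

    return tuple(perm(omega_size) for _ in range(m_size)), perm(m_size)


def action_is_compatible(trials=200, omega_size=3, m_size=3, seed=1):
    """Check (p^a)^b == p^(ab) for random a, b and all points p."""
    rng = random.Random(seed)
    for _ in range(trials):
        a = random_element(rng, omega_size, m_size)
        b = random_element(rng, omega_size, m_size)
        for point in itertools.product(range(omega_size), range(m_size)):
            if act(act(point, a), b) != act(point, wreath_multiply(a, b)):
                return False
    return True
```

The check is a sanity test of the two formulas under the stated conventions, not a proof.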

5.1.2 Algorithms for Permutation Groups

In order to handle permutation groups computationally it is essential to represent groups in a compact way. Indeed, the size of a permutation group is typically exponential in the degree of the group which means it is not possible to store the whole group in memory. In order to allow for efficient computation permutation groups are represented by generating sets. By Lagrange's Theorem, for each permutation group Γ ≤ Sym(Ω), there is a generating set of size log |Γ| ≤ n log n.¹ This subsection gives the basic polynomial-time library handling computations for permutation groups. For detailed background on computational group theory I refer to [77, 138].

¹ Actually, using slightly more involved arguments, it can be proved that Γ has a generating set of size n − 1 [111].

In order to build methods for the basic computational tasks concerning permutation groups the central tool is the concept of a strong generating set. I do not give a definition of strong generating sets in this thesis and rather refer to [138, Chapter 4].

Theorem 5.1.3. Let Γ ≤ Sym(Ω) be a permutation group. There is an algorithm that, given a generating set S for Γ, computes a strong generating set S∗ for Γ of size |S∗| ≤ n^2 in time polynomial in n and |S|.

Theorem 5.1.4. Let Γ ≤ Sym(Ω) be given by a generating set S ⊆ Γ. Then the following tasks can be solved in time polynomial in n and |S|:

1. compute |Γ|,
2. given γ ∈ Sym(Ω), determine whether γ ∈ Γ,
3. compute the orbits of Γ,
4. given α ∈ Ω, compute a generating set for Γ_α, and
5. compute a minimal block system of Γ.

Let ∆ ≤ Sym(Ω′) be a second group of degree n′ = |Ω′|. Also let ϕ: Γ → ∆ be a homomorphism given by a list of images for s ∈ S. Then the following tasks can be solved in time polynomial in n, n′ and |S|:

6. compute a generating set for ker(ϕ), and
7. given δ ∈ ∆, find γ ∈ Γ such that ϕ(γ) = δ (if it exists).

Note that, using Theorem 5.1.3, one can essentially assume generating sets having at most quadratic size. In particular, this enables us to concatenate polynomial-time computations on permutation groups by controlling the size of all generating sets computed in intermediate steps of the computation.

For the remainder of this thesis, I will typically ignore the role of generating sets and simply refer to groups being the input/output of an algorithm. This always means the algorithm performs computations on generating sets of size polynomial in the degree of the group.
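Two of the tasks from Theorem 5.1.4 have particularly short implementations and may serve as an illustration: computing orbits by a breadth-first search over the generators, and computing generators of a point stabilizer via Schreier's lemma. The sketch below is not the library referred to in the text; it assumes a non-empty list of generators stored as tuples p with p[a] the image of a, products read left to right, and it omits the size reduction of Theorem 5.1.3. All names are illustrative.

```python
from collections import deque


def compose(g, h):
    """Product gh in the right-action convention: apply g first, then h."""
    return tuple(h[a] for a in g)


def inverse(g):
    inv = [0] * len(g)
    for a, b in enumerate(g):
        inv[b] = a
    return tuple(inv)


def orbit_with_transversal(point, generators):
    """Map every b in the orbit of point to some u_b with point^{u_b} = b."""
    identity = tuple(range(len(generators[0])))
    reps = {point: identity}
    queue = deque([point])
    while queue:
        a = queue.popleft()
        for s in generators:
            b = s[a]
            if b not in reps:
                reps[b] = compose(reps[a], s)
                queue.append(b)
    return reps


def orbits(n, generators):
    """Orbit partition of {0, ..., n-1} (task 3)."""
    remaining, result = set(range(n)), []
    while remaining:
        orb = set(orbit_with_transversal(min(remaining), generators))
        result.append(orb)
        remaining -= orb
    return result


def stabilizer_generators(point, generators):
    """Schreier generators u_a * s * (u_{a^s})^{-1} of the stabilizer (task 4)."""
    reps = orbit_with_transversal(point, generators)
    schreier = set()
    for a, u_a in reps.items():
        for s in generators:
            schreier.add(compose(compose(u_a, s), inverse(reps[s[a]])))
    return schreier
```

The number of Schreier generators is at most the orbit size times |S|, so a single call is polynomial; repeated calls would need the reduction to generating sets of quadratic size mentioned above.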

5.1.3 Groups with Restricted Composition Factors

An important class of groups that, for example, naturally arises from studying automorphism groups of graphs of small degree is the class of Γ̂_d-groups, which is defined by restricting the composition factors of a group.

Definition 5.1.5 (Composition Series). Let Γ be a group. A subnormal series of Γ is a sequence Γ = Γ0 ⊵ Γ1 ⊵ · · · ⊵ Γk = {id}. A composition series of Γ is a strictly decreasing subnormal series of maximal length.

Let Γ be a group and Γ = Γ0 ⊵ Γ1 ⊵ · · · ⊵ Γk = {id} a composition series. The groups Γi−1/Γi are the factor groups of the series. For a fixed group, every composition series has the same multiset of factor groups (see [136]). The composition factors of Γ are the factor groups of any composition series of Γ. Note that the composition factors of a group are simple groups. From an algorithmic perspective, the composition factors of a permutation group (given by a generating set) can be computed in polynomial time (see also [77, 138] for more details).

Theorem 5.1.6 (Luks [107]). Let Γ ≤ Sym(Ω) be a permutation group. There is a polynomial-time algorithm that, given a generating set for Γ, computes a composition series Γ = Γ0 ⊵ Γ1 ⊵ · · · ⊵ Γk = {id} for Γ and, for every i ∈ [k], a monomorphism

ϕi : Γi−1/Γi → Sn.

(The monomorphism ϕi is specified by listing images of elements of a generating set for Γi−1/Γi.)

Definition 5.1.7 (Luks [106]). For d ≥ 2 define Γ̂_d to be the class of all groups Γ such that every composition factor of Γ is isomorphic to a subgroup of Sd.

The class of Γ̂_d-groups plays a significant role in the research on the Graph Isomorphism Problem for two reasons. First, Γ̂_d-groups naturally appear in the study of certain graph classes, most notably in the study of automorphism groups of graphs of bounded degree. On the other hand, Γ̂_d-groups are severely restricted in their structure, which admits the design of efficient algorithms for certain problems not known to be tractable in general. Before exploring these aspects in more detail, some basic properties of Γ̂_d-groups are stated.

Lemma 5.1.8 (Luks [106]). Let Γ ∈ Γ̂_d. Then

1. ∆ ∈ Γ̂_d for every ∆ ≤ Γ, and
2. Γ^ϕ ∈ Γ̂_d for every homomorphism ϕ: Γ → ∆.

Lemma 5.1.9 (Luks [106]). Let Γ be a group and ϕ: Γ → ∆ a homomorphism. Then Γ ∈ Γ̂_d if and only if ker(ϕ) ∈ Γ̂_d and Γ^ϕ ∈ Γ̂_d.

Corollary 5.1.10. Let Γ ≤ Sym(Ω) be a group such that every orbit of Γ has size at most d. Then Γ ∈ Γ̂_d.

The next lemma follows from [13, Lemma 2.2].

Lemma 5.1.11. Suppose d ≥ 6 and let Γ ≤ Sym(Ω) be a Γ̂_d-group. Then |Γ| ≤ d^{n−1}.

From the algorithmic perspective, the membership problem for Γ̂_d-groups is fixed-parameter tractable (for parameter d).

Corollary 5.1.12. There is an algorithm that, given a group Γ ≤ Sym(Ω), decides whether Γ ∈ Γ̂_d in time f(d)n^c for some function f and a constant c.

Proof. By Theorem 5.1.6 one can compute permutation representations ∆i ≤ Sn, i ∈ [k], for all composition factors of Γ in polynomial time. Hence, it only remains to check whether ∆i is isomorphic to a subgroup of Sd for all i ∈ [k]. For every i ∈ [k] let Ai ⊆ [n] be a non-trivial orbit of ∆i. Since ∆i is a simple group it follows that ∆i[Ai] is isomorphic to ∆i. So it suffices to check if ∆i[Ai] is isomorphic to a subgroup of Sd. This is impossible if |Ai| > d!. So assume |Ai| ≤ d! for all i ∈ [k]. Now the algorithm checks whether ∆i[Ai] is isomorphic to a subgroup of Sd by brute force. This can be done in time f(d) for some function f.

5.2 String Isomorphism

5.2.1 Graphs of Bounded Color Class Size

In the context of the Graph Isomorphism Problem, the group-theoretic machinery was first utilized by Babai [8] to design a polynomial-time isomorphism test for graphs of bounded color class size. While Babai's original algorithm is a randomized Las Vegas algorithm, the methods were derandomized only shortly after by Furst, Hopcroft and Luks [57].

Definition 5.2.1 (Color Class Size). Let G = (V, E, χ) be a vertex-colored graph. The graph G has color class size at most s if |χ^{−1}(c)| ≤ s for every c ∈ im(χ).

Theorem 5.2.2. There is a function f : N → N and an absolute constant c ∈ N such that the Graph Isomorphism Problem for graphs that have color class size at most s can be solved in time f(s)n^c.

Algorithm 2: Automorphism Group for Graphs of Color Class Size s
Input : Graph G = (V, E, χ) that has color class size at most s
Output: A generating set for Aut(G)

compute V1, . . . , Vℓ the color classes of G
let Γ := Sym(V1) × · · · × Sym(Vℓ) ≤ Sym(V)
for i, j = 1, . . . , ℓ do
    compute ϕ: Γ → Γ[Vi ∪ Vj]
    ∆ := Aut(G[Vi ∪ Vj]) ∩ Γ[Vi ∪ Vj]
    Γ := ϕ^{−1}(∆)
end
return Γ

Proof Sketch. Let (G1, χ1) and (G2, χ2) be two colored graphs of color class size at most s. By treating connected components independently it can be assumed that G1 and G2 are connected. Let (G, χ) be the disjoint union of the two graphs. Observe that G has color class size at most 2s. It suffices to compute the automorphism group Aut(G, χ). This problem is solved by Algorithm 2. The algorithm computes a sequence of smaller and smaller groups by iterating over all pairs of color classes Vi and Vj. In iteration (i, j) the algorithm updates the current group in such a way that it respects all edges between vertices from Vi and Vj. This way, after iterating over all pairs (i, j) ∈ [ℓ]^2, exactly those permutations remain that respect all the edges of the graph G.

Already, this first and simple algorithm demonstrates the power of the group-theoretic machinery. The above algorithm can efficiently solve the isomorphism problem for the Cai-Fürer-Immerman and multipede graphs constructed in the previous chapter that proved to be extremely difficult for purely combinatorial algorithms like the Weisfeiler-Leman algorithm and I/R algorithms (see Theorem 4.1.1 and 4.2.21). Actually, the CFI graphs also prove to be hard to distinguish for relaxation hierarchies in proof complexity theory (e.g., Sherali-Adams hierarchies, Sum of Squares, Gröbner basis computations) [5, 126, 140, 25, 26, 72]. Again, this underlines the significance of group-theoretic approaches to graph isomorphism, as already very simple techniques cannot be captured by other standard algorithmic approaches to this problem.

5.2.2 String Isomorphism Problem

In order to apply group-theoretic techniques in greater depth Luks [106] introduced a more general problem that allows one to build recursive algorithms along the structure of the permutation groups involved. For this thesis, I follow the notation and terminology used by Babai [10, 11] for describing his quasipolynomial time algorithm for the Graph Isomorphism Problem that also employs this recursive strategy.

Recall that a string is a mapping x: Ω → Σ where Ω is a finite set and Σ is also a finite set called the alphabet. Let γ ∈ Sym(Ω) be a permutation. The permutation γ can be applied to the string x by defining

x^γ : Ω → Σ : α ↦ x(α^{γ^{−1}}). (5.1)

Let y: Ω → Σ be a second string. The permutation γ is an isomorphism from x to y, denoted γ : x ≅ y, if x^γ = y. Let Γ ≤ Sym(Ω). A Γ-isomorphism from x to y is a permutation γ ∈ Γ such that γ : x ≅ y. The strings x and y are Γ-isomorphic, denoted x ≅_Γ y, if there is a Γ-isomorphism from x to y. The String Isomorphism Problem asks, given two strings x, y: Ω → Σ and a generating set for a group Γ ≤ Sym(Ω), whether x and y are Γ-isomorphic. The set of Γ-isomorphisms from x to y is denoted by

Iso_Γ(x, y) := {γ ∈ Γ | x^γ = y}. (5.2)

The set of Γ-automorphisms of x is Aut_Γ(x) := Iso_Γ(x, x). Observe that Aut_Γ(x) is a subgroup of Γ and Iso_Γ(x, y) = Aut_Γ(x)γ for an arbitrary γ ∈ Iso_Γ(x, y).

Theorem 5.2.3. There is a polynomial-time many-one reduction from the Graph Isomorphism Problem to the String Isomorphism Problem.

Proof. Let G and H be two graphs and assume without loss of generality V(G) = V(H) =: V. Let Ω be the set of all two-element subsets of V and let Γ ≤ Sym(Ω) be the natural action of Sym(V) on the two-element subsets of V. Let x: Ω → {0, 1} with x(vw) = 1 if and only if vw ∈ E(G). Similarly define y: Ω → {0, 1} with y(vw) = 1 if and only if vw ∈ E(H). Let ϕ: Sym(V) → Γ be the natural homomorphism. Then γ : G ≅ H if and only if γ^ϕ : x ≅ y.

The main advantage of the String Isomorphism Problem is that it naturally allows for algorithmic approaches based on group-theoretic techniques. An excellent example was essentially already given in the previous section: if all orbits of the input group have size at most s then the problem can be solved in time f(s)n^c for some function f and a constant c. In the following the group-theoretic approaches to the String Isomorphism Problem and Graph Isomorphism Problem are exploited in more depth.
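The encoding used in the proof of Theorem 5.2.3 is easy to make concrete. The sketch below represents a graph on the vertex set {0, . . . , n − 1} as a 0/1-string indexed by two-element subsets, applies a vertex permutation through its induced action on pairs, and then enumerates all of Sym(V) by brute force. The enumeration is exponential and only meant to illustrate the definitions, not the reduction's use in an algorithm; function names are illustrative.

```python
from itertools import combinations, permutations


def graph_to_string(n, edges):
    """String x over the 2-subsets of {0..n-1} with x(vw) = 1 iff vw is an edge."""
    edge_set = {frozenset(e) for e in edges}
    return {
        frozenset(pair): int(frozenset(pair) in edge_set)
        for pair in combinations(range(n), 2)
    }


def apply_permutation(x, gamma):
    """x^gamma under the induced action on pairs: x^gamma(alpha^gamma) = x(alpha)."""
    return {frozenset(gamma[v] for v in pair): value for pair, value in x.items()}


def graph_isomorphisms(n, edges_g, edges_h):
    """All gamma in Sym(V) with x^gamma = y, found by exhaustive search."""
    x = graph_to_string(n, edges_g)
    y = graph_to_string(n, edges_h)
    return [
        gamma
        for gamma in permutations(range(n))
        if apply_permutation(x, gamma) == y
    ]
```

For example, a path 0 - 1 - 2 and a path with middle vertex 0 are related by exactly the two permutations that send vertex 1 to vertex 0.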

5.2.3 Recursion Mechanisms

The foundation for the group-theoretic approaches lies in two recursion mechanisms first exploited by Luks [106]. As before let x, y: Ω → Σ be two strings. For a set of permutations K ⊆ Sym(Ω) and a window W ⊆ Ω define

Iso_K^W(x, y) := {γ ∈ K | ∀α ∈ W : x(α) = y(α^γ)}. (5.3)

In this thesis, the set K is always a coset, i.e., K = Γγ for some group Γ ≤ Sym(Ω) and a permutation γ ∈ Sym(Ω), and the set W is Γ-invariant. In this case it can be shown that Iso_K^W(x, y) is either empty or a coset of the group Aut_Γ^W(x) := Iso_Γ^W(x, x). Hence, the set Iso_K^W(x, y) can be represented by a generating set for Aut_Γ^W(x) and a single permutation γ ∈ Iso_K^W(x, y). Moreover, for K = Γγ, it holds that

Iso_{Γγ}^W(x, y) = Iso_Γ^W(x, y^{γ^{−1}})γ. (5.4)

Using this identity, it is possible to restrict to the case where K is actually a group.

With these definitions we can now formulate the two recursion mechanisms introduced by Luks [106]. For the first type of recursion suppose K = Γ ≤ Sym(Ω) is not transitive on W and let W1, . . . , Wℓ be the orbits of Γ[W]. Then the strings can be processed orbit by orbit as described in Algorithm 3. This type of recursion follows the same ideas that are presented in the previous section for computing the automorphism group of graphs of bounded color class size. This recursion mechanism is referred to as orbit-by-orbit processing.

Algorithm 3: Orbit-by-Orbit processing
Input : Strings x, y: Ω → Σ, a group Γ ≤ Sym(Ω), and a Γ-invariant set W ⊆ Ω such that Γ[W] is not transitive.
Output: Iso_Γ^W(x, y)

compute orbits W1, . . . , Wℓ of Γ[W]
K := Γ
for i = 1, . . . , ℓ do
    K := Iso_K^{Wi}(x, y)
end
return K

The set Iso_K^{Wi}(x, y) can be computed making one recursive call to the String Isomorphism Problem over domain size ni := |Wi|. Indeed, using Equation (5.4), it can be assumed that K is a group and Wi is K-invariant. Then

Iso_K^{Wi}(x, y) = {γ ∈ K | γ[Wi] ∈ Iso_{K[Wi]}(x[Wi], y[Wi])} (5.5)

where x[Wi] denotes the induced substring of x on the set Wi, i.e., x[Wi]: Wi → Σ : α ↦ x(α) (the string y[Wi] is defined analogously). So overall, if Γ is not transitive, the set Iso_Γ^W(x, y) can be computed making ℓ recursive calls over window size ni = |Wi|, i ∈ [ℓ].

For the second recursion mechanism let ∆ ≤ Γ and let T be a transversal for ∆ in Γ. Then

Iso_Γ^W(x, y) = ⋃_{δ∈T} Iso_{∆δ}^W(x, y) = ⋃_{δ∈T} Iso_∆^W(x, y^{δ^{−1}})δ. (5.6)

Luks applied this type of recursion when Γ is transitive (on the window W), ℬ is a minimal block system for Γ, and ∆ = Γ_(ℬ). In this case Γ[ℬ] is a primitive group. Let t = |Γ[ℬ]| be the size of a transversal for ∆ in Γ. Note that ∆ is not transitive (on the window W). Indeed, each orbit of ∆ has size at most n/b where b = |ℬ|. So by combining both types of recursion the computation of Iso_Γ^W(x, y) is reduced to t · b many instances of the String Isomorphism Problem over window size |W|/b. This specific combination of types of recursion is referred to as standard Luks reduction. Observe that the time complexity of standard Luks reduction is determined by the size of the primitive group Γ[ℬ].

Overall, Luks's algorithm is formulated in Algorithm 4. The running time of this algorithm heavily depends on the size of the primitive groups involved in the computation. Luks designed the algorithm to solve the String Isomorphism Problem for Γ̂_d-groups. For fixed d ∈ N, the algorithm runs in polynomial time for groups Γ ∈ Γ̂_d since the size of primitive Γ̂_d-groups is polynomially bounded in the degree of the group.
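The coset decomposition behind the second recursion mechanism can be checked by brute force on tiny examples. The sketch below stores groups as explicit sets of permutations (tuples p with p[a] the image of a, products read left to right), which is only feasible for very small degree and is not how the actual algorithms represent groups. It computes Iso_Γ^W(x, y) once directly and once as a union over a transversal, shifting y by each coset representative. All names are illustrative.

```python
from itertools import permutations


def compose(g, h):
    """Product gh in the right-action convention: apply g first, then h."""
    return tuple(h[a] for a in g)


def iso_set(K, x, y, W):
    """{g in K | x(a) = y(a^g) for all a in W}."""
    return {g for g in K if all(x[a] == y[g[a]] for a in W)}


def shift(y, delta):
    """The string y^{delta^{-1}}, i.e. b -> y(b^delta)."""
    return tuple(y[delta[b]] for b in range(len(y)))


def right_transversal(subgroup, group):
    """One representative per right coset (subgroup * g); the first is the identity."""
    reps, covered = [], set()
    for g in sorted(group):
        if g not in covered:
            reps.append(g)
            covered |= {compose(d, g) for d in subgroup}
    return reps


def iso_via_cosets(subgroup, group, x, y, W):
    """Union over the transversal of Iso_subgroup^W(x, y^{delta^{-1}}) * delta."""
    result = set()
    for delta in right_transversal(subgroup, group):
        part = iso_set(subgroup, x, shift(y, delta), W)
        result |= {compose(rho, delta) for rho in part}
    return result


if __name__ == "__main__":
    gamma_group = set(permutations(range(4)))
    delta_group = {g for g in gamma_group if {g[0], g[1]} == {0, 1}}
    print(iso_via_cosets(delta_group, gamma_group, "aabb", "abab", range(4)))
```

In the example the subgroup has index 6 in the symmetric group on four points, so six shifted subproblems are solved and their results are recombined.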

Before analyzing the structure and size of primitive Γ̂_d-groups (and the running time of Luks's algorithm) in more detail, first the significance of Γ̂_d-groups is discussed.

Algorithm 4: Luks's Algorithm: StringIso(x, y, Γ, γ, W)
Input : Strings x, y: Ω → Σ, a group Γ ≤ Sym(Ω), a permutation γ ∈ Sym(Ω), and a Γ-invariant set W ⊆ Ω
Output: {δ ∈ Γγ | ∀α ∈ W : x(α) = y(α^δ)}

if γ ≠ 1 then
    return StringIso(x, y^{γ^{−1}}, Γ, 1, W)γ    /* Equation (5.4) */
end
if |W| = 1 then
    if x(α) = y(α) where W = {α} then
        return Γ
    else
        return ∅
    end
end
if Γ is not transitive on W then
    compute orbit W′ ⊆ W
    return StringIso(x, y, StringIso(x, y, Γ, 1, W′), W \ W′)
end
compute minimal block system ℬ of the action of Γ on W
compute ∆ := Γ_(ℬ)
compute transversal T of ∆ in Γ
return ⋃_{δ∈T} StringIso(x, y, ∆, δ, W)

5.3 Bounded-Degree Graphs and Groups with Restricted Composition Factors

The class of Γ̂_d-groups plays a vital role for studying the Graph Isomorphism Problem for graphs of bounded degree. More precisely, Γ̂_d-groups appear naturally in automorphism groups of graphs of degree at most d and, more importantly, the Graph Isomorphism Problem for graphs of maximum degree d can be reduced to the String Isomorphism Problem for Γ̂_d-groups.

Theorem 5.3.1 (Luks [106]). Let G be a connected graph of maximum degree d and suppose v0 ∈ V(G). Then Aut(G, v0) ∈ Γ̂_d.

Proof. For i ≥ 0 define Xi := {w ∈ V(G) | distG(v0, w) ≤ i} and Yi := {w ∈ V(G) | distG(v0, w) = i}. We prove by induction on i ≥ 0 that Aut(G[Xi], v0) ∈ Γ̂_d. For i = 0 the graph G[Xi] only consists of one vertex and hence, Aut(G[Xi], v0) ∈ Γ̂_d.

For the inductive step suppose i ≥ 0 and let Γ := Aut(G[Xi+1], v0) ≤ Sym(Xi+1). Since v0 is fixed we get that Xi^γ = Xi for every γ ∈ Γ. Now consider the homomorphism ϕ: Γ → Sym(Xi) where each permutation is restricted to Xi. By Lemma 5.1.9 it now suffices to prove that ker(ϕ) ∈ Γ̂_d and Γ^ϕ ∈ Γ̂_d. But Γ^ϕ ≤ Aut(G[Xi], v0) and thus, Γ^ϕ ∈ Γ̂_d by the induction hypothesis and Lemma 5.1.8.

On the other hand, for v, w ∈ Yi+1, define v ∼ w if N(v) ∩ Xi = N(w) ∩ Xi. Clearly, ∼ defines an equivalence relation on Yi+1 and each equivalence class has size at most d since G has maximum degree d. Moreover, every orbit of ker(ϕ) = Γ_(Xi) is contained in an equivalence class of ∼, because a permutation fixing Xi pointwise preserves the neighborhood of a vertex within Xi. Hence, every orbit of ker(ϕ) has size at most d and ker(ϕ) ∈ Γ̂_d by Corollary 5.1.10.

To finish the proof note that G = G[Xn] and consequently, Aut(G, v0) = Aut(G[Xn], v0) ∈ Γ̂_d.
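The combinatorial objects used in the proof, namely the distance layers around v0 and the classes of vertices on the next layer sharing the same neighborhood in the ball, can be computed directly. The following sketch assumes a connected graph given as a dictionary from vertices to lists of neighbors; the function names are illustrative. It only computes the layers and classes and makes no attempt to compute automorphism groups.

```python
from collections import deque


def distance_layers(adjacency, root):
    """Layers Y_0, Y_1, ... of vertices at distance exactly i from root."""
    dist = {root: 0}
    queue = deque([root])
    while queue:
        v = queue.popleft()
        for w in adjacency[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    layers = [[] for _ in range(max(dist.values()) + 1)]
    for v, d in dist.items():
        layers[d].append(v)
    return layers


def neighbourhood_classes(adjacency, layers, i):
    """Classes of vertices in Y_{i+1} with equal neighbourhood inside X_i."""
    ball = set().union(*layers[: i + 1])  # X_i = Y_0 ∪ ... ∪ Y_i
    classes = {}
    for v in layers[i + 1]:
        key = frozenset(adjacency[v]) & ball
        classes.setdefault(key, []).append(v)
    return list(classes.values())


if __name__ == "__main__":
    # Cycle on six vertices, maximum degree d = 2.
    cycle = {v: [(v - 1) % 6, (v + 1) % 6] for v in range(6)}
    layers = distance_layers(cycle, 0)
    for i in range(len(layers) - 1):
        print(i, neighbourhood_classes(cycle, layers, i))
```

In a graph of maximum degree d every class has at most d elements, since all vertices of a class share a common neighbor in the ball.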

Actually, one can even prove a slightly stronger statement.

Proposition 5.3.2 (Luks [106]). Let G be a connected graph of maximum degree d and suppose v0v1 ∈ E(G). Then Aut(G, v0, v1) ∈ Γbd−1.

The main intuition is that each vertex v ∈ V (G), v 6= v0, has at most d − 1 neighbors on the next level, since at least one neighbor must be on the previous level. Hence, the equivalence classes of the relation ∼ defined on Yi+1 have size at most d − 1 for all i ≥ 1. The only problem arises for i = 0 since v0 may have d neighbors on the next level. This problem can be resolved by individualizing one of the neighbors of v0. Also, it turns out that the last proposition is optimal in the sense that every group Γ ∈ Γbd−1 can be realized as the automorphism group of such a graph.

Proposition 5.3.3 (Babai, Lovász [20]). Let Γ ∈ Γbd−1. Then there is a connected graph G of maximum degree d and an edge v0v1 ∈ E(G) such that Aut(G, v0, v1) ≅ Γ.

The structural insights into the automorphism groups of graphs of maximum degree d can also be turned into a reduction from the Graph Isomorphism Problem for graphs of maximum degree d to the String Isomorphism Problem for Γbd-groups. A reduction of this type was first given by Luks [106]. While the reduction presented in this work runs in polynomial time for every fixed d, the exponent of the polynomial depends linearly on d. Later, Babai and Luks [21] presented a modified reduction that removes this dependence on d.

Figure 5.1: Visualization for computing Aut(G[Xi], v0) given Aut(G[Xi−1], v0). (Figure omitted; its labels are the set Xi−1, the edge set Ei, and the level Yi.)

Theorem 5.3.4 (Babai, Luks [21]). There is a polynomial-time Turing-reduction from the Isomorphism Problem for graphs of maximum degree d to the String Isomorphism Problem for Γbd-groups.

Proof. Let G1 and G2 be two graphs of maximum degree d and suppose d ≥ 3. First, the isomorphism problem for G1 and G2 is reduced to the problem of computing a generating set for the automorphism group Aut(G, v0) of a connected graph G where v0 ∈ V (G). Without loss of generality it can be assumed that G1 and G2 are connected (otherwise the components are treated independently). Let e1 = v1w1 ∈ E(G1) be an arbitrary edge of the first graph. For each e2 = v2w2 ∈ E(G2) it is checked whether there is an isomorphism ϕ: G1 → G2 such that ϕ(v1)ϕ(w1) = v2w2 = e2. For this, let G be the graph with

V (G) = V (G1) ⊎ V (G2) ⊎ {v0, e1, e2} and

E(G) = (E(G1) \ {e1}) ∪ (E(G2) \ {e2}) ∪ {v0e1, v0e2, e1v1, e1w1, e2v2, e2w2}.

Then there is an isomorphism ϕ: G1 → G2 with ϕ(v1)ϕ(w1) = e2 if and only if there is some γ ∈ Aut(G, v0) such that e1^γ = e2. Given a generating set for Aut(G, v0), the latter can be checked in polynomial time (cf. Theorem 5.1.4).

So it remains to consider the task of computing a generating set for Aut(G, v0) where G is a connected graph of maximum degree d. For i ≥ 0 define Xi := {w ∈ V (G) | distG(v0, w) ≤ i} and Yi := {w ∈ V (G) | distG(v0, w) = i}. The reduction iteratively computes generating sets for Aut(G[Xi], v0) for all 0 ≤ i ≤ n using an oracle to the String Isomorphism Problem for Γbd-groups. Since G[Xn] = G this gives the desired result. For i = 0 the graph G[Xi] only consists of the vertex v0 and thus, it is trivial to compute a generating set for Aut(G[Xi], v0). So let i ≥ 1 and suppose the algorithm has already computed a generating set for Γi−1 := Aut(G[Xi−1], v0). First note that Γi−1 ∈ Γbd by Theorem 5.3.1. Let Γi := Aut(G[Xi], v0). To compute Γi the algorithm proceeds in three steps.

For the first step let di(v) := |N(v) ∩ Xi| for every v ∈ Xi−1. Observe that di(v) ≤ d for all v ∈ Xi−1. Let ∆i := Aut_{Γi−1}(xi) where

xi : Xi−1 → {0, . . . , d}: v ↦ di(v).

The algorithm can compute ∆i using the oracle for the String Isomorphism Problem. Note that Γi[Xi−1] ≤ ∆i.

For the second step let

Ei := {e ∈ E(G) | e ∩ Yi−1 ≠ ∅ ∧ e ∩ Yi ≠ ∅}

(i.e., the set of edges between level i − 1 and level i). The goal in this step is to compute the automorphism group of Hi := (Xi, E(G[Xi−1]) ∪ Ei). Towards this end, first consider the graph

Hi′ := (Xi−1 ∪ Ei, E(G[Xi−1]) ∪ {ve | e ∈ Ei, v ∈ Xi−1, v ∈ e}).

By extending the group ∆i it is easy to compute ∆i′ := Aut(Hi′, v0). Note that ∆i′ ∈ Γbd since Hi′ has maximum degree d. Note that the group ∆i′ may permute the edges from the set Ei independently of their endpoints in Yi. This can be resolved as follows. For e, e′ define e ∼ e′ if e ∩ e′ ∩ Yi ≠ ∅. Note that ∼ is an equivalence relation that groups the edges in the set Ei according to their endpoints in Yi. Now consider the string

yi : Ei^2 → {0, 1}: (e, e′) ↦ 1 if e ∼ e′, and 0 otherwise.

Using the oracle to the String Isomorphism Problem the algorithm can compute the group ∆i′′ ≤ ∆i′ that respects the string yi (considering the natural action of ∆i′ on ordered pairs restricted to Ei^2). Grouping the edges from the set Ei according to their endpoints in Yi it is now easy to compute Aut(Hi, v0).

In the final step it remains to compute Γi from Aut(Hi, v0). Note that Γi ≤ Aut(Hi, v0). Consider the string

zi : Xi^2 → {0, 1}: (v, w) ↦ 1 if vw ∈ E(G), and 0 otherwise.

Then Γi = Aut_{Aut(Hi,v0)}(zi) (where Aut(Hi, v0) acts on Xi^2 in its natural induced action on pairs). Hence, Γi can be computed making another call to the oracle for the String Isomorphism Problem.
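The first part of the proof builds a single connected graph G from G1, G2 and the two chosen edges. As an illustration only, the following Python sketch constructs this graph; the function name `build_gadget`, the tagging of vertices by (1, v) and (2, v) to make the union disjoint, and the labels "v0", "e1", "e2" for the three new vertices are assumptions of this example.

```python
def build_gadget(edges1, edges2, e1, e2):
    """Build G from G1, G2 and edges e1 in E(G1), e2 in E(G2).

    Edges are given as pairs of vertices. The chosen edges e1 and e2 are
    removed and replaced by new vertices "e1" and "e2", which are joined
    to the former endpoints and to the new root vertex "v0".
    """
    def tag(k, edge):
        # Tag vertices with the graph index to obtain a disjoint union.
        return frozenset((k, v) for v in edge)

    v0, s1, s2 = "v0", "e1", "e2"
    edges = {tag(1, e) for e in edges1 if frozenset(e) != frozenset(e1)}
    edges |= {tag(2, e) for e in edges2 if frozenset(e) != frozenset(e2)}
    edges |= {frozenset({v0, s1}), frozenset({v0, s2})}
    edges |= {frozenset({s1, (1, v)}) for v in e1}
    edges |= {frozenset({s2, (2, v)}) for v in e2}
    # For connected input graphs every vertex lies on some edge of G.
    vertices = {x for e in edges for x in e}
    return vertices, edges

# Hypothetical example: two triangles.
triangle = [(0, 1), (1, 2), (0, 2)]
V, E = build_gadget(triangle, triangle, (0, 1), (0, 1))
print(len(V), len(E))
```

The sketch only covers the construction of G. Deciding whether some γ ∈ Aut(G, v0) maps e1 to e2 still requires a generating set for Aut(G, v0), which is what the remainder of the proof computes.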

Remark 5.3.5. Observe that for the proof to work it is only required that |Xi ∩ N(v)| ≤ d for every v ∈ Xi−1 and every i ∈ [n]. This is a slightly weaker requirement than saying G has maximum degree d since a vertex v ∈ Xi−1 may also have neighbors in the set Xi−1 ∪ Xi−2. Also, in case the edges of the graphs are colored, it suffices that each vertex has at most d incident edges of the same color.

5.4 Primitive Groups

In the last two sections we have introduced Luks’s algorithm tackling the String Isomorphism Problem for Γbd-groups and built a connection to the isomorphism problem for bounded degree graphs, showing that this problem polynomial-time reduces to the String Isomorphism Problem for Γbd-groups. In order to analyze the running time of Luks’s algorithm the crucial step is to analyze the size of the primitive permutation groups involved. Hence, the goal of this section is to understand the structure of primitive Γbd-groups, ultimately resulting in a characterization theorem showing that such groups are small or have a very specific structure. Indeed, this characterization (which was first presented in [69]) not only allows us to bound the running time of Luks’s algorithm, but it also serves as a basis for giving a significantly faster algorithm for the String Isomorphism Problem for Γbd-groups in the next chapter. The analysis of primitive Γbd-groups is based on the well-known O’Nan-Scott Theorem which classifies primitive permutation groups into five different types.

5.4.1 The O’Nan-Scott Theorem

Let Γ be a primitive permutation group acting on a set Ω. As usual, n denotes the size of Ω. By the well-known O’Nan-Scott Theorem (see, for example, [45]) the group Γ has to be one of the five types below. In the description of the types, the socle of the group Γ plays a central role.

Definition 5.4.1 (Socle). The socle of Γ, denoted by Soc(Γ), is the subgroup generated by all minimal normal subgroups of Γ.

I. Affine Groups. In this case there is a vector space V over a field of prime order p such that Γ is permutationally equivalent to a group ∆ that satisfies V^+ ≤ ∆ ≤ AGL(V), where V^+ is the additive group of the vector space V and AGL(V) denotes the affine general linear group for the vector space V. The socle N := Soc(Γ) of the group Γ is a transitive Abelian group (i.e., Zp^k for the prime p and an integer k) and can be identified with V^+. Furthermore, the stabilizer ∆0 of the 0-vector is an irreducible linear group (i.e., it does not have an invariant subspace).

II. Almost Simple Groups. In this case Soc(Γ) = T is a non-abelian simple group and T ≤ Γ ≤ Aut(T ).

Example 5.4.2. An important class of examples of almost simple primitive permutation groups are Johnson groups. For a finite set Ω let Alt(Ω) be the alternating group defined on the domain Ω. Also, let Am := Alt([m]). Recall that Sm = Sym([m]).

For t ≤ m/2 let Sm^(t) ≤ Sym(([m] choose t)) denote the natural induced action of Sm on the set ([m] choose t) of all t-element subsets of [m]. Similarly, let Am^(t) ≤ Sym(([m] choose t)) denote the natural induced action of Am on the set ([m] choose t).

For every m ≥ 2 and t ≤ m/2 the groups Sm^(t) and Am^(t) are primitive groups of type II and form the class of the Johnson groups.

III. Simple Diagonal Action. In this case Soc(Γ) ≅ T1 × · · · × Tk where all Ti are isomorphic to some non-abelian simple group T. Additionally, n = |T|^(k−1), and the stabilizer of some point α ∈ Ω is a diagonal subgroup ∆ ≤ T1 × · · · × Tk.

IV. Product Action. In this case the set Ω can be identified with the k-tuples of some set M. In particular n = |M|^k. Furthermore there is some primitive group P ≤ Sym(M) of Type II or III and a transitive group ∆ ≤ Sk such that Γ ≤ P ≀ ∆. The group Γ acts in the natural product action of the wreath product (see Definition 5.1.2). The socle of Γ is Soc(Γ) = T^k where T = Soc(P).

Example 5.4.3. Let ∆ = Sk and P = Sm for k ≥ 2 and m ≥ 3. Then P ≀ ∆ acts on [m]^k in the product action of the wreath product (see Definition 5.1.2). This action is primitive.

V. Twisted Wreath Product Action. In this case there is a transitive permutation group ∆ ≤ Sk and a non-abelian simple group T such that Γ = Λ ⋊ ∆ (see footnote 2) where Λ is isomorphic to T^k. Furthermore |Ω| = |T|^k and Λ acts regularly on Ω.

We analyze the structure of primitive Γbd-groups according to the distinction into these five types. For each of them we will either be interested in a structural description or a bound on the size. To obtain such a bound we use the existence of small bases.

² Here, Λ ⋊ ∆ denotes the semi-direct product of Λ and ∆. For details I refer to [136].

Definition 5.4.4 (Base). Let Γ ≤ Sym(Ω) be a permutation group. A subset B ⊆ Ω is a base for Γ if Γ_(B) = {id}. The minimum base size is b(Γ) := min{|B| | B ⊆ Ω: Γ_(B) = {id}}. The minimum base size is related to the order of the group by the inequalities

2^b(Γ) ≤ |Γ| ≤ n^b(Γ). (5.7)
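For very small groups, Inequality (5.7) can be checked by exhaustive search. The following Python sketch is a brute-force illustration only (it enumerates all group elements and is exponential in general); permutations are represented as tuples of images, and the helper names `compose`, `generate_group` and `min_base_size` are assumptions of this example.

```python
from itertools import combinations

def compose(p, q):
    """Apply p first, then q (permutations as tuples of images)."""
    return tuple(q[i] for i in p)

def generate_group(gens):
    """Enumerate the group generated by gens by closing under products."""
    identity = tuple(range(len(gens[0])))
    group = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def min_base_size(group, n):
    """Smallest |B| such that only the identity fixes B pointwise."""
    for size in range(n + 1):
        for B in combinations(range(n), size):
            fixing = [g for g in group if all(g[b] == b for b in B)]
            if len(fixing) == 1:
                return size

# Hypothetical example: the symmetric group S4 on {0, 1, 2, 3}.
S4 = generate_group([(1, 0, 2, 3), (1, 2, 3, 0)])
b = min_base_size(S4, 4)
print(b, 2 ** b <= len(S4) <= 4 ** b)
```

The upper bound reflects that a group element is determined by the images of the base points; the lower bound reflects that each point of a minimum base at least halves the order of the corresponding pointwise stabilizer.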

5.4.2 Affine Groups

In the following we analyze the structure of large primitive Γbd-groups. This analysis follows the classification of primitive groups into the five types described above and relies on the vast amount of literature available on the structure and size of primitive permutation groups.

In case Γ is a primitive group of type I it holds that V^+ ≤ Γ ≤ AGL(V), where V = Fp^k and V^+ ≅ Zp^k is the additive group of V, such that the point stabilizer Γ0 of the 0-vector is irreducible. We say the group Γ0 ≤ GL(k, p) acts primitively as a linear group if it does not preserve any direct sum decomposition V = V1 ⊕ · · · ⊕ Vℓ of the underlying vector space V = Fp^k. The analysis of primitive affine Γbd-groups is based on a characterization of [102, 103]. The characterization involves the quasi-simple classical groups SLr(q′), SUr(q′), Spr(q′) and Ωr(q′). A group is quasi-simple if it is equal to its own commutator subgroup³ (i.e., perfect) and it is simple modulo its center⁴. With finitely many exceptions, the mentioned groups are indeed quasi-simple [30, Proposition 1.10.3]. The following theorem is a direct consequence of [102, 103] analyzing the base size of primitive groups of Type I that act primitively as a linear group.

Theorem 5.4.5. There are absolute constants c1, c2, c3 ∈ N such that the following holds. Let Γ ≤ Sym(Ω) be a primitive group of Type I such that Γ0 acts primitively as a linear group. Then

(I) b(Γ) ≤ c1, or

(II) Γ contains a quasi-simple classical group of rank k, more precisely SLk(q′), SUk(q′), Spk(q′) or Ωk(q′), and b(Γ) ≤ c2k + c3, or

(III) Γ contains an alternating group Ak and b(Γ) ≤ c2 log k + c3.

Proof. Let Γ ≤ Sym(Ω) be a primitive group of Type I such that Γ0 acts primitively as a linear group. Now [103, Theorem 1] (which applies to Γ, see the comment after that theorem) states that b(Γ) ≤ C (for some absolute constant C) or the generalized Fitting group F*(Γ0) ≤ Γ contains

∏_{i=1}^{s} Alt([mi]) · ∏_{i=1}^{t} Cl_di(qi)^(∞)

for some integers m1, . . . , ms, q1, . . . , qt where s + t ≥ 1. Here Cl_di(qi) denotes the normalizer of a quasi-simple classical group, namely SL_di(q′), SU_di(q′), Sp_di(q′) or Ω_di(q′). Furthermore Cl_di(qi)^(∞) is the last group of the derived series⁵ of Cl_di(qi). Only finitely many of the groups SL_di(q′), SU_di(q′), Sp_di(q′) or Ω_di(q′) are not perfect. Thus, by referring to Case (I) for the finitely many exceptions, it can be assumed Cl_di(qi)^(∞) contains SL_di(q′), SU_di(q′), Sp_di(q′) or Ω_di(q′).

Let m* := max{mi | i ∈ [s]} and d* := max{di | i ∈ [t]}. Moreover [103, Proposition 2] states that

(i) b(Γ0) ≤ C,

(ii) b(Γ0) ≤ 9d* + 22 if m* ≤ d*, or

(iii) b(Γ0) ≤ 3 log_p m* + 22 if m* > d*.

Since b(Γ) ≤ b(Γ0) + 1 this completes the proof of the theorem.

³ The commutator subgroup of a group Γ is the subgroup [Γ, Γ] generated by all elements [γ, δ] := γ^−1 δ^−1 γ δ for γ, δ ∈ Γ.
⁴ The center of a group Γ is the normal subgroup Z(Γ) := {γ ∈ Γ | ∀δ ∈ Γ: δγ = γδ}.
⁵ The derived series of a group Γ is the sequence of groups with Γ^(0) := Γ and Γ^(n+1) := [Γ^(n), Γ^(n)]. The sequence terminates as soon as Γ^(n) = Γ^(n+1).

Lemma 5.4.6 ([38],[92]). Let Γ ∈ Γbd be a simple group of Lie type of rank k or one of the quasi-simple classical groups SLk(q′), SUk(q′), Spk(q′) or Ωk(q′). Then k = O(log d).

Proof. By the definition of Γbd (see Definition 5.1.7), a simple group is in the class Γbd if and only if it is isomorphic to a subgroup of Sd. To prove the lemma it thus suffices to show that the smallest d(k) for which Sd(k) contains one of the groups in question is exponential in k. Cooperstein [38] lists the minimum degree of a permutation representation of the mentioned quasi-simple classical groups. They are all exponential in the rank k. In [92] the minimum degree of a permutation representation is listed for all simple groups of Lie type. Likewise they are exponential in k.

Theorem 5.4.7. Let Γ ∈ Γbd be a primitive permutation group of degree n of Type I. Then b(Γ) = O(log d) and therefore |Γ| = n^O(log d).

Proof. Let Γ ∈ Γbd be a primitive permutation group of Type I. This implies that the point stabilizer Γ0 ≤ GL(k, p), where p^k = n, is an irreducible linear group. First observe that it suffices to prove b(Γ0) = O(log d) because b(Γ) ≤ b(Γ0) + 1. If Γ0 acts primitively as a linear group this follows by assembling Theorem 5.4.5 and Lemma 5.4.6. Indeed, if Γ0 satisfies Case (I) of Theorem 5.4.5 the claim is obvious. For Case (II) it holds that b(Γ) ≤ c2k + c3 = O(log d) by Lemma 5.4.6 and for Case (III) we know that b(Γ) ≤ c2 log k + c3 = O(log d).

Now suppose Γ0 is irreducible but does not act primitively as a linear group. Then Γ0 can be written as P ≀ ∆ for some primitive linear group P ≤ GL(k/ℓ, p) and transitive group ∆ that permutes ℓ subspaces V1, . . . , Vℓ of V. By [58, Lemma 4.2 (a)] there is a set B1 of O(log d) many points in V such that (Γ0)_(B1) ≤ P^ℓ. (We now follow the techniques from [58, Section 6].) Since P acts primitively as a linear group, by the first part of the proof, there is a base {v1, . . . , vt} of P of size t = O(log d). Now let αi be the point (vi, vi, . . . , vi) in V1 × V2 × · · · × Vℓ. Then B2 = {α1, . . . , αt} is a base of P^ℓ. But this means B1 ∪ B2 is a base for Γ0 of size O(log d).

5.4.3 Non-Affine Groups

Having analyzed primitive Γbd-groups of affine type we continue to analyze the remaining four types of primitive groups. As before, the analysis is based on the available literature on the structure of primitive groups. In order to analyze primitive groups of type II let Out(Γ) denote the outer automorphism group of a group Γ. It is well-known that |Out(Sm)| = 1 and |Out(Am)| = 2 for all m > 6.

Lemma 5.4.8. Let Γ be a non-abelian simple group. Then |Out(Γ)| = O(log |Γ|).

Proof. For finite simple groups of Lie type, this follows by inspecting the Tables 5 and 6 in the Atlas of Finite Groups [37]. In Table 5 the size of the outer automorphism group is given as the product d · f · g which, according to Table 6, is logarithmic in the size of the group for each simple group of Lie type. For alternating simple groups the statement is obvious. The values for the sporadic groups disappear in the O-notation.

Recall that Am^(t) denotes the action of the alternating group Am on the set of t-element subsets of [m] (see Example 5.4.2).

Theorem 5.4.9 (Liebeck [100]). Let Γ ≤ Sym(Ω) be a primitive group and suppose N = Soc(Γ) is simple. Then one of the following holds:

1. N is permutationally equivalent to Am^(t) for some m ∈ N and t ≤ m/2,

2. N is permutationally equivalent to Am acting on the set of equipartitions of [m] into subsets of size b (for some 1 < b < m),

3. N is a classical simple group acting on an orbit of subspaces of the natural module or pairs of subspaces of complementary dimension, or

4. |Γ| ≤ n^9.

We shall not exploit the structure of the action of N in Case 3 of the theorem, and rather only use that N is a simple group of Lie type.

Lemma 5.4.10. Let Γ ≤ Sym(Ω) be a primitive Γbd-group of Type II. Let N := Soc(Γ). Then one of the following holds:

1. N is permutationally equivalent to Am^(t) for some m ≤ d and t ≤ m/2 and |Γ: N| ≤ 2, or

2. |Γ| = n^O(log d).

Proof. The proof is based on Theorem 5.4.9. First suppose N is permutationally equivalent to Am^(t) for some m ∈ N and t ≤ m/2. Then m ≤ d since N ∈ Γbd by Lemma 5.1.8. Furthermore |Γ: N| ≤ |Out(N)| ≤ 2 since N is an alternating group (in case m ≤ 6 the second option is satisfied).

Next consider the case that N is permutationally equivalent to Am acting on partitions of [m] into subsets of size b. Again, m ≤ d and |Γ: N| ≤ 2 (or m ≤ 6 in which case the second option is satisfied). Also n = m!/((b!)^a · a!) where a · b = m. Using Stirling’s approximation it can be calculated that n = 2^Ω(m). Hence, |N| ≤ m^m = n^O(log m) = n^O(log d) and consequently |Γ| = n^O(log d).

It remains to analyze the third case. It suffices to show that |N| = n^O(log d) since this implies |Γ| = |N| · |Γ: N| ≤ |N| · |Out(N)| = |N| · O(log |N|) = n^O(log d). Let ϕ: N → S_d′ be a minimal permutation representation of N (i.e., d′ is as small as possible). Clearly d′ ≤ n and moreover, d′ ≤ d since N is a simple Γbd-group (cf. Lemma 5.1.8). Moreover, being minimal, the action is faithful and primitive since N is simple. Not being an alternating group, the group N is not a Cameron group which means |N| ≤ (d′)^(1+log d′) = n^O(log d) [33, 109].

Lemma 5.4.11 (Gluck, Seress, Shalev [58]). Let Γ ∈ Γbd be a primitive group of Type III. Then b(Γ) ≤ 2ℓ + 1 and thus |Γ| ≤ n^(2ℓ+1) = n^O(log d) where ℓ := max{5, ⌈log d⌉}.

Remark on the proof. While not explicitly stated in [58], the Lemma is implicit from [58, Lemma 4.2 (c)] and the comment in Section 6 on Type III (a) in [58].

Lemma 5.4.12. Let Γ ≤ Sym(Ω) be a primitive Γbd-group of Type IV. Let N := Soc(Γ). Then one of the following holds:

1. Γ ≤ P ≀ ∆ is a wreath product in the product action of a transitive group ∆ ≤ Sk with ∆ ∈ Γbd and a primitive group P of Type II with socle T := Soc(P) permutationally equivalent to Am^(t) for some m ≤ d and t ≤ m/2, and N is isomorphic to T^k with |Γ: N| ≤ n^(1+log d), or

2. |Γ| = n^O(log d).

Proof. Suppose Γ ∈ Γbd is a primitive group of Type IV. Then Γ ≤ P ≀ ∆ for some primitive group P ≤ Sym(M) and a transitive group ∆ ≤ Sk. First observe that both P ∈ Γbd and ∆ ∈ Γbd by Lemma 5.1.8. Let Λ = P^k. Then |Γ: Λ| = |∆| ≤ d^(k−1) ≤ 2^(k·log d) ≤ n^(log d) by Lemma 5.1.11. Moreover |Γ| ≤ n^(log d) · |P|^k. First suppose |P| = |M|^O(log d). Then |Γ| ≤ n^(log d) · (|M|^O(log d))^k = n^O(log d). Thus, by Lemma 5.4.11 and 5.4.10, it suffices to consider the case where P is a primitive group of Type II satisfying Part 1 of Lemma 5.4.10. This means the socle T := Soc(P) is permutationally equivalent to Am^(t) for some m ≤ d and t ≤ m/2 and |P : T| ≤ 2. Note that N = T^k ≤ P^k = Λ. Thus, |Γ: N| = |Γ: Λ| · |P : T|^k ≤ n^(log d) · 2^k ≤ n^(1+log d).

Lemma 5.4.13. Let Γ ∈ Γbd be a primitive group of Type V. Then |Γ| ≤ n^(1+log d).

Proof. For a primitive group Γ of Type V, a primitive twisted wreath product, there is a transitive group ∆ ≤ Sk and a non-abelian simple group T such that Γ ≅ T^k ⋊ ∆. Moreover, n = |Ω| = |T|^k which implies that k ≤ log(n). Note that ∆ ∈ Γbd by Lemma 5.1.8 and thus |∆| ≤ d^(k−1) by Lemma 5.1.11. Overall, |Γ| = |T^k| · |∆| = n · |∆| ≤ n · d^(k−1) ≤ n · d^(log n) = n^(1+log d).
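The last step of this computation rewrites d^(log n) as n^(log d). Written out, and assuming that log denotes the logarithm to base 2 (as suggested by the bound 2^(k·log d) used for Type IV), the chain reads as follows.

```latex
\begin{align*}
d^{\log n} &= \bigl(2^{\log d}\bigr)^{\log n}
            = 2^{\log d \cdot \log n}
            = \bigl(2^{\log n}\bigr)^{\log d}
            = n^{\log d}, \\
|\Gamma| &\le n \cdot d^{\,k-1}
          \le n \cdot d^{\log n}
          = n \cdot n^{\log d}
          = n^{1+\log d}.
\end{align*}
```

The middle inequality uses k − 1 ≤ log n, which follows from n = |T|^k and |T| ≥ 2.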

5.4.4 A Characterization Theorem

After analyzing the structure of primitive Γbd-groups for all five types the results can be combined into a structure theorem. For this two auxiliary lemmata are required.

Lemma 5.4.14. Let Γ ≤ Sym(Ω) be a transitive group and α ∈ Ω. Then

Bα := {β ∈ Ω | β^Γα = {β}}

forms a block of Γ.

Proof. Let R := {(α, β) ∈ Ω^2 | β^Γα = {β}}. First, it is shown that R is an equivalence relation. Clearly the relation R is reflexive and transitive. Suppose that (α, β) ∈ R. Then Γα ≤ Γβ. Moreover, |Γα| = |Γ|/|α^Γ| = |Γ|/|β^Γ| = |Γβ| since Γ is transitive. It follows that Γα = Γβ and thus, (β, α) ∈ R. So R is also symmetric and hence, R is an equivalence relation.

Now let (α, β) ∈ R and γ ∈ Γ. Then (α^γ, β^γ) ∈ R because Γ_(α^γ) = γ^−1 Γα γ. Thus, R is invariant under Γ and the partition into equivalence classes forms a block system for Γ.

Lemma 5.4.15. Let P ≤ Sym(Ω) be a non-regular primitive group and k ≥ 2. Let ℬ be a block system of P^k with its natural action on Ω^k (see Definition 5.1.1). Then there is some I ⊆ [k] such that

ℬ = {{(α1, . . . , αk) ∈ Ω^k | ∀i ∈ I : αi = βi} | (βi)i∈I ∈ Ω^|I|}.

Proof. Let B ∈ ℬ be a block and let I = {i ∈ [k] | |πi(B)| = 1} where πi(B) = {αi | (α1, . . . , αk) ∈ B}. For every i ∈ I suppose πi(B) = {βi}. It suffices to show that B = {(α1, . . . , αk) ∈ Ω^k | ∀i ∈ I : αi = βi}. Let j ∈ [k] \ I and let (α1, . . . , αk), (α1′, . . . , αk′) ∈ B such that αj ≠ αj′. Since P is non-regular and primitive there is some γ ∈ P_αj such that (αj′)^γ ≠ αj′ (see Lemma 5.4.14). Note that (α1′, . . . , αj−1′, (αj′)^γ, αj+1′, . . . , αk′) ∈ B. Let A := {α ∈ Ω | (α1′, . . . , αj−1′, α, αj+1′, . . . , αk′) ∈ B}. Since A forms a block of P and |A| ≥ 2 we get that A = Ω. This implies that B = {(α1, . . . , αk) ∈ Ω^k | ∀i ∈ I : αi = βi}.

Let Γ ≤ Sym(Ω) and let ℬ1, ℬ2 be two Γ-invariant partitions such that ℬ1 ≺ ℬ2. Consistent with the previous notation let Γ_B[ℬ1[B]] denote the natural induced action of Γ_B on the set ℬ1[B] for all B ∈ ℬ2.

Theorem 5.4.16. Let Γ ≤ Sym(Ω) be a primitive Γbd-group. Then one of the following holds:

1. |Γ| ≤ n^(c1 log d + c2) for some absolute constants c1, c2, or

2. for N := Soc(Γ) ⊴ Γ there is a sequence of partitions {Ω} = ℬ0 ≻ · · · ≻ ℬk = {{α} | α ∈ Ω} such that

(a) |Γ: N| ≤ n^(1+log d),

(b) ℬi is N-invariant for all i ∈ [k], and

(c) there are m ≤ d and t ≤ m/2 with m > 4 log s where s := (m choose t) such that for all i ∈ [k] and B ∈ ℬi−1 the group N_B[ℬi[B]] is permutationally equivalent to Am^(t).

Moreover, there is a polynomial-time algorithm that determines one of the options that is satisfied and in case of the second option computes N and the partitions ℬ0, . . . , ℬk.

Proof. First suppose Γ is a primitive group of Type I, III or V. Then Option 1 holds by Theorem 5.4.7, Lemma 5.4.11 and 5.4.13, respectively. So it remains to consider primitive groups of Type II and IV. Let N := Soc(Γ) be the socle of Γ.

Consider the case that Γ is a primitive group of Type II. Then, by Lemma 5.4.10, |Γ| = n^O(log d) or N is permutationally equivalent to Am^(t) for some m ≤ d and t ≤ m/2 and |Γ: N| ≤ 2. In case m ≤ 4 log (m choose t) = 4 log n, it holds that |N| ≤ m^m ≤ m^(4 log n) = n^(4 log m) ≤ n^(4 log d) and thus, |Γ| = n^O(log d). In case m > 4 log (m choose t) let ℬ0 := {Ω} and ℬ1 := {{α} | α ∈ Ω}.

Next assume Γ is of Type IV. By Lemma 5.4.12 it suffices to consider the case where Γ ≤ P ≀ ∆ is a wreath product in the product action for a transitive group ∆ ≤ Sk in Γbd and a group P ≤ Sym(M) of Type II with a socle T permutationally equivalent to Am^(t) for some m ≤ d and t ≤ m/2, and N is isomorphic to T^k with |Γ: N| ≤ n^(1+log d). Moreover, in case m ≤ 4 log |M| it holds that |T| ≤ m! ≤ |M|^(4 log m) and hence, |N| ≤ n^(4 log d). This implies |Γ| = n^O(log d). So assume m > 4 log |M| = 4 log (m choose t).

Observe that, since the wreath product is in the product action, an element h = (p1, . . . , pk) ∈ P^k acts on an element (m1, . . . , mk) ∈ M^k = Ω via (m1, . . . , mk)^h = (m1^p1, . . . , mk^pk). For i ∈ [k] define

ℬi := {{(m1, . . . , mk) ∈ M^k | ∀j ≤ i: mj = mj*} | m1*, . . . , mi* ∈ M}.

Clearly, ℬi is an N-invariant partition for all i ∈ [k]. Observe that N_B[ℬi[B]] is permutationally equivalent to T for all i ∈ [k] and B ∈ ℬi−1 which itself is permutationally equivalent to Am^(t).

So it remains to prove the algorithmic part of the theorem. First observe that |Γ| can be computed in polynomial time by Theorem 5.1.4, so Option 1 can be detected. Also note that the socle of a group is a normal subgroup and can be computed in polynomial time (see [87]). The algorithm now sets ℬ0 = {Ω}. To compute ℬi from ℬi−1 the algorithm picks an arbitrary block B ∈ ℬi−1 and computes a maximal block B′ within B, that is a block that is inclusionwise maximal with the property that B′ ⊊ B. Then ℬi := (B′)^N. Note that, up to permuting the coordinates, by Lemma 5.4.15 the block systems described above are the only block systems of N. Hence every sequence of block systems {Ω} = ℬ0 ≻ · · · ≻ ℬk = {{α} | α ∈ Ω} that cannot be extended has the desired properties.

Finally note that the algorithm is also correct for groups of Type II. Indeed, in this case N is primitive and thus, the algorithm produces the sequence ℬ0 := {Ω} and ℬ1 := {{α} | α ∈ Ω}.
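The partitions used for Type IV in the proof above can be illustrated on a toy instance. The following Python sketch builds, for a small set M and small k, the partition of M^k obtained by fixing the first i coordinates and checks refinement as well as invariance under a coordinatewise action. It is an illustration only: the function names are assumptions of this example, and an arbitrary tuple of permutations of M stands in for an element of N = T^k.

```python
from itertools import product

def prefix_partition(M, k, i):
    """Group the tuples of M^k by their first i coordinates."""
    blocks = {}
    for tup in product(M, repeat=k):
        blocks.setdefault(tup[:i], set()).add(tup)
    return [frozenset(b) for b in blocks.values()]

def refines(fine, coarse):
    """True if every block of `fine` lies inside some block of `coarse`."""
    return all(any(b <= c for c in coarse) for b in fine)

def act(h, tup):
    """Coordinatewise action of h = (p_1, ..., p_k), each p_j a dict on M."""
    return tuple(p[x] for p, x in zip(h, tup))

def is_invariant(partition, h):
    """True if h maps every block of the partition onto a block."""
    blocks = set(partition)
    return all(frozenset(act(h, t) for t in b) in blocks for b in partition)

# Hypothetical toy instance: M = {0, 1, 2} and k = 2.
M, k = range(3), 2
chain = [prefix_partition(M, k, i) for i in range(k + 1)]
h = ({0: 1, 1: 2, 2: 0}, {0: 0, 1: 2, 2: 1})
print([len(p) for p in chain], all(is_invariant(p, h) for p in chain))
```

The resulting chain starts with the single block M^k, ends with the partition into singletons, and has |M|^i blocks at step i.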

Besides giving a characterization of the large primitive Γbd-groups the results obtained in the previous subsections analyzing the structure and size of primitive Γbd-groups according to their types can also be used to give a simple proof of the following overall bound on the size of primitive Γbd-groups.

Theorem 5.4.17. There is a function f : N → N such that for every primitive permutation group Γ ∈ Γbd it holds that |Γ| ≤ n^f(d). Moreover, the function f can be chosen such that f(d) = O(d).

The first part of this theorem on the existence of the function f was first proved by Babai, Cameron and Pálfy [13] implying that Luks’s algorithm runs in polynomial time for every fixed number d ∈ N⁶. Later, it was observed that the function f can actually be chosen to be linear in d (see e.g. [101]).

Proof. First suppose Γ is a primitive group of Type I, III or V. Then the statement holds by Theorem 5.4.7, Lemma 5.4.11 and 5.4.13, respectively. Let N := Soc(Γ). Next suppose Γ is a primitive group of Type II. For Option 1 of Lemma 5.4.10 it holds that n ≥ d and |Γ| ≤ d^d. So the statement follows from Lemma 5.4.10.

Finally, suppose Γ is a primitive group of type IV which means that Γ ≤ P ≀ ∆ for a transitive group ∆ ≤ Sk, a primitive group P ≤ Sym(M), and |Ω| = |M|^k. Assume Option 1 of Lemma 5.4.12 is satisfied. Then |P| ≤ |M|^d and |N| ≤ |P|^k ≤ |M|^(kd). So overall |Γ| = |N| · |Γ: N| ≤ |M|^(kd) · n^(1+log d) = n^(d+1+log d) = n^O(d). Again, the statement follows from Lemma 5.4.12.

This theorem is particularly important as it enables the analysis of Luks’s algorithm (see also [19, 21]).

Corollary 5.4.18. Luks’s Algorithm (Algorithm 4) for groups Γ ∈ Γbd runs in time n^O(d).

Proof Sketch. In order to analyze the running time of Algorithm 4 let f(n) denote the maximal number of leaves in a recursion tree for a Γbd-group where n := |W| denotes the window size. For n = 1 it is easy to see that f(n) = 1. Suppose Γ is not transitive on W and let n1 := |W′| be the size of an orbit W′ ⊆ W. Then

f(n) ≤ f(n1) + f(n − n1). (5.8)

Finally, if Γ is transitive on W, the algorithm computes a minimal block system ℬ of Γ[W] and performs standard Luks reduction. In this case the algorithm performs t · b recursive calls over window size n/b where t = |Γ[ℬ]|. Hence,

f(n) ≤ t · b · f(n/b). (5.9)

Since Γ[ℬ] is a primitive Γbd-group it holds that t = b^O(d) by Theorem 5.4.17. Combining the bound on t with Equation (5.8) and (5.9) gives f(n) = n^O(d). Also, each node of the recursion tree only requires computation time polynomial in n and the number of children in the recursion tree (cf. Theorem 5.1.4). Overall, this gives the desired bound on the running time.

Corollary 5.4.19. The Graph Isomorphism Problem for graphs of maximum degree d can be solved in time n^O(d).

Proof. This follows from Theorem 5.3.4 and Corollary 5.4.18.

⁶ At the time Luks developed his original polynomial-time isomorphism test for graphs of bounded degree [106] Theorem 5.4.17 was not available. To be more precise, no polynomial bound on the size of primitive Γbd-groups of Type I was available. Instead, Luks proved a polynomial bound on the index of a Sylow subgroup to which Algorithm 4 is applied.
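To see how Recurrence (5.9) leads to the bound f(n) = n^O(d) in the proof sketch of Corollary 5.4.18, it may help to unroll it in the simplest situation. The following Python sketch is an illustration under simplifying assumptions that are not part of the proof: every level is transitive with the same number b of blocks, n is a power of b, and t is exactly b^(c·d) for a hypothetical constant c standing in for the constant hidden in the O-notation of Theorem 5.4.17.

```python
def leaves(n, b, c, d):
    """Worst case of f(n) <= t * b * f(n / b) with t = b**(c*d), f(1) = 1."""
    if n == 1:
        return 1
    assert n % b == 0, "this sketch assumes n is a power of b"
    t = b ** (c * d)  # assumed bound on the primitive group on the blocks
    return t * b * leaves(n // b, b, c, d)

# Unrolling over h levels with n = b**h gives (b**(c*d + 1))**h = n**(c*d + 1).
n, b, c, d = 2 ** 5, 2, 1, 3
print(leaves(n, b, c, d) == n ** (c * d + 1))
```

Under these assumptions the number of leaves is exactly n^(c·d+1), which matches the form n^O(d) of the bound.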

Remark 5.4.20. Let Γd denote the family of groups Γ with the property that Γ has no alternating composition factors of degree greater than d and no classical composition factors of rank greater than d. (There is no restriction on the cyclic, exceptional, and sporadic composition factors of Γ.) While the class Γbd considered in this thesis follows the original definition of Luks [106], most of the recent literature is concerned with the more general class of groups Γd [13, 58]. The reason is that many results that can be proved for the class Γbd indeed carry over to the more general class of groups Γd. For example, this is true for Theorem 5.4.17 (see [101]) and consequently, it also holds for Corollary 5.4.18. However, this is not the case for Theorem 5.4.16. Indeed, consider the affine general linear group Γ = AGL(d, p) of dimension d (with its natural action on the corresponding vector space). Then Γ is a primitive group of affine type and |Γ| = n^Ω(d) where n = p^d is the size of the vector space. For this group Theorem 5.4.16 does not hold. The group Γ is contained in the class Γd, but it is not contained in Γbd.

5.5 String Isomorphism in Quasipolynomial Time

Luks’s algorithm was the first algorithm using advanced methods from group theory to tackle the Graph Isomorphism Problem and demonstrated the power of group-theoretic tools in this context. Adding a combinatorial method due to Zemlyachenko [151] it results in an isomorphism test running in time 2^O(√(n log n)) [19] which was the best known algorithm for over three decades. Also, the methods developed by Luks served as a subroutine for a number of further algorithms tackling the isomorphism problem for certain graph classes (see, e.g., [14, 67, 96, 122]). But maybe more importantly, the recursion techniques introduced by Luks (see Subsection 5.2.3) also form the basis for Babai’s recent quasipolynomial-time isomorphism test [11].

Besides the advances on combinatorial techniques related to the Weisfeiler-Leman algorithm, for his quasipolynomial-time algorithm Babai also added a new group-theoretic tool, the Local Certificates Routine, which is based on the Unaffected Stabilizers Theorem. In relation to Luks’s algorithm, it seems natural to ask whether the techniques developed by Babai can also be exploited to give improved algorithms for the String Isomorphism Problem for Γbd-groups and, consequently, for the Graph Isomorphism Problem for graphs of bounded degree. In order to explore this question we first present the main technical results from [11] focusing on the group-theoretic insights that form the groundwork of Babai’s quasipolynomial-time algorithm.

Theorem 5.5.1 (Babai [11]). The String Isomorphism Problem can be solved in time n^O((log n)^c) for some constant c.

In combination with Theorem 5.2.3 this immediately implies the Graph Isomorphism Problem can be solved within the same time bound.

Corollary 5.5.2 (Babai [11]). The Graph Isomorphism Problem can be solved in time n^O((log n)^c) for some constant c.

Actually, generalizing the last corollary, one can also obtain an isomorphism test for relational structures.

Corollary 5.5.3. There is an algorithm that, given two t-ary relational structures A1, A2, computes a representation for the set

Iso(A1, A2) = {σ : V (A1) → V (A2) | σ : A1 ≅ A2}

in time n^O(t^c (log n)^c) for some absolute constant c.

Proof. This follows from combining an isomorphism-preserving translation from relational structures to graphs of size n^O(t) (see, e.g., [115]) and the quasipolynomial time isomorphism test from Corollary 5.5.2.

Another closely related problem that appears again later in this thesis is the Coset Intersection Problem. The Coset Intersection Problem asks, given two groups Γ1, Γ2 ≤ Sym(Ω) and γ1, γ2 ∈ Sym(Ω), whether there is a common element in Γ1γ1 and Γ2γ2, i.e., whether Γ1γ1 ∩ Γ2γ2 ≠ ∅. Note that if Γ1γ1 ∩ Γ2γ2 ≠ ∅ then the intersection is a coset of the group Γ1 ∩ Γ2, i.e., Γ1γ1 ∩ Γ2γ2 = (Γ1 ∩ Γ2)γ for every γ ∈ Γ1γ1 ∩ Γ2γ2. In particular, the intersection of the cosets can be represented efficiently either by the empty set or by a generating set for Γ1 ∩ Γ2 and an arbitrary element γ ∈ Γ1γ1 ∩ Γ2γ2. The Coset Intersection Problem is polynomial-time equivalent to the String Isomorphism Problem under many-one reductions (see [108]), which implies the next corollary.
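The coset structure of the intersection can be observed on a toy instance. The following Python sketch is only an illustration under our own assumptions: permutations are stored as tuples, groups are enumerated by brute force (which is feasible only for tiny groups and has nothing to do with the quasipolynomial-time algorithm), and all function names are hypothetical.

```python
def compose(g, h):
    """Product in right-action convention: apply g first, then h."""
    return tuple(h[g[i]] for i in range(len(g)))

def generate(generators, n):
    """All elements generated by the given permutations (brute force, tiny groups only)."""
    identity = tuple(range(n))
    group, frontier = {identity}, [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

def coset(group, g):
    """The right coset group * g."""
    return {compose(x, g) for x in group}

def coset_intersection(group1, g1, group2, g2):
    """Return None if the cosets are disjoint, else (group1 ∩ group2, representative)."""
    common = coset(group1, g1) & coset(group2, g2)
    if not common:
        return None
    return group1 & group2, min(common)

n = 4
gamma1 = generate([(1, 2, 3, 0)], n)                # cyclic group generated by (0 1 2 3)
gamma2 = generate([(2, 1, 0, 3), (0, 3, 2, 1)], n)  # group generated by (0 2) and (1 3)
c = (1, 0, 2, 3)                                    # the transposition (0 1)

# Both cosets contain c, so the intersection is the coset (gamma1 ∩ gamma2) * rep.
subgroup, rep = coset_intersection(gamma1, c, gamma2, c)
```

In this instance the two cosets share the element c, so the result is a pair consisting of the subgroup Γ1 ∩ Γ2 and one representative, exactly the representation described above.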

Corollary 5.5.4. There is an algorithm that, given Γ1, Γ2 ≤ Sym(Ω) and γ1, γ2 ∈ Sym(Ω), computes a representation for the set Γ1γ1 ∩ Γ2γ2 in time n^{O((log n)^c)} for some constant c.

In order to build a quasipolynomial-time algorithm for the String Isomorphism Problem, Babai follows Luks’s algorithm, attacking the obstacle cases where the recursion performed by Luks’s algorithm does not result in the desired running time. Indeed, this is the case exactly if Luks’s algorithm encounters a primitive group of size bigger than quasipolynomial in the degree of the group. Using the classification of large primitive groups by Cameron [33] such a group is necessarily a Cameron group involving a large alternating group in a Johnson action (see footnote 7). Actually, the following theorem is sufficient for the design of a quasipolynomial-time algorithm.

Theorem 5.5.5 (cf. [10], Theorem 3.2.1). Let Γ ≤ Sym(Ω) be a primitive group of order |Γ| ≥ n^{1+log n} where n is greater than some absolute constant. Then there is a polynomial-time algorithm computing a normal subgroup N ⊴ Γ of index |Γ : N| ≤ n and an N-invariant equipartition ℬ such that N[ℬ] is permutationally equivalent to A_m^{(t)} for some m ≥ log n.

The mathematical part of this theorem follows from [33, 109] whereas the algorithmic part is resolved in [22].

In order to handle the large primitive groups described by the last theorem, Babai’s algorithm utilizes several subroutines based on both group-theoretic techniques and combinatorial approaches like the Weisfeiler-Leman algorithm. In this thesis, we are primarily interested in the group-theoretic techniques, more precisely, the Local Certificates Routine which is based on two group-theoretic statements, the Unaffected Stabilizers Theorem and the Affected Orbit Lemma.

Recall that for a set M we denote by Alt(M) the alternating group acting with its standard action on the set M. Moreover, following Babai [11], we refer to the groups Alt(M) and Sym(M) as the giants where M is an arbitrary finite set. Let Γ ≤ Sym(Ω). A giant representation is a homomorphism ϕ: Γ → S_k such that Γ^ϕ ≥ A_k.

Given a string x: Ω → Σ, a group Γ ≤ Sym(Ω) and a giant representation ϕ: Γ → S_k, the aim of the Local Certificates Routine is to determine whether (Aut_Γ(x))^ϕ ≥ A_k and to compute a meaningful certificate in both cases. To achieve this goal the central tool is to split the set Ω into affected and non-affected points.

Definition 5.5.6 (Affected Points, Babai [11]). Let Γ ≤ Sym(Ω) be a group and ϕ: Γ → S_k a giant representation. Then an element α ∈ Ω is affected by ϕ if (Γ_α)^ϕ ≱ A_k.

Footnote 7: This refers to the natural induced action of an alternating group Alt(Ω) on the set of t-element subsets of Ω for some t ≤ |Ω| (see also Example 5.4.2).

Remark 5.5.7. Let ϕ: Γ → S_k be a giant representation and suppose α ∈ Ω is affected by ϕ. Then every element in the orbit α^Γ is affected by ϕ. The set α^Γ is called an affected orbit (with respect to ϕ).

The correctness and the analysis of the running time of the Local Certificates Routine rest on the following two statements.

Theorem 5.5.8 (Unaffected Stabilizers Theorem, Babai [11]). Let Γ ≤ Sym(Ω) be a permutation group of degree n and let ϕ: Γ → S_k be a giant representation such that k > max{8, 2 + log n}. Let D ⊆ Ω be the set of elements not affected by ϕ. Then (Γ_{(D)})^ϕ ≥ A_k. In particular D ≠ Ω, that is, at least one point is affected by ϕ.

Lemma 5.5.9 (Affected Orbit Lemma, Babai [11]). Let Γ ≤ Sym(Ω) be a permutation group and suppose ϕ: Γ → S_k is a giant representation for k ≥ 5. Suppose A ⊆ Ω is an affected orbit of Γ (with respect to ϕ). Then every orbit of ker(ϕ) in A has size at most |A|/k.

Before discussing the Local Certificates Routine in more detail (see Section 6.4) let us return to the question whether the methods used in Babai’s quasipolynomial-time isomorphism test can also be exploited for the design of a more efficient algorithm testing isomorphism of bounded-degree graphs. Indeed, by the Characterization Theorem for primitive Γbd-groups (Theorem 5.4.16) the bottleneck cases of Luks’s algorithm for Γbd-groups are similar in their structure to the obstacles described in Theorem 5.5.5. This suggests it might be possible to generalize Babai’s algorithm to the setting of bounded-degree graphs. However, there is a second major hurdle which relates to the Unaffected Stabilizers Theorem. In order to translate Babai’s algorithm to the setting of bounded-degree graphs one needs to adapt the Unaffected Stabilizers Theorem for Γbd-groups, also allowing for values k = O(log d). Unfortunately, this is not easily possible as the next example demonstrates.

Example 5.5.10. Let d ≥ 3. Consider the group Γ = S_2 ≀ S_d in its product action, that is, Γ ≤ Sym(Ω) for Ω = {0, 1}^d (see Definition 5.1.2). One may observe that Γ is the automorphism group of the graph of the d-dimensional hypercube. The group Γ is a Γbd-group. Let ϕ: Γ → S_d be the homomorphism defined by ϕ((τ_i)_{i∈[d]}, γ) = γ. Clearly ϕ is a giant representation. Moreover, no point α ∈ Ω is affected by ϕ. To prove this consider the point α = (0, ..., 0) ∈ {0, 1}^d. Then Γ_α = {((id)_{i∈[d]}, γ) | γ ∈ S_d} where id denotes the identity element. So (Γ_α)^ϕ = S_d. Since Γ is transitive it follows that no point is affected by ϕ (see Remark 5.5.7). Let D be the set of elements not affected by ϕ. Then D = Ω and thus Γ_{(D)} = {id}. In particular (Γ_{(D)})^ϕ ≱ A_k (here k = d). The size of the permutation domain is n = |Ω| = 2^d and thus d = log n ≤ 2 + log n. This shows that the bound for k given in the Unaffected Stabilizers Theorem is essentially tight (actually, by slightly modifying the example, one can prove that the bound in the Unaffected Stabilizers Theorem is tight (see [10])). So overall, the group Γ provides an example where the methods relying on the Unaffected Stabilizers Theorem are not directly applicable. However, it is also notable that the group Γ itself does not pose an obstacle case regarding the String Isomorphism Problem for Γbd-groups. Indeed, the size of Γ is given by |Γ| = 2^d · d! ≤ 2^d · 2^{d log d} = n^{1+log d}. Moreover, the group Γ has two non-trivial block systems. The first block system consists of two blocks ℬ_1 = {B_0, B_1} where

B_i = {(b_1, ..., b_d) ∈ {0, 1}^d | Σ_{j∈[d]} b_j ≡ i mod 2}.
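The claims made about the hypercube group so far can be checked by brute force for d = 3. The following Python sketch is only an illustration under our own assumptions: group elements are encoded as pairs (flips, perm), the action first permutes coordinates and then flips bits, and all names are hypothetical. Showing that the stabilizer image is all of S_d is of course stronger than what is needed for a point to be unaffected.

```python
from itertools import permutations, product
from math import factorial

d = 3
points = list(product((0, 1), repeat=d))

def act(point, flips, perm):
    """Image of a point under (flips, perm): permute coordinates, then flip bits."""
    moved = tuple(point[perm[j]] for j in range(d))
    return tuple(b ^ f for b, f in zip(moved, flips))

# All elements of the wreath product, encoded as (flips, perm).
elements = [(flips, perm) for flips in points for perm in permutations(range(d))]

# Distinct elements induce distinct permutations of the 2^d points.
induced = {tuple(act(p, flips, perm) for p in points) for flips, perm in elements}

def stabilizer_image(point):
    """Image under the projection (flips, perm) -> perm of the stabilizer of a point."""
    return {perm for flips, perm in elements if act(point, flips, perm) == point}

full_symmetric = set(permutations(range(d)))
# A point is unaffected if its stabilizer still projects onto a giant;
# here the projection is even all of S_d.
unaffected = [p for p in points if stabilizer_image(p) == full_symmetric]

def parity_block(i):
    return frozenset(p for p in points if sum(p) % 2 == i)

blocks = {parity_block(0), parity_block(1)}
# Every group element maps each parity class onto a parity class.
parity_is_block_system = all(
    frozenset(act(p, flips, perm) for p in block) in blocks
    for flips, perm in elements
    for block in blocks
)
```

For d = 3 the sketch finds 2^d · d! = 48 group elements, reports every point as unaffected, and confirms that the two parity classes form a block system.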

The second block system contains all blocks {(b_1, ..., b_d), (1 − b_1, ..., 1 − b_d)} for (b_1, ..., b_d) ∈ {0, 1}^d. As a result, also all primitive groups P occurring in Luks’s algorithm (Algorithm 4) applied to Γ are small, i.e., have size |P| ≤ (4m)^{1+log d} where m is the size of the permutation domain of P.

While the group given in the last example does not pose a problem for Luks’s algorithm, it still indicates that it is not possible to simply translate Babai’s techniques to the setting of Γbd-groups, with one central problem being to obtain a suitable variant of the Unaffected Stabilizers Theorem. In the next chapter we present methods to get around these problems, ultimately leading to an algorithm solving the String Isomorphism Problem for Γbd-groups in time n^{polylog(d)}. Intuitively, the idea is to first normalize the input in order to eliminate problematic groups hindering us from applying a variant of the Unaffected Stabilizers Theorem. For the normalization we exploit that, in some sense, the problematic groups are simple to handle, as also indicated in the last example. Then, the normalized inputs allow for a generalization of the techniques introduced by Babai for his quasipolynomial-time algorithm.

Chapter 6

Isomorphism for Bounded Degree Graphs

As already indicated in the previous chapter, Luks’s algorithm [106] is one of the cornerstones of the algorithmic theory of the Graph Isomorphism Problem. With a slight improvement given later [19] it solves the String Isomorphism Problem for Γbd-groups in time n^{O(d/log d)} and thus, the Graph Isomorphism Problem for graphs of degree d can be solved in the same time. Over the past decades Luks’s algorithm has been used as a basic building block for many algorithms. In some cases Luks’s algorithm is simply used as a black box (e.g., [21, 67, 96]), and in other cases algorithms exploit the recursion techniques introduced by Luks (e.g., [14, 122]), sometimes in combination with an extension of the algorithm to hypergraphs given by Miller [117]. The most important example in the latter direction is Babai’s quasipolynomial time algorithm [11] which builds on the recursion framework introduced by Luks, attacking the obstacle cases where the recursion of Luks’s algorithm does not give the desired running time. Naturally, this raises the question whether the novel techniques introduced in Babai’s work can also be exploited to obtain a more efficient isomorphism test for graphs of bounded degree. In this chapter we give a positive answer to this question, presenting an algorithm solving the String Isomorphism Problem for Γbd-groups in time n^{polylog(d)}. This result can also be found in [69].

6.1 Structure Trees

A key method of the quasipolynomial time isomorphism test from [11] is the Local Certificates Routine which rests on a group-theoretic statement, the Unaffected Stabilizers Theorem (see Theorem 5.5.8). One of the central obstacles for building a more efficient isomorphism test for bounded degree graphs is the adaption of this theorem to the setting of Γbd-groups. As already outlined in Example 5.5.10 this is not straight-forward since a natural adaption does not hold for Γbd-groups. The main idea to circumvent this problem is a normalization of the input to the String Isomorphism Problem in order to eliminate the problematic groups. Indeed, analyzing the proof of the Unaffected Stabilizers Theorem, it can be observed that groups are decomposed along a non-trivial block system which essentially means one needs to study primitive groups. Thus, the basic strategy for the normalization is to change the action of the input group along a fixed sequence of block systems. The key tool for this normalization are tree unfoldings of certain graphs defined with respect to the input group. This allows us to modify the action of the group in such a way as to obtain many block systems of the group that severely restrict possible primitive groups present in the normalized group action (with respect to a certain sequence of block systems).

6.1.1 Sequences of Partitions and Structure Trees

In order to achieve the normalization the first step is to formalize the desired outcome of the normalization process. As indicated above, the crucial property of the normalized groups is a restriction on the primitive groups involved along a fixed sequence of partitions. However, since the main tools in this section are graph-theoretic, it is more convenient to view such sequences of block systems as graphs leading to the notion of a structure tree.

A rooted tree is a pair (T, v0) where T is a directed tree and v0 ∈ V(T) is the root of T (all edges are directed away from the root). Let L(T) denote the set of leaves of T, i.e., vertices v ∈ V(T) without outgoing edges. For v ∈ V(T) we denote by T^v the subtree of T rooted at vertex v.

Definition 6.1.1 (Structure Tree). Let Γ ≤ Sym(Ω) be a permutation group. A structure tree for Γ is a rooted tree (T, v0) such that L(T ) = Ω and Γ ≤ (Aut(T ))[Ω].

Lemma 6.1.2. Let Γ ≤ Sym(Ω) be a transitive group and (T, v0) a structure tree for Γ. For every v ∈ V(T) the set L(T^v) is a block of Γ. Moreover, {L(T^w) | w ∈ v^{Aut(T)}} forms a block system of the group Γ.

Proof. Let γ ∈ Γ and let σ ∈ Aut(T, v0) such that σ[Ω] = γ. If v^σ = v then L(T^v) = L(T^{σ(v)}) = (L(T^v))^σ = (L(T^v))^γ. Otherwise v^σ ≠ v and L(T^v) ∩ (L(T^v))^γ = ∅. Hence, L(T^v) is a block of Γ. The argument also shows that {L(T^w) | w ∈ v^{Aut(T)}} = {(L(T^v))^γ | γ ∈ Γ} forms a block system.

Let Γ ≤ Sym(Ω) be a transitive group. The last lemma implies that every structure tree (T, v0) gives a sequence of Γ-invariant partitions {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω}. On the other hand, every such sequence of partitions gives a structure tree (T, v0) with

V(T) = Ω ∪ ⋃_{i=0,...,k−1} ℬ_i

and

E(T) = {(B, B′) | B ∈ ℬ_{i−1}, B′ ∈ ℬ_i, B′ ⊆ B} ∪ {(B, α) | B ∈ ℬ_{k−1}, α ∈ B}.

The root is v0 = Ω. Hence, there is a one-to-one correspondence between structure trees and sequences of Γ-invariant partitions. In the following, both viewpoints are used interchangeably depending on the current task.

A sequence of Γ-invariant partitions {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω} is maximal if Γ_B[ℬ_i[B]] is primitive for every i ∈ [k] and B ∈ ℬ_{i−1}. Suppose {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω} is not maximal and let 𝒜 ⊆ ℬ_i be a non-trivial block of Γ_B[ℬ_i[B]] for some i ∈ [k] and B ∈ ℬ_{i−1}. Then B* = ⋃_{A∈𝒜} A ⊆ Ω is a block of Γ and ℬ* = (B*)^Γ is a block system of Γ such that ℬ_{i−1} ≻ ℬ* ≻ ℬ_i. As a result, every sequence of Γ-invariant partitions can be extended to a sequence that is maximal.

As described above the goal of the normalization process is to restrict the groups appearing along a specific sequence of block systems, i.e., the groups Γ_B[ℬ_i[B]] for B ∈ ℬ_{i−1} and a sequence of invariant partitions {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω}. The following definition formalizes the desired outcome of this normalization. Recall that a permutation group Γ ≤ Sym(Ω) is semi-regular if Γ_α = {id} for all α ∈ Ω.

Definition 6.1.3 (Almost d-ary Sequences of Partitions). Let Γ ≤ Sym(Ω) be a group and let {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω} be a sequence of Γ-invariant partitions. The sequence ℬ_0 ≻ ··· ≻ ℬ_k is almost d-ary if for every i ∈ [k] and B ∈ ℬ_{i−1} it holds that

1. |ℬ_i[B]| ≤ d, or

2. Γ_B[ℬ_i[B]] is semi-regular.

If the first option is always satisfied the sequence ℬ_0 ≻ ··· ≻ ℬ_k is called d-ary. Similarly, a structure tree (T, v0) for a group Γ is (almost) d-ary if the corresponding sequence of partitions is.
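The correspondence between sequences of partitions and structure trees, together with the d-ary condition, can be sketched in a few lines of Python. This is a minimal illustration under our own assumptions: blocks are frozensets, the singleton blocks {α} stand in for the leaves α, only the purely combinatorial d-ary condition is tested (the semi-regularity option would additionally require the group), and the function names are hypothetical.

```python
from collections import Counter

def structure_tree_edges(partitions):
    """Edges (parent block, child block) for a sequence of partitions, coarsest first.

    partitions[0] is expected to be [Ω] and partitions[-1] the partition into
    singletons; the singletons play the role of the leaves.
    """
    edges = []
    for coarse, fine in zip(partitions, partitions[1:]):
        for block in coarse:
            for sub in fine:
                if sub.issubset(block):
                    edges.append((block, sub))
    return edges

def is_d_ary(edges, d):
    """True if no vertex of the tree has more than d children."""
    children = Counter(parent for parent, _ in edges)
    return max(children.values()) == min(max(children.values()), d)

omega = frozenset(range(8))
partitions = [
    [omega],
    [frozenset(range(0, 4)), frozenset(range(4, 8))],
    [frozenset({i, i + 1}) for i in range(0, 8, 2)],
    [frozenset({i}) for i in range(8)],
]
edges = structure_tree_edges(partitions)
```

For the sequence above, which repeatedly halves an 8-element domain, the resulting tree is binary, so the sequence is 2-ary but not 1-ary.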

For a group Γ ≤ Sym(Ω) and a structure tree (T, v0) let ϕ: Γ → Aut(T, v0) be the unique homomorphism such that (γ^ϕ)[Ω] = γ for all γ ∈ Γ (this homomorphism is unique since Aut(T, v0)_{(Ω)} = {id}). Also let N^+(v) := {w ∈ V(T) | (v, w) ∈ E(T)} be the set of children of v for every v ∈ V(T). Then (T, v0) is almost d-ary if for every v ∈ V(T) it holds that |N^+(v)| ≤ d or (Γ^ϕ)_v[N^+(v)] is semi-regular. Moreover, the structure tree (T, v0) is d-ary if the underlying rooted tree is d-ary, i.e., |N^+(v)| ≤ d for all v ∈ V(T).

Intuitively, this means the groups appearing along a given sequence of invariant partitions for the normalized action are (permutationally equivalent to) subgroups of S_d or they are semi-regular. Observe that there are Γbd-groups that have no almost d-ary structure tree. In particular, this includes the permutation groups given in Example 5.5.10.

Hence, in the remainder of this section the goal is to modify the action of an input group Γ ≤ Sym(Ω), Γ ∈ Γbd, so that the normalized action Γ* ≤ Sym(Ω*) has an almost d-ary sequence of partitions. This is achieved by slightly enlarging the permutation domain. Of course, a trivial way to achieve this is by simply considering the standard regular action of the permutation group. However, in this case |Ω*| = |Γ| which is not affordable in the desired time frame. Instead, the normalization process described in this thesis achieves a size of the permutation domain |Ω*| = n^{O(log d)}.

Before presenting the normalization of the action some properties regarding (almost) d-ary sequences of partitions are discussed. A simple, but crucial observation is that such sequences are inherited by subgroups and restrictions of the action to invariant subsets.

Observation 6.1.4. Let Γ ≤ Sym(Ω) be a group, and let {Ω} = ℬ_0 ≻ ··· ≻ ℬ_k = {{α} | α ∈ Ω} be an (almost) d-ary sequence of Γ-invariant partitions. Moreover, let Δ ≤ Γ. Then ℬ_0 ≻ ··· ≻ ℬ_k also forms an (almost) d-ary sequence of Δ-invariant partitions. Additionally, for a Δ-invariant subset S ⊆ Ω it holds that ℬ_0[S] ⪰ ··· ⪰ ℬ_k[S] forms an (almost) d-ary sequence of Δ[S]-invariant partitions.

Also, for Γbd-groups it is not difficult to argue that the existence of an almost d-ary sequence of partitions is equivalent to the existence of a d-ary sequence of partitions. The backward direction is of course trivial. For the forward direction the following lemma gives a tool to transform an almost d-ary sequence of partitions into a d-ary one.

Lemma 6.1.5. Let Γ ∈ Γbd. Then there is a sequence Γ = Γ_0 ≥ Γ_1 ≥ ··· ≥ Γ_ℓ = {id} such that |Γ_{i−1} : Γ_i| ≤ d for all i ∈ [ℓ].

Proof. Let Γ = Γ′_0 ⊵ Γ′_1 ⊵ ··· ⊵ Γ′_k = {id} be a composition series for Γ. Since Γ ∈ Γbd every composition factor Δ_i := Γ′_{i−1}/Γ′_i is isomorphic to a subgroup of S_d. Therefore, there is a sequence of subgroups

Δ_i = Δ_{i,0} ≥ Δ_{i,1} ≥ ··· ≥ Δ_{i,ℓ_i} = {id}, i ∈ [k],

such that |Δ_{i,j−1} : Δ_{i,j}| ≤ d for all j ∈ [ℓ_i]. Now let Γ_{i,j} := ⋃_{Γ′_i δ ∈ Δ_{i,j}} Γ′_i δ. Then

Γ = Γ_{1,0} ≥ Γ_{1,1} ≥ ··· ≥ Γ_{1,ℓ_1} = Γ′_1 = Γ_{2,0} ≥ ··· ≥ Γ_{2,ℓ_2} = Γ′_2 = Γ_{3,0} ≥ ··· ≥ Γ_{k,ℓ_k} = {id}

and |Γ_{i,j−1} : Γ_{i,j}| ≤ d for all i ∈ [k] and j ∈ [ℓ_i].
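The proof uses that every subgroup of S_d admits a chain of subgroups in which consecutive indices are at most d. One standard way to obtain such a chain, which is our assumption here since the proof does not spell it out, is to stabilize the points one after the other: each index is then the size of an orbit and hence at most d. The following Python sketch checks this by brute force for S_4 and A_4; the function names are hypothetical.

```python
from itertools import permutations

def is_even(p):
    """Parity of a permutation (given as a tuple) via its number of inversions."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return inversions % 2 == 0

def stabilizer_chain_indices(group, d):
    """Indices along the chain obtained by fixing the points 0, 1, ..., d-1 in turn."""
    indices, current = [], set(group)
    for point in range(d):
        smaller = {g for g in current if g[point] == point}
        # By the orbit-stabilizer theorem this index equals the orbit size of `point`.
        indices.append(len(current) // len(smaller))
        current = smaller
    return indices

d = 4
sym = set(permutations(range(d)))
alt = {p for p in sym if is_even(p)}
sym_indices = stabilizer_chain_indices(sym, d)
alt_indices = stabilizer_chain_indices(alt, d)
```

In both cases the chain ends in the trivial group, so the product of the indices equals the group order, and no single index exceeds d = 4.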

Lemma 6.1.6. Let Γ ≤ Sym(Ω) be a Γbd-group and suppose Γ has an almost d-ary sequence of invariant partitions. Then Γ has a d-ary sequence of partitions.

Proof. Suppose Γ is transitive and let ℬ_0 ≻ ··· ≻ ℬ_k be an almost d-ary sequence of partitions of Γ of maximal length k. Suppose towards a contradiction there is some i ∈ [k] and B ∈ ℬ_{i−1} such that |ℬ_i[B]| > d. Then Γ_B[ℬ_i[B]] is regular. This means Δ := Γ_B[ℬ_i[B]] is permutationally equivalent to Δ acting on itself by right-multiplication. Also, Δ ∈ Γbd by Lemma 5.1.8. Hence, by Lemma 6.1.5, there is a sequence Δ = Δ_0 ≥ Δ_1 ≥ ··· ≥ Δ_ℓ = {id} such that |Δ_{i−1} : Δ_i| ≤ d for all i ∈ [ℓ]. Each subgroup Δ_i is a block of the action of Δ on itself by right-multiplication. This means there is a d-ary sequence of invariant partitions 𝒞_0 ≻ ··· ≻ 𝒞_ℓ for the group Δ. Each of those partitions can be lifted to a block system

𝒞_i^* := {(⋃_{B′∈C_i} B′)^γ | C_i ∈ 𝒞_i, γ ∈ Γ}

of the group Γ. Then ℬ_0 ≻ ··· ≻ ℬ_{i−1} = 𝒞_0^* ≻ ··· ≻ 𝒞_ℓ^* = ℬ_i ≻ ··· ≻ ℬ_k forms an almost d-ary sequence of partitions of Γ contradicting the maximality of k.

Finally, if Γ is not transitive, take an arbitrary d-ary tree on the orbits and attach a d-ary structure tree for each orbit.

Since this chapter is only concerned with Γbd-groups, this lemma raises the question why one even wants to consider almost d-ary sequences of partitions. The reason for this is algorithmic. While every Γbd-group Γ that has an almost d-ary sequence of partitions also has a d-ary sequence of invariant partitions, it is not easy to compute the latter object given the first one. Using similar arguments as for the membership test of Γbd-groups (see Corollary 5.1.12) this can be done in time f(d)n^c for some function f and a constant c. However, when n is only slightly larger than d this running time is not sufficient for our purposes. Indeed, for the algorithms we build later for the normalized action, it is not sufficient that the input group admits an almost d-ary structure tree, but such a tree needs to be given to the algorithms along with the input group.

6.1.2 Structure Graphs and Tree Unfoldings

A key tool to normalize the action of the input group to the String Isomorphism Problem are tree unfoldings of certain directed graphs. Indeed, in order to obtain an action of the input group that has a structure tree with the desired properties, the basic idea is to first construct a structure graph for the group and afterwards unfold the graph leading to the normalized action for which the unfolded graph forms a structure tree.

For a directed graph G define G̃ to be the underlying undirected graph. A rooted simple acyclic graph is a pair (G, v0) where G is a directed graph and v0 ∈ V(G) such that for every (v, w) ∈ E(G) it holds that dist_{G̃}(v0, v) + 1 = dist_{G̃}(v0, w).

Let G be a rooted simple acyclic graph. For v ∈ V(G) define N^+(v) := {w ∈ V(G) | (v, w) ∈ E(G)} to be the set of outgoing neighbors of v. The forward degree of v is deg^+(v) := |N^+(v)|. A vertex is a leaf of G if it has no outgoing neighbors, i.e., deg^+(v) = 0. Let L(G) = {v ∈ V(G) | deg^+(v) = 0} denote the set of leaves of G.

Definition 6.1.7 (Structure Graph). Let Γ ≤ Sym(Ω) be a permutation group. A structure graph for Γ is a triple (G, v0, ϕ) where (G, v0) is a rooted simple acyclic graph such that L(G) = Ω and Γ ≤ (Aut(G))[Ω] and ϕ: Γ → Aut(G) is a homomorphism such that (γ^ϕ)[Ω] = γ for all γ ∈ Γ.

Note that each structure tree can be viewed as a structure graph (for trees the homomorphism ϕ is uniquely defined and can be easily computed).

As indicated above the strategy to normalize the action is to consider the tree unfolding of a suitable structure graph. The permutation domain of the normalized action then corresponds to the leaves of the tree unfolding for which there is a natural action of the group Γ.

Let (G, v0) be a rooted simple acyclic graph. A branch of (G, v0) is a sequence (v0, v1, ..., vk) such that (v_{i−1}, v_i) ∈ E(G) and dist_G(v0, v_i) = i for all i ∈ [k]. A branch (v0, v1, ..., vk) is maximal if it cannot be extended to a longer branch, i.e., if vk is a leaf of (G, v0). Let Br(G, v0) denote the set of branches of (G, v0) and Br*(G, v0) denote the set of maximal branches. Note that Br*(G, v0) ⊆ Br(G, v0). Also, for a maximal branch v̄ = (v0, v1, ..., vk) let L(v̄) := vk. Note that L(v̄) ∈ L(G).

For a rooted simple acyclic graph (G, v0) the tree unfolding of (G, v0) is defined to be the rooted tree Unf(G, v0) with vertex set Br(G, v0) and edge set

E(Unf(G, v0)) = {((v0, . . . , vk), (v0, . . . , vk, vk+1)) | (v0, . . . , vk+1) ∈ Br(G, v0)}.

Note that L(Unf(G, v0)) = Br*(G, v0), i.e., the leaves of the tree unfolding of (G, v0) are exactly the maximal branches of (G, v0).
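The definition of the tree unfolding translates directly into code. The following Python sketch is a minimal illustration under our own assumptions: the graph is given by successor lists, it is assumed to be a rooted simple acyclic graph (so that every directed path starting in the root is a branch), and the function names are hypothetical.

```python
def branches(successors, root):
    """All branches of a rooted simple acyclic graph, each encoded as a tuple."""
    result, stack = [], [(root,)]
    while stack:
        branch = stack.pop()
        result.append(branch)
        for w in successors.get(branch[-1], ()):
            stack.append(branch + (w,))
    return result

def unfold(successors, root):
    """Vertices, edges and leaves of the tree unfolding."""
    vertices = branches(successors, root)
    # Every non-root branch is attached to the branch obtained by dropping its last vertex.
    edges = [(b[:-1], b) for b in vertices if len(b) > 1]
    # Leaves of the unfolding are the branches ending in a leaf, i.e. the maximal branches.
    leaves = [b for b in vertices if not successors.get(b[-1])]
    return vertices, edges, leaves

# A small example: the leaf x has two parents and is therefore duplicated in the unfolding.
successors = {"r": ["a", "b"], "a": ["x", "y"], "b": ["x"]}
vertices, edges, leaves = unfold(successors, "r")
```

In the example the graph has two leaves, but the unfolding has three, since the leaf x is reached along two different branches.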

Example 6.1.8. Let m ≤ d and t ≤ m/2. Consider the group Γ = A_m^{(t)} (see Example 5.4.2). Then a structure graph (G, v0, ϕ) for Γ ≤ Sym(\binom{[m]}{t}) can be constructed as follows. The vertices of the graph are all subsets of [m] of size at most t, i.e.,

V(G) := \binom{[m]}{≤ t} = {X ⊆ [m] | |X| ≤ t}.

Two vertices X and Y are connected by an edge if Y is the extension of X by a single element, i.e., E(G) := {(X, Y) | X ⊆ Y ∧ |Y \ X| = 1}. The root of the structure graph is v0 := ∅. Note that L(G) = \binom{[m]}{t} as desired. Intuitively speaking, (G, v0) corresponds to the first t+1 levels of the subset lattice of the set [m]. An example is given in Figure 6.1.

[Figure 6.1: A structure graph for the Johnson group A_5^{(2)}. Below the root ∅, the vertices are the singletons {1}, ..., {5} and the ten 2-element subsets of [5].]

For γ ∈ A_m let γ^{(t)} be the element obtained from the natural action of γ on \binom{[m]}{t}. So Γ = {γ^{(t)} | γ ∈ A_m}. Let ϕ: Γ → Aut(G) be defined by X^{ϕ(γ^{(t)})} = X^γ where X ∈ V(G). Then (G, v0, ϕ) is a structure graph for Γ.

Now consider the tree unfolding Unf(G, v0) of the graph (G, v0). A maximal branch v̄ ∈ Br*(G, v0) is a sequence v̄ = (X_0, X_1, ..., X_t) where X_{i−1} ⊆ X_i and |X_i| = i for all i ∈ [t]. This gives an ordering of the elements in X_t. Indeed, it is not difficult to see that there is a one-to-one correspondence between the maximal branches Br*(G, v0) and the set [m]^{⟨t⟩} of ordered t-tuples over the set [m] with pairwise distinct elements.

The group A_m also acts naturally on the set [m]^{⟨t⟩}. Let A_m^{⟨t⟩} be the permutation group obtained from this action. Now it can be observed that Unf(G, v0) gives a structure tree for A_m^{⟨t⟩}. Also, the degree of the permutation group A_m^{⟨t⟩} is only slightly larger than the degree of A_m^{(t)}. Indeed,

|\binom{[m]}{t}|^{log m} = \binom{m}{t}^{log m} ≥ (m/t)^{t log m} ≥ 2^{t log m} = m^t ≥ |[m]^{⟨t⟩}|. (6.1)

In order to generalize this example to obtain a normalization for all groups we first observe that a group Γ naturally acts on the set of maximal branches of a structure graph and additionally, the tree unfolding of the structure graph forms a structure tree for this action.

Lemma 6.1.9. Let Γ ≤ Sym(Ω) be a permutation group and let (G, v0, ϕ) be a structure graph for Γ. Then there is an action ψ: Γ → Sym(Br*(G)) on the set of maximal branches of (G, v0) such that

1. L(v̄^{ψ(γ)}) = (L(v̄))^γ for all v̄ ∈ Br*(G) and γ ∈ Γ, and

2. Unf(G, v0) forms a structure tree for Γ^ψ.

Moreover, given the group Γ and the structure graph (G, v0, ϕ), the homomorphism ψ can be computed in time polynomial in |Br*(G, v0)|.

Proof. Let γ ∈ Γ and (v0, ..., vk) ∈ Br*(G). The action ψ: Γ → Sym(Br*(G)) is defined via

(v0, ..., vk)^{ψ(γ)} = (v0^{ϕ(γ)}, ..., vk^{ϕ(γ)}). (6.2)

Clearly, (v0, ..., vk)^{ψ(γ)} is a maximal branch of (G, v0) since ϕ(γ) ∈ Aut(G, v0). Also,

(v0, ..., vk)^{ψ(γδ)} = (v0^{ϕ(γδ)}, ..., vk^{ϕ(γδ)}) = (v0^{ϕ(γ)}, ..., vk^{ϕ(γ)})^{ψ(δ)} = ((v0, ..., vk)^{ψ(γ)})^{ψ(δ)}

which implies that ψ is a homomorphism. Moreover,

L((v0, ..., vk)^{ψ(γ)}) = L(v0^{ϕ(γ)}, ..., vk^{ϕ(γ)}) = vk^{ϕ(γ)} = vk^γ = (L(v0, ..., vk))^γ.

It remains to show that Unf(G, v0) forms a structure tree for Γ^ψ. Let ψ′: Γ → Sym(Br(G)) be defined via

(v0, ..., vk)^{ψ′(γ)} = (v0^{ϕ(γ)}, ..., vk^{ϕ(γ)})

for all branches (v0, ..., vk) ∈ Br(G). Then (ψ′(γ))[Br*(G)] = ψ(γ) and ψ′(γ) ∈ Aut(Unf(G, v0)) for all γ ∈ Γ.

Let Γ ≤ Sym(Ω) and let (G, v0, ϕ) be a structure graph. In the following we refer to the action defined in the lemma above as the standard action of Γ on the set of maximal branches Br*(G, v0) (with respect to ϕ).
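For the structure graph of Example 6.1.8 the standard action can be written out explicitly. The following Python sketch is only an illustration under our own assumptions: it uses m = 5 and t = 2, encodes a permutation of [m] as a dictionary, applies it to every set of a maximal branch, and uses hypothetical function names.

```python
from itertools import permutations

def maximal_branches(m, t):
    """Chains (∅ = X_0, X_1, ..., X_t) in the subset lattice of [m] with |X_i| = i."""
    result = [(frozenset(),)]
    for _ in range(t):
        result = [
            b + (b[-1] | {x},)
            for b in result
            for x in range(1, m + 1)
            if x not in b[-1]
        ]
    return result

def act_on_set(subset, gamma):
    return frozenset(gamma[x] for x in subset)

def act_on_branch(branch, gamma):
    """Standard action: apply the permutation to every vertex of the branch."""
    return tuple(act_on_set(subset, gamma) for subset in branch)

def as_ordered_tuple(branch):
    """The ordered t-tuple of elements added along the branch."""
    return tuple(next(iter(big - small)) for small, big in zip(branch, branch[1:]))

m, t = 5, 2
gamma = {1: 2, 2: 3, 3: 1, 4: 4, 5: 5}   # the 3-cycle (1 2 3), an even permutation
all_branches = maximal_branches(m, t)
images = [act_on_branch(b, gamma) for b in all_branches]
ordered_tuples = {as_ordered_tuple(b) for b in all_branches}
```

The sketch confirms that the maximal branches correspond to the ordered pairs of distinct elements of [5], that the standard action permutes them, and that the last vertex of the image of a branch is the image of its last vertex.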

Lemma 6.1.10. Let Γ ≤ Sym(Ω) be a transitive group. Then there is a structure graph (G, v0, ϕ) for Γ such that

1. |V (G)| ≤ 1 + |Γ| + n,

2. |Br*(G, v0)| ≤ |Γ| and |Br(G, v0)| ≤ 1 + 2|Γ|, and

3. the standard action of Γ on Br*(G, v0) is regular.

Moreover, given the group Γ, the structure graph can be computed in time polynomial in the size of G.

Proof. Let α0 ∈ Ω be an arbitrary element that is fixed for this proof. Define G to be the graph with V(G) = {v0} ⊎ Γ ⊎ Ω and

E(G) = {(v0, γ) | γ ∈ Γ} ∪ {(γ, α0^γ) | γ ∈ Γ}.

For δ ∈ Γ define δ^ϕ: V(G) → V(G) with (δ^ϕ)(α) = α^δ for all α ∈ Ω and (δ^ϕ)(γ) = γδ for all γ ∈ Γ. It is easy to verify that ϕ is a homomorphism and δ^ϕ ∈ Aut(G, v0) for all δ ∈ Γ. Moreover,

Br*(G, v0) = {(v0, γ, α0^γ) | γ ∈ Γ}

which implies |Br*(G, v0)| ≤ |Γ| and |Br(G, v0)| ≤ 1 + 2|Γ|. To prove that the standard action of Γ on Br*(G, v0) is regular let δ ∈ Γ and suppose (v0^{ϕ(δ)}, γ^{ϕ(δ)}, (α0^γ)^{ϕ(δ)}) = (v0, γ, α0^γ). Then γδ = γ and thus, δ = id. Hence, the standard action ψ of Γ on Br*(G, v0) is semi-regular. Also, the standard action is transitive since

(v0, γ, α0^γ)^{ψ(γ^{−1}γ′)} = (v0, γ′, α0^{γ′})

for every γ, γ′ ∈ Γ.
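The construction from this proof can be replayed on a small group. The following Python sketch is only an illustration under our own assumptions: it uses Γ = S_3 acting on Ω = {0, 1, 2}, stores permutations as tuples, represents the root by the string "root", and uses hypothetical function names.

```python
from itertools import permutations

def compose(g, h):
    """Product in right-action convention: apply g first, then h."""
    return tuple(h[g[i]] for i in range(len(g)))

group = list(permutations(range(3)))   # S_3 acting on Ω = {0, 1, 2}
identity = (0, 1, 2)
alpha0 = 0                             # the fixed base point of the proof

# Maximal branches (root, γ, α0^γ), one for every group element γ.
max_branches = [("root", g, g[alpha0]) for g in group]

def act(branch, delta):
    """Standard action of δ: γ is sent to γδ and the leaf α to α^δ."""
    root, g, alpha = branch
    return (root, compose(g, delta), delta[alpha])

# Semi-regular: no non-identity element fixes a maximal branch.
semi_regular = all(
    act(b, delta) != b for delta in group if delta != identity for b in max_branches
)
# Transitive: the orbit of one branch is the whole set of maximal branches.
orbit = {act(max_branches[0], delta) for delta in group}
```

Since the action is both semi-regular and transitive it is regular, and the number of maximal branches equals |Γ| = 6, in line with the lemma.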

6.1.3 Normalizing the Action

Recall that the aim of this section is to normalize the action of Γbd-groups. More precisely, the goal is to present a reduction from the String Isomorphism Problem for Γbd-groups to the String Isomorphism Problem for Γbd-groups equipped with an almost d-ary structure tree. The main tool to achieve this reduction are tree unfoldings of suitable structure graphs. Let Γ ≤ Sym(Ω) be a transitive group. A structure graph (G, v0, ϕ) is called almost d-ary if the tree unfolding Unf(G, v0) forms an almost d-ary structure tree for the standard action of Γ on Br*(G, v0). In order to prove a structure graph is almost d-ary we typically build on the following observation.

Observation 6.1.11. Let (G, v0, ϕ) be a structure graph for Γ ≤ Sym(Ω) and suppose that, for every v ∈ V(G), it holds that

1. deg^+(v) ≤ d, or

2. (Γ^ϕ)_v[N^+(v)] is semi-regular.

Then (G, v0, ϕ) is almost d-ary.

An example of such a structure graph (that actually gives a d-ary structure tree) is given in Example 6.1.8. The next lemma describes the desired reduction assuming it is given an almost d-ary structure graph.

Lemma 6.1.12. Let Γ ≤ Sym(Ω) be a permutation group and let (G, v0, ϕ) be an almost d-ary structure graph for Γ. Also let x, y: Ω → Σ be two strings. Then there is an algorithm computing a group Γ* ≤ Sym(Ω*), an almost d-ary structure tree (T, v0) for Γ*, and strings x*, y*: Ω* → Σ such that

1. x ≅_Γ y if and only if x* ≅_{Γ*} y*,

2. |Ω*| ≤ |Br*(G)|.

Moreover, the algorithm runs in time polynomial in the size of the input and |Ω*|.

Proof. Let Ω* := Br*(G, v0) and let ψ: Γ → Sym(Br*(G, v0)) be the standard action of Γ on the set Br*(G, v0). Let Γ* := Γ^ψ. Then Unf(G, v0) is an almost d-ary structure tree for Γ* by Lemma 6.1.9. Also, define x*: Ω* → Σ: v̄ ↦ x(L(v̄)) and similarly y*: Ω* → Σ: v̄ ↦ y(L(v̄)).

First suppose x ≅_Γ y, i.e., there is some γ ∈ Γ such that x^γ = y. Then, for every v̄ ∈ Br*(G, v0), it holds that

(x*)^{ψ(γ)}(v̄) = x*(v̄^{ψ(γ)^{−1}}) = x(L(v̄^{ψ(γ)^{−1}})) = x(L(v̄)^{γ^{−1}}) = x^γ(L(v̄)) = y(L(v̄)) = y*(v̄)

using Lemma 6.1.9. For the other direction suppose there is some γ ∈ Γ such that (x*)^{ψ(γ)} = y*. Let α ∈ Ω and pick some v̄ ∈ Br*(G, v0) such that L(v̄) = α. Then

x^γ(α) = x(L(v̄)^{γ^{−1}}) = x(L(v̄^{ψ(γ)^{−1}})) = x*(v̄^{ψ(γ)^{−1}}) = (x*)^{ψ(γ)}(v̄) = y*(v̄) = y(L(v̄)) = y(α)

again using Lemma 6.1.9.

Hence, in the following the aim is to compute structure graphs of small size and a small number of branches. The following lemma serves as an important recursive tool building structure graphs from smaller ones along a given block system.

Lemma 6.1.13. Let Γ ≤ Sym(Ω) be a transitive group, ℬ be a block system of size m := |ℬ| and let B ∈ ℬ be a block of size b := |B|. Suppose there is an almost d-ary structure graph (G_ℬ, v0, ϕ_ℬ) for Γ[ℬ] such that |Br(G_ℬ, v0)| ≤ m^k. Also, suppose there is an almost d-ary structure graph (H_B, w0, ϕ_B) for Γ_B[B] such that |Br(H_B, w0)| ≤ b^ℓ. Then there is an almost d-ary structure graph (G, v0, ϕ) for Γ such that |Br(G, v0)| ≤ n^{max{k,ℓ}}. Moreover, there is an algorithm computing such a structure graph in time polynomial in the size of G.

Proof. The basic idea to construct the structure graph for Γ is a glue a copy of (HB, w0) to every leaf of (GB, v0). Towards this end suppose B = {B1,...,Bm} such that B = B1. For γ1→i every i ∈ [m] pick γ1→i ∈ Γ such that B1 = Bi. Also define the rooted simple acyclic graph i (HB, (w0, i)) with vertex set

i γ1→i V (HB) := {(v, i) | v ∈ V (HB) \ B} ∪ {α | α ∈ B} and edge set

i γ1→i E(HB) := {((v, i), (w, i)) | (v, w) ∈ E(HB), w∈ / B} ∪ {((v, i), α ) | (v, α) ∈ E(HB), α ∈ B}.

i i Note that (HB, (w0, i)) is an isomorphic copy of (HB, w0) with leaf set L(HB) = Bi. Let ∼ i ψi : HB = HB be the standard isomorphism defined by ψi(w) := (w, i) for all w ∈ V (HB) \ B γ1→i i ∼ j −1 and ψi(α) := α for α ∈ B. Also let ψi,j : HB = Hb be defined by ψi,j := ψi ψj. Moreover, let i i −1 −1  ϕB :ΓBi [Bi] → Aut(HB, (w0, i)): γ 7→ ψi ϕB(γ1→iγγ1→i) ψi −1 i i (where γ1→i and γ1→i are restricted in a suitable manner). Then (HB, (w0, i), ϕB) is an almost d-ary structure graph for ΓBi [Bi]. Now define the rooted simple acyclic graph (G, v0) where

[ i V (G) = V (GB) ] V (HB) i∈[m] and [ i E(G) = E(GB) ∪ E(HB) ∪ {(Bi, (w0, i)) | i ∈ [m]}. i∈[m]

Clearly, $L(G) = \Omega$ and the graph $(G, v_0)$ can be computed in time polynomial in the size of $G$. Also, every branch $\bar v \in \mathrm{Br}(G, v_0)$ is the concatenation of a branch $\bar u \in \mathrm{Br}(G_{\mathfrak{B}}, v_0)$ and a (possibly empty) branch $\bar w \in \mathrm{Br}(H_B^i, (w_0, i))$ for the unique $i \in [m]$ such that $L(\bar u) = B_i$. This means

\[ |\mathrm{Br}(G, v_0)| \le |\mathrm{Br}(G_{\mathfrak{B}}, v_0)| \cdot |\mathrm{Br}(H_B, w_0)| \le m^k \cdot b^{\ell} \le n^{\max\{k,\ell\}}. \]

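Next define the homomorphism $\varphi \colon \Gamma \to \mathrm{Aut}(G, v_0)$. For $\gamma \in \Gamma$ first let $\sigma \in S_m$ such that $B_i^{\gamma} = B_{\sigma(i)}$ for all $i \in [m]$. Then, for each $i \in [m]$, there is a $\delta_i \in \Gamma_{B_i}[B_i]$ such that $\gamma[B_i] = \delta_i \gamma_{i\to\sigma(i)}$ where $\gamma_{i\to j} := \gamma_{1\to i}^{-1}\gamma_{1\to j}$. With this, for $v \in V(G)$, define

\[ v^{\varphi(\gamma)} = \begin{cases} v^{\varphi_{\mathfrak{B}}(\gamma[\mathfrak{B}])} & \text{if } v \in V(G_{\mathfrak{B}}), \\ \psi_{i,\sigma(i)}\big(v^{\varphi_B^i(\delta_i)}\big) & \text{if } v \in V(H_B^i). \end{cases} \]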

It can be checked that $(G, v_0, \varphi)$ is a structure graph for $\Gamma$. It remains to check that $(G, v_0, \varphi)$ is almost $d$-ary. Towards this end, let $v_t \in V(G)$ such that $\deg^+(v_t) > d$. Also let $(v_0, v_1, \dots, v_t) \in \mathrm{Br}(G, v_0)$ be a branch of $(G, v_0)$ that terminates in $v_t$. It suffices to argue that $(\Gamma^{\varphi})_{(v_0, v_1, \dots, v_t)}[N^+(v_t)]$ is semi-regular. If $v_t \in V(G_{\mathfrak{B}})$ this immediately follows from the fact that $(G_{\mathfrak{B}}, v_0, \varphi_{\mathfrak{B}})$ is almost $d$-ary. Otherwise $v_t \in V(H_B^i)$ for some $i \in [m]$ and the statement follows from the fact that $(H_B^i, (w_0, i), \varphi_B^i)$ is almost $d$-ary.

Remark 6.1.14. Note that the techniques from the proof of the last lemma can be applied independently of whether the given structure graphs are almost $d$-ary. Indeed, this property is only required to prove that the structure graph $(G, v_0, \varphi)$ is also almost $d$-ary. Additionally, observe that when both input structure graphs are $d$-ary then the output $(G, v_0, \varphi)$ is also a $d$-ary structure graph.

The last lemma suggests the crucial step is to construct structure graphs for primitive groups. For primitive $\widehat{\Gamma}_d$-groups this can be done based on the Characterization Theorem for primitive $\widehat{\Gamma}_d$-groups presented in the previous chapter (Theorem 5.4.16). As Johnson groups play a vital role in the characterization, first an auxiliary lemma about such actions is required. This lemma is implicitly proved in [22, Section 4].

Lemma 6.1.15 ([22]). Let $\Gamma \le \mathrm{Sym}(\Omega)$ and suppose $\Gamma$ is permutationally equivalent to $A_m^{(t)}$ or $S_m^{(t)}$ for $m > 4 \log n$ and $t \le \frac{m}{2}$. Then a permutational isomorphism $\rho \colon \Omega \to \binom{[m]}{t}$ can be computed in polynomial time.

Lemma 6.1.16. Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a primitive $\widehat{\Gamma}_d$-group. Then there is an almost $d$-ary structure graph $(G, v_0, \varphi)$ for $\Gamma$ such that $|\mathrm{Br}(G, v_0)| \le n^{O(\log d)}$. Moreover, there is an algorithm computing such a structure graph in time polynomial in the size of $G$.

Proof. First suppose $|\Gamma| \le n^{c_1 \log d + c_2}$ where $c_1, c_2$ are the absolute constants from Theorem 5.4.16. Then the statement follows from Lemma 6.1.10. Observe that the structure graph defined in the proof of Lemma 6.1.10 is almost $d$-ary for every $d \ge 1$.

For the other case let $N := \mathrm{Soc}(\Gamma)$. By Theorem 5.4.16, there is a sequence of partitions $\{\Omega\} = \mathfrak{B}_0 \succ \dots \succ \mathfrak{B}_k = \{\{\alpha\} \mid \alpha \in \Omega\}$ such that

(a) $|\Gamma : N| \le n^{1+\log d}$,

(b) $\mathfrak{B}_i$ is $N$-invariant for all $i \in [k]$, and

(c) there are $m \le d$ and $t \le \frac{m}{2}$ with $m > 4 \log s$ where $s := \binom{m}{t}$ such that for all $i \in [k]$ and $B \in \mathfrak{B}_{i-1}$ the group $N_B[\mathfrak{B}_i[B]]$ is permutationally equivalent to $A_m^{(t)}$.

The first step is to construct a $d$-ary structure graph $(H, w_0, \psi)$ for $N$ such that $|\mathrm{Br}(H, w_0)| \le n^{1+\log d}$. This can be done by induction on $k$ as follows. For the base step $k = 0$ it holds that $n = 1$, for which this is trivial. So suppose $k \ge 1$ and let $\mathfrak{B} := \mathfrak{B}_1$. Then $N[\mathfrak{B}]$ is permutationally equivalent to $A_m^{(t)}$ and a permutational isomorphism $\rho \colon \mathfrak{B} \to \binom{[m]}{t}$ can be computed in polynomial time by Lemma 6.1.15. So a $d$-ary structure graph $(H_{\mathfrak{B}}, w_0, \psi_{\mathfrak{B}})$ for $N[\mathfrak{B}]$ can be computed using Example 6.1.8. Moreover, $|\mathrm{Br}(H_{\mathfrak{B}}, w_0)| \le 2 \cdot |\mathrm{Br}^*(H_{\mathfrak{B}}, w_0)| \le 2 \cdot s^{\log m} \le s^{1+\log d}$ using Equation (6.1). Also, by the induction hypothesis, there is a $d$-ary structure graph $(H_B, u_0, \psi_B)$ for $N_B[B]$ such that $|\mathrm{Br}(H_B, u_0)| \le (n/s)^{1+\log d}$ for every $B \in \mathfrak{B}$. So the structure graph for $N$ can be obtained from Lemma 6.1.13. Observe that the resulting structure graph is $d$-ary by Remark 6.1.14.

Let $T = \{\delta_1 = \mathrm{id}, \dots, \delta_t\}$ be a transversal for $N$ in $\Gamma$. Now define $(G, v_0)$ to be the directed graph with

\[ V(G) := \{v_0\} \cup (V(H) \times T) \cup \Omega \]

and

\[ E(G) := \{(v_0, (w_0, \delta)) \mid \delta \in T\} \cup \{((v, \delta), (w, \delta)) \mid (v, w) \in E(H),\ \delta \in T\} \cup \{((\alpha^{\delta}, \delta), \alpha) \mid \alpha \in \Omega,\ \delta \in T\}. \]

Then

\[ |\mathrm{Br}(G, v_0)| \le 1 + |T| \cdot |\mathrm{Br}(H, w_0)| + |T| \cdot |\mathrm{Br}^*(H, w_0)| \le 1 + 2 \cdot n^{2 \cdot (1+\log d)} = n^{O(\log d)}. \]

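In order to define a homomorphism $\varphi \colon \Gamma \to \mathrm{Aut}(G, v_0)$ pick $\gamma \in \Gamma$ and let $\lambda \in N$ and $\delta \in T$ be the unique elements such that $\gamma = \lambda\delta$. Now define $\sigma \in \mathrm{Sym}(V(G))$ where $\alpha^{\sigma} = \alpha^{\gamma}$ for all $\alpha \in \Omega$, $v_0^{\sigma} = v_0$, and

\[ (v, \delta')^{\sigma} = \big(v^{\psi((\delta')^{-1}\lambda\delta\delta'')}, \delta''\big) \]

for $v \in V(H)$ and $\delta' \in T$ where $\delta'' \in T$ is the unique element such that $\delta^{-1}\delta' \in N\delta''$. Observe that there is some $\lambda' \in N$ such that $\delta^{-1}\delta'\lambda' = \delta''$ and therefore $(\delta')^{-1}\lambda\delta\delta'' = (\delta')^{-1}\lambda\delta'\lambda' \in N$. Also note that

\[ \big((\alpha^{\delta'}, \delta')^{\sigma}, \alpha^{\sigma}\big) = \big((\alpha^{\delta'(\delta')^{-1}\lambda\delta\delta''}, \delta''), \alpha^{\lambda\delta}\big) = \big((\alpha^{\lambda\delta\delta''}, \delta''), \alpha^{\lambda\delta}\big) \in E(G) \]

for all $\alpha \in \Omega$ and $\delta' \in T$. This implies $\sigma \in \mathrm{Aut}(G, v_0)$. Now define $\varphi(\gamma) := \sigma$.

To argue that $\varphi \colon \Gamma \to \mathrm{Aut}(G, v_0)$ is a homomorphism let $\gamma_1, \gamma_2, \gamma_3 \in \Gamma$ such that $\gamma_1\gamma_2 = \gamma_3$. For $i \in [3]$ let $\lambda_i \in N$ and $\delta_i \in T$ be the unique elements such that $\gamma_i = \lambda_i\delta_i$. Let $\sigma_i = \varphi(\gamma_i)$. Also let $v \in V(H)$, $\delta' \in T$ and $\delta_1'', \delta_2'', \delta_3'' \in T$ such that $\delta_1^{-1}\delta' \in N\delta_1''$, $\delta_2^{-1}\delta_1'' \in N\delta_2''$, and $\delta_3^{-1}\delta' \in N\delta_3''$. First observe that $\delta_2'' = \delta_3''$ because $\gamma_1\gamma_2 = \gamma_3$. Then

\begin{align*}
\big((v, \delta')^{\sigma_1}\big)^{\sigma_2} &= \big(v^{\psi((\delta')^{-1}\lambda_1\delta_1\delta_1'')}, \delta_1''\big)^{\sigma_2} \\
&= \big(v^{\psi((\delta')^{-1}\lambda_1\delta_1\delta_1'')\,\psi((\delta_1'')^{-1}\lambda_2\delta_2\delta_2'')}, \delta_2''\big) \\
&= \big(v^{\psi((\delta')^{-1}\lambda_1\delta_1\lambda_2\delta_2\delta_2'')}, \delta_2''\big) \\
&= \big(v^{\psi((\delta')^{-1}\lambda_3\delta_3\delta_3'')}, \delta_3''\big) = (v, \delta')^{\sigma_3}.
\end{align*}

It follows that $\sigma_1\sigma_2 = \sigma_3$, which implies $\varphi$ is a homomorphism.

To complete the proof it only remains to argue that $(G, v_0, \varphi)$ is almost $d$-ary. Let $v \in V(G)$ such that $\deg^+(v) > d$. Then $v = v_0$ and $N^+(v_0) = \{(w_0, \delta) \mid \delta \in T\}$. Moreover, $\Gamma^{\varphi}[N^+(v_0)]$ is permutationally equivalent to the action of $\Gamma$ on $\Gamma/N$ by right-multiplication. This action is regular.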

Now the structure graphs for primitive groups can be combined into a structure graph for every transitive $\widehat{\Gamma}_d$-group using Lemma 6.1.13.

Lemma 6.1.17. Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a transitive $\widehat{\Gamma}_d$-group. Then there is an almost $d$-ary structure graph $(G, v_0, \varphi)$ for $\Gamma$ such that $|\mathrm{Br}(G, v_0)| \le n^{O(\log d)}$. Moreover, there is an algorithm computing such a structure graph in time polynomial in the size of $G$.

Proof. The lemma is proved by induction on $n = |\Omega|$. Let $c_1, c_2$ be sufficiently large absolute constants. The base case $n \le d$ is trivial. For the inductive step let $\Gamma \le \mathrm{Sym}(\Omega)$ be a transitive $\widehat{\Gamma}_d$-group of degree $n$. Let $\mathfrak{B}$ be a minimal block system for $\Gamma$ and let $b := |B|$ where $B \in \mathfrak{B}$. Also let $m = |\mathfrak{B}|$. Observe that $n = m \cdot b$. The group $\Gamma[\mathfrak{B}]$ is primitive. So there is an almost $d$-ary structure graph $(G_{\mathfrak{B}}, v_0, \varphi_{\mathfrak{B}})$ for $\Gamma[\mathfrak{B}]$ such that $|\mathrm{Br}(G_{\mathfrak{B}}, v_0)| \le m^{c_1 + c_2 \log d}$ by Lemma 6.1.16. Now fix a block $B \in \mathfrak{B}$. Consider the group $\Gamma_B[B]$, which is a transitive $\widehat{\Gamma}_d$-group of degree $b$. By the induction hypothesis there is an almost $d$-ary structure graph $(H_B, w_0, \varphi_B)$ for $\Gamma_B[B]$ such that $|\mathrm{Br}(H_B, w_0)| \le b^{c_1 + c_2 \log d}$. So there is an almost $d$-ary structure graph $(G, v_0, \varphi)$ for $\Gamma$ such that $|\mathrm{Br}(G, v_0)| \le n^{c_1 + c_2 \log d}$ by Lemma 6.1.13.

With this, we obtain the desired normalization of the action.

Corollary 6.1.18. There is a Turing-reduction from the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups to the String Isomorphism Problem for groups equipped with an almost $d$-ary structure tree running in time $n^{O(\log d)}$.

Proof. Let $\Gamma \in \widehat{\Gamma}_d$ be the input group. Using orbit-by-orbit processing it can be assumed that $\Gamma$ is transitive. Then the statement follows from Lemmas 6.1.17 and 6.1.12.

Remark 6.1.19. Compared to the normalization procedure in the original paper [69] presenting a faster isomorphism test for bounded degree graphs, the methods given in this thesis are far less technical, building mostly on the concept of tree unfoldings. Moreover, in [69] the analysis of the normalization only yields an upper bound of $n^{O((\log d)^2)}$ on the size of the normalized instances. Maybe surprisingly, the less technically involved analysis of this thesis provides the better bound of $n^{O(\log d)}$.

Hence, in the following we shall be concerned with the String Isomorphism Problem for groups equipped with an almost $d$-ary structure tree. The main advantage of such groups is that they admit a natural generalization of the Unaffected Stabilizers Theorem (see Theorem 5.5.8).

6.2 Affected Orbits

The basis of Babai's Local Certificates algorithm is a group-theoretic statement, the Unaffected Stabilizers Theorem (see Theorem 5.5.8). In the following this theorem is generalized to the normalized setting described in the last section. The proof roughly follows the same argumentation as in [10].

Lemma 6.2.1 (cf. [10, 114]). Let $\Gamma \le K_1 \times \dots \times K_{\ell}$ be a subdirect product and let $\varphi \colon \Gamma \to S$ be an epimorphism where $S$ is a non-abelian simple group. Furthermore let $\pi_i \colon \Gamma \to K_i$ be the projection to the $i$-th component and $M_i = \ker(\pi_i)$. Then there is some $i^* \in [\ell]$ such that $M_{i^*} \le \ker(\varphi)$.

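Lemma 6.2.2. Let $\Gamma$ be a group, $\Delta, \Lambda \trianglelefteq \Gamma$ and suppose $\varphi \colon \Gamma \to S$ is an epimorphism where $S$ is a non-abelian simple group. Furthermore suppose that $\Delta^{\varphi} = \Lambda^{\varphi} = S$. Then $(\Delta \cap \Lambda)^{\varphi} = S$.

Proof. Let $N = \ker(\varphi)$. Suppose that $(\Delta \cap \Lambda)^{\varphi} \ne S$. Since $\Delta \cap \Lambda \trianglelefteq \Gamma$ and $S$ is a simple group it follows that $(\Delta \cap \Lambda)^{\varphi} = \{\mathrm{id}\}$, that is, $\Delta \cap \Lambda \le N$. Now let $\sigma_1, \sigma_2 \in S$ be two arbitrary elements. Then there are $\delta \in \Delta$, $\lambda \in \Lambda$ such that $\varphi(\delta) = \sigma_1$ and $\varphi(\lambda) = \sigma_2$. Moreover, $\eta := \delta^{-1}\lambda^{-1}\delta\lambda \in \Delta \cap \Lambda \le N$ since $\Delta \trianglelefteq \Gamma$ and $\Lambda \trianglelefteq \Gamma$. Note that $\delta\lambda = \lambda\delta\eta$. But then

\[ \sigma_1\sigma_2 = \varphi(\delta)\varphi(\lambda) = \varphi(\delta\lambda) = \varphi(\lambda\delta\eta) = \varphi(\lambda)\varphi(\delta)\varphi(\eta) = \sigma_2\sigma_1. \]

Since $\sigma_1, \sigma_2 \in S$ were chosen arbitrarily it follows that $S$ is abelian, contradicting the assumption that $S$ is non-abelian.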
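The commutator step in the proof above can be checked on a small example. The following is a minimal sketch, not part of the thesis: permutations are modelled as tuples, and the choices $\Gamma = S_4$, $\Delta = A_4$, $\Lambda = V_4$ (the Klein four-group) as well as all function names are assumptions made purely for illustration. It verifies the identity $\delta\lambda = \lambda\delta\eta$ for $\eta = \delta^{-1}\lambda^{-1}\delta\lambda$ and that $\eta$ lies in $\Delta \cap \Lambda$ when both subgroups are normal.

```python
from itertools import permutations


def mul(p, q):
    """Right-action product pq: apply p first, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def inv(p):
    """Inverse permutation of p."""
    out = [0] * len(p)
    for i, image in enumerate(p):
        out[image] = i
    return tuple(out)


def commutator(d, l):
    """eta = d^{-1} l^{-1} d l, as in the proof of Lemma 6.2.2."""
    return mul(mul(inv(d), inv(l)), mul(d, l))


def is_even(p):
    """True if p has an even number of inversions."""
    n = len(p)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if p[i] > p[j])
    return inversions % 2 == 0


# Illustrative groups (assumed for this sketch): S_4, A_4 and the Klein four-group.
S4 = list(permutations(range(4)))
A4 = [p for p in S4 if is_even(p)]
V4 = [(0, 1, 2, 3), (1, 0, 3, 2), (2, 3, 0, 1), (3, 2, 1, 0)]


def identity_holds(group_a, group_b):
    """Check d*l == l*d*eta for all d in group_a and l in group_b."""
    return all(
        mul(d, l) == mul(mul(l, d), commutator(d, l))
        for d in group_a
        for l in group_b
    )


def commutators_in_intersection(group_a, group_b):
    """Check that every commutator [d, l] lies in both groups."""
    common = set(group_a) & set(group_b)
    return all(commutator(d, l) in common for d in group_a for l in group_b)
```

Both checks succeed here because $A_4$ and $V_4$ are normal in $S_4$; for non-normal subgroups the second check may fail, which is exactly where the proof uses normality.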

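Lemma 6.2.3 ([10], Lemma 8.3.1). Let $\Gamma \le S_d$ be a transitive group and $\varphi \colon \Gamma \to A_k$ an epimorphism where $k > \max\{8, 2 + \log_2 d\}$. Then $\Gamma_{\alpha}^{\varphi} \ne A_k$ for all $\alpha \in [d]$.

Lemma 6.2.4. Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a transitive group and suppose there is an almost $d$-ary sequence of invariant partitions $\{\Omega\} = \mathfrak{B}_0 \succ \dots \succ \mathfrak{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$. Furthermore let $k > \max\{8, 2 + \log_2 d\}$, and let $\varphi \colon \Gamma \to A_k$ be an epimorphism. Then $\Gamma_{\alpha}^{\varphi} \ne A_k$ for all $\alpha \in \Omega$.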

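Proof. The statement is proved by induction on the cardinality of $\Gamma$. Let $K := \Gamma_{(\mathfrak{B}_1)} = \{\gamma \in \Gamma \mid \forall B \in \mathfrak{B}_1 \colon B^{\gamma} = B\}$ be the normal subgroup stabilizing the block system $\mathfrak{B}_1$. Also let $N := \ker(\varphi)$.

First suppose $K \le N$. Then $\varphi$ factors across $\Gamma[\mathfrak{B}_1]$ as $\Gamma \to \Gamma[\mathfrak{B}_1] \xrightarrow{\psi} A_k$. Observe that $\psi$ is an epimorphism since $\varphi$ is an epimorphism. Since $\mathfrak{B}_0 \succ \dots \succ \mathfrak{B}_m$ is an almost $d$-ary sequence of partitions, $|\mathfrak{B}_1| \le d$ or $\Gamma[\mathfrak{B}_1]$ is semi-regular. First suppose $|\mathfrak{B}_1| \le d$. Then, by Lemma 6.2.3, for every $B \in \mathfrak{B}_1$ it holds that $(\Gamma[\mathfrak{B}_1])_B^{\psi} \ne A_k$. Hence, $\Gamma_{\alpha}^{\varphi} \le \Gamma_B^{\varphi} \ne A_k$ where $B \in \mathfrak{B}_1$ is the unique set such that $\alpha \in B$. In the other case $\Gamma[\mathfrak{B}_1]$ is semi-regular and consequently, $(\Gamma[\mathfrak{B}_1])_B^{\psi} = \{\mathrm{id}\}^{\psi} = \{\mathrm{id}\} \ne A_k$ for all $B \in \mathfrak{B}_1$. Again, $\Gamma_{\alpha}^{\varphi} \le \Gamma_B^{\varphi} \ne A_k$ where $B \in \mathfrak{B}_1$ is the unique set such that $\alpha \in B$.

Otherwise $K \not\le N$ and thus $K^{\varphi} \ne \{\mathrm{id}\}$. Since $K^{\varphi} \trianglelefteq A_k$ and $A_k$ is a simple group it follows that $K^{\varphi} = A_k$. Suppose towards a contradiction that there is some $\alpha \in \Omega$ such that $\Gamma_{\alpha}^{\varphi} = A_k$. Pick $B \in \mathfrak{B}_1$ such that $\alpha \in B$. In particular, $\Gamma_B^{\varphi} = A_k$.

Claim 1. $\Gamma_{(B)}^{\varphi} \ne A_k$.

Proof. Assume towards a contradiction that $\Gamma_{(B)}^{\varphi} = A_k$. Then, by Lemma 6.2.2, $K_{(B)}^{\varphi} = (\Gamma_{(B)} \cap K)^{\varphi} = A_k$ since $\Gamma_{(B)} \trianglelefteq \Gamma_B$, $K \trianglelefteq \Gamma_B$ and $K^{\varphi} = A_k$.

On the other hand, let $\Omega_1, \dots, \Omega_{\ell}$ be the orbits of $K$. Let $\pi_i \colon K \to \mathrm{Sym}(\Omega_i)$ be the restriction of $K$ to $\Omega_i$, $K_i = \mathrm{im}(\pi_i)$ and $M_i = \ker(\pi_i)$. By Lemma 6.2.1 there is some $i \in [\ell]$ such that $M_i \le N$. Since $\Gamma$ acts transitively on the blocks $\{\Omega_1, \dots, \Omega_{\ell}\}$ the groups $M_i$, $i \in [\ell]$, are conjugate subgroups in $\Gamma$, i.e., for all $i, j \in [\ell]$ there is some $\gamma \in \Gamma$ such that $M_i^{\gamma} := \gamma^{-1} M_i \gamma = M_j$. Since $N \trianglelefteq \Gamma$ this implies $M_i \le N$ for all $i \in [\ell]$. Pick $i^* \in [\ell]$ such that $\alpha \in \Omega_{i^*}$. Since $M_{i^*} \le N$ the epimorphism $\varphi|_K \colon K \to A_k$ factors across $K_{i^*}$ as $K \xrightarrow{\pi_{i^*}} K_{i^*} \xrightarrow{\psi} A_k$. Hence, $K_{i^*}^{\psi} = A_k$. Moreover, $\mathfrak{B}_1[\Omega_{i^*}] \succ \dots \succ \mathfrak{B}_m[\Omega_{i^*}]$ is an almost $d$-ary sequence of partitions for $K_{i^*}$. By the induction hypothesis it follows that $(K_{i^*})_{\alpha}^{\psi} \ne A_k$ and thus, $K_{\alpha}^{\varphi} \ne A_k$. But this is a contradiction since $K_{(B)}^{\varphi} \le K_{\alpha}^{\varphi}$. $\lrcorner$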

Since $\Gamma_{(B)}^{\varphi} \trianglelefteq \Gamma_B^{\varphi}$ it follows that $\Gamma_{(B)}^{\varphi} = \{\mathrm{id}\}$. So $\varphi|_{\Gamma_B}$ factors across $\Gamma_B[B]$ as $\Gamma_B \to \Gamma_B[B] \xrightarrow{\psi} A_k$. Moreover, $\varphi|_{\Gamma_{\alpha}}$ factors across $\Gamma_{\alpha}[B]$ as $\Gamma_{\alpha} \to \Gamma_{\alpha}[B] \xrightarrow{\psi'} A_k$, where $\psi' = \psi|_{\Gamma_{\alpha}[B]}$. Overall this means $(\Gamma_B[B])^{\psi} = A_k$ and $(\Gamma_B[B])_{\alpha}^{\psi} = (\Gamma_{\alpha}[B])^{\psi'} = A_k$. But this contradicts the induction hypothesis since $\mathfrak{B}_1[B] \succ \dots \succ \mathfrak{B}_m[B]$ is an almost $d$-ary sequence of $\Gamma_B[B]$-invariant partitions and $\Gamma_B[B]$ is transitive.

Remark 6.2.5. The proof of the last lemma actually proves a more general statement. More precisely, the proof reveals that if the statement of the lemma holds for the groups $\Gamma_B[\mathfrak{B}_i[B]]$ for all $B \in \mathfrak{B}_{i-1}$ and $i \in [m]$ then it also holds for the complete group $\Gamma$. Indeed, the proof only uses specific properties of $\Gamma$ in the first case $K \le N$. Specifically, the proof uses the fact that the statement of the lemma holds for subgroups of $S_d$ and (trivially) for semi-regular permutation groups. Since any sequence of invariant partitions can be extended to a maximal sequence of invariant partitions, this actually implies that it suffices to consider only the primitive groups appearing along a given sequence of partitions (in this sense our proof methods are actually stronger than the ones employed by Babai [10, Lemma 8.3.1], for which such a statement is not true). Since primitive groups are quite restricted, this suggests that even more general statements than the one presented in the lemma above might be possible. Note that an example of a primitive group for which Lemma 6.2.4 does not hold is given in Example 5.5.10.

The following lemma shows that the assumption of $\Gamma$ being transitive can be dropped if one is only looking for some element $\alpha \in \Omega$ such that $\Gamma_{\alpha}^{\varphi} \ne A_k$.

Lemma 6.2.6. Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a group and suppose there is an almost $d$-ary sequence of $\Gamma$-invariant partitions $\{\Omega\} = \mathfrak{B}_0 \succ \dots \succ \mathfrak{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$. Furthermore let $k > \max\{8, 2 + \log_2 d\}$, and let $\varphi \colon \Gamma \to A_k$ be an epimorphism. Then $\Gamma_{\alpha}^{\varphi} \ne A_k$ for some $\alpha \in \Omega$.

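Proof. Let $\Omega_1, \dots, \Omega_{\ell}$ be the orbits of $\Gamma$ and let $\pi_i \colon \Gamma \to \mathrm{Sym}(\Omega_i)$ be the restriction of $\Gamma$ to $\Omega_i$. Let $\Gamma_i := \Gamma[\Omega_i] = \mathrm{im}(\pi_i)$ and $M_i := \ker(\pi_i)$. By Lemma 6.2.1 there is some $i \in [\ell]$ such that $M_i \le \ker(\varphi)$. So $\varphi$ factors across $\Gamma_i$ as $\Gamma \xrightarrow{\pi_i} \Gamma_i \xrightarrow{\psi} A_k$. It follows that $\Gamma_i^{\psi} = A_k$. Now let $\alpha \in \Omega_i$. Note that $\mathfrak{B}_0[\Omega_i] \succ \dots \succ \mathfrak{B}_m[\Omega_i]$ forms an almost $d$-ary sequence of $\Gamma_i$-invariant partitions. Thus, by Lemma 6.2.4, it follows that $(\Gamma_i)_{\alpha}^{\psi} \ne A_k$ and hence, $\Gamma_{\alpha}^{\varphi} \ne A_k$.

With this we are ready to prove the generalization of the Unaffected Stabilizers Theorem (see Theorem 5.5.8). Towards this end, recall the definition of giant representations and affected points (see Definition 5.5.6).

Theorem 6.2.7. Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a permutation group and suppose there is an almost $d$-ary sequence of $\Gamma$-invariant partitions $\{\Omega\} = \mathfrak{B}_0 \succ \dots \succ \mathfrak{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$. Furthermore let $k > \max\{8, 2 + \log_2 d\}$ and $\varphi \colon \Gamma \to S_k$ be a giant representation. Let $D \subseteq \Omega$ be the set of elements not affected by $\varphi$. Then $\Gamma_{(D)}^{\varphi} \ge A_k$.

Proof. First suppose that $\Gamma^{\varphi} = A_k$. The set $D$ is $\Gamma$-invariant (cf. Remark 5.5.7). Let $\psi \colon \Gamma \to \mathrm{Sym}(D)$ be the restriction of $\Gamma$ to $D$. Observe that $\ker(\psi) = \Gamma_{(D)}$. So $\Gamma_{(D)} \trianglelefteq \Gamma$ and hence, $\Gamma_{(D)}^{\varphi} \trianglelefteq \Gamma^{\varphi} = A_k$. Assume towards a contradiction that $\Gamma_{(D)}^{\varphi} \ne A_k$. Then $\Gamma_{(D)}^{\varphi} = \{\mathrm{id}\}$, that is, $\Gamma_{(D)} \le \ker(\varphi)$. So $\varphi$ factors across $\Delta := \Gamma^{\psi} = \Gamma[D] \le \mathrm{Sym}(D)$ as $\Gamma \xrightarrow{\psi} \Delta \xrightarrow{\rho} A_k$. Note that $\mathfrak{B}_0[D] \succ \dots \succ \mathfrak{B}_m[D]$ forms an almost $d$-ary sequence of $\Delta$-invariant partitions. It follows that $\Delta^{\rho} = A_k$ and hence, $\Delta_{\alpha}^{\rho} \ne A_k$ for some $\alpha \in D$ by Lemma 6.2.6. But $\Gamma_{\alpha}^{\varphi} = \Delta_{\alpha}^{\rho} = A_k$ since $\alpha \in D$ is not affected, which is a contradiction.

So consider the case that $\Gamma^{\varphi} = S_k$ and let $\Gamma' = \varphi^{-1}(A_k)$. Let $\varphi' = \varphi|_{\Gamma'}$. Let $D'$ be the set of points not affected by $\varphi'$. First it is argued that $D' = D$. Clearly $D' \subseteq D$ because $\Gamma_{\alpha}^{\varphi} \ge (\Gamma'_{\alpha})^{\varphi'}$ for all $\alpha \in \Omega$. So suppose there exists some $\alpha \in D \setminus D'$. Then $\Gamma_{\alpha}^{\varphi} \ge A_k$, $(\Gamma'_{\alpha})^{\varphi'} < A_k$ and $|\Gamma_{\alpha}^{\varphi} : (\Gamma'_{\alpha})^{\varphi'}| \le 2$. Overall, this gives us a subgroup of $A_k$ of index 2. But such a subgroup would be a normal subgroup, contradicting the fact that $A_k$ is simple. So $D = D'$. Then, by the previous case, $\Gamma_{(D)}^{\varphi} \ge (\Gamma'_{(D)})^{\varphi'} = (\Gamma'_{(D')})^{\varphi'} \ge A_k$.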

6.3 Recursion

Before describing how the Unaffected Stabilizers Theorem for groups that have an almost $d$-ary structure tree can be exploited algorithmically, we first discuss the recursion we aim for in order to achieve the desired running time. With the normalization of the action described in Corollary 6.1.18 the goal is to solve the String Isomorphism Problem for groups that have an almost $d$-ary structure tree in time $n^{\mathrm{polylog}(d)}$. For this, we follow the basic recursive framework of Luks's algorithm (see Algorithm 4) and Babai's quasipolynomial time isomorphism test. In this section we formulate the recursion formulas used to analyze the running time of our recursive algorithm. In particular, these formulas specify how many recursive calls to the String Isomorphism Problem over a certain domain size an algorithm can perform while keeping the desired bound on its running time.

Lemma 6.3.1. Let $k, n \in \mathbb{N}$ and suppose $n_1, \dots, n_{\ell} \le n/2$ such that $\sum_{i=1}^{\ell} n_i \le 2^k n$. Then $\sum_{i=1}^{\ell} \left(\frac{n_i}{n}\right)^{k+1} \le 1$.

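Proof. For $i \in [\ell]$ define $\alpha_i := \frac{n_i}{n}$. Observe that $\alpha_i \le \frac{1}{2}$ and $\sum_{i=1}^{\ell} \alpha_i \le 2^k$. Now suppose towards a contradiction that there are $\ell \in \mathbb{N}$ and nonnegative reals $\alpha_1, \dots, \alpha_{\ell} \in \mathbb{R}$ meeting these assumptions such that $\sum_{i=1}^{\ell} \alpha_i^{k+1} > 1$. Pick $\ell \in \mathbb{N}$, $\alpha_1, \dots, \alpha_{\ell} \in \mathbb{R}$ such that

(i) $\alpha_i \le \frac{1}{2}$ for all $i \in [\ell]$ and $\sum_{i=1}^{\ell} \alpha_i \le 2^k$,

(ii) $\sum_{i=1}^{\ell} \alpha_i^{k+1} > 1$,

(iii) $\ell$ is minimal subject to Conditions (i) and (ii),

(iv) $|\{i \in [\ell] \mid \alpha_i = \frac{1}{2}\}|$ is maximal subject to Conditions (i) - (iii).

Then $\alpha_i + \alpha_j > \frac{1}{2}$ for all distinct $i, j \in [\ell]$. Let $A = \{i \in [\ell] \mid \alpha_i \ne \frac{1}{2}\}$ and suppose $|A| \ge 2$. Let $i, j \in A$ be distinct. Then

\[ \alpha_i^{k+1} + \alpha_j^{k+1} \le \left(\frac{1}{2}\right)^{k+1} + \left(\alpha_i + \alpha_j - \frac{1}{2}\right)^{k+1} \]

which contradicts Condition (iv). Hence $|A| \le 1$. Condition (iii) implies $\alpha_i > 0$ for all $i$. Hence, $(\ell - 1)\frac{1}{2} < \sum_{i=1}^{\ell} \alpha_i \le 2^k$, which implies $\ell \le 2^{k+1}$. Therefore, $\sum_{i=1}^{\ell} \alpha_i^{k+1} \le \ell \left(\frac{1}{2}\right)^{k+1} \le 1$, contradicting Condition (ii).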
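For small parameters the bound of Lemma 6.3.1 can also be confirmed exhaustively. The sketch below is an illustration under stated assumptions, not part of the thesis: the function name `max_power_sum` is hypothetical, and an unbounded-knapsack dynamic program replaces explicit enumeration of all multisets $\{n_1, \dots, n_{\ell}\}$. Exact rational arithmetic avoids rounding issues in the extremal case $n_i = n/2$, $\ell = 2^{k+1}$, where the sum equals 1.

```python
from fractions import Fraction


def max_power_sum(n, k):
    """Maximum of sum((n_i / n) ** (k + 1)) over all multisets of parts
    with 1 <= n_i <= n / 2 and sum(n_i) <= 2 ** k * n."""
    budget = (2 ** k) * n
    # Contribution of a single part of size p.
    weights = {p: Fraction(p, n) ** (k + 1) for p in range(1, n // 2 + 1)}
    # best[s] = largest achievable value using parts of total size exactly s.
    best = [Fraction(0)] * (budget + 1)
    for total in range(1, budget + 1):
        best[total] = max(
            (best[total - p] + w for p, w in weights.items() if p <= total),
            default=Fraction(0),
        )
    return max(best)
```

For instance, with $n = 6$ and $k = 2$ the maximum is attained by eight parts of size 3, giving exactly 1, which matches the tight case in the proof.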

Lemma 6.3.2. Let $k \in \mathbb{N}$ and $t \colon \mathbb{N} \to \mathbb{N}$ be a function such that $t(1) = 1$. Suppose that for every $n \ge 2$ there are natural numbers $n_1, \dots, n_{\ell}$ for which one of the following holds:

(A) $t(n) \le \sum_{i=1}^{\ell} t(n_i)$ where $\sum_{i=1}^{\ell} n_i \le 2^k n$ and $n_i \le n/2$ for all $i \in [\ell]$, or

(B) $t(n) \le \sum_{i=1}^{\ell} \ell^k\, t(n_i)$ where $\ell \ge 2$, $\ell \mid n$ and $n_i = \frac{n}{\ell}$ for all $i \in [\ell]$, or

(C) $t(n) \le \sum_{i=1}^{\ell} t(n_i)$ where $\sum_{i=1}^{\ell} n_i \le n$ and $\ell \ge 2$.

Then $t(n) \le n^{k+1}$.

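Proof. The statement is proved by induction on $n \in \mathbb{N}$. The base step is clear from the assumptions of the lemma. So suppose $n \ge 2$. For the first option it holds that

\[ t(n) \le \sum_{i=1}^{\ell} t(n_i) \overset{\text{I.H.}}{\le} \sum_{i=1}^{\ell} n_i^{k+1} = n^{k+1} \sum_{i=1}^{\ell} \left(\frac{n_i}{n}\right)^{k+1} \le n^{k+1} \]

by Lemma 6.3.1. For the second case we have

\[ t(n) \le \sum_{i=1}^{\ell} \ell^k \cdot t\!\left(\frac{n}{\ell}\right) \overset{\text{I.H.}}{\le} \sum_{i=1}^{\ell} \ell^k \left(\frac{n}{\ell}\right)^{k+1} = n^{k+1}. \]

Finally, for the last case it holds that

\[ t(n) \le \sum_{i=1}^{\ell} t(n_i) \overset{\text{I.H.}}{\le} \sum_{i=1}^{\ell} n_i^{k+1} \le \left(\sum_{i=1}^{\ell} n_i\right)^{k+1} \le n^{k+1}. \]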
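The statement of Lemma 6.3.2 can be sanity-checked numerically by computing the largest function $t$ that the three options permit. The following sketch is an illustration only: the names `best_total` and `extremal_t` are hypothetical, and option (C) is slightly relaxed by also admitting a single part smaller than $n$, which can only loosen the check. Since option (B) with $\ell = n$ already yields $n^{k+1}$, the extremal function should coincide with $n^{k+1}$ exactly.

```python
def best_total(values, max_part, budget):
    """Max of sum(values[p]) over multisets of parts in 1..max_part
    whose sizes add up to at most `budget` (unbounded knapsack)."""
    best = [0] * (budget + 1)
    for total in range(1, budget + 1):
        best[total] = max(
            (best[total - p] + values[p] for p in range(1, min(max_part, total) + 1)),
            default=0,
        )
    return max(best)


def extremal_t(limit, k):
    """Largest t(1..limit) with t(1) = 1 that is consistent with options (A)-(C)."""
    t = [0, 1]  # index 0 is unused
    for n in range(2, limit + 1):
        # (A): parts of size at most n/2 with total size at most 2^k * n.
        option_a = best_total(t, n // 2, (2 ** k) * n)
        # (B): l equal parts of size n/l, each weighted by l^k.
        option_b = max(l ** (k + 1) * t[n // l] for l in range(2, n + 1) if n % l == 0)
        # (C): parts of total size at most n; every part is smaller than n.
        option_c = best_total(t, n - 1, n)
        t.append(max(option_a, option_b, option_c))
    return t
```

A call such as `extremal_t(24, 2)` returns a list whose entry at index `n` can be compared against `n ** 3`.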

Observe that the recurrences described in Conditions (B) and (C) are exactly the ones occurring in Luks's algorithm (see Corollary 5.4.18). The first condition gives another type of recurrence that is used to analyze the methods based on the Local Certificates Routine. In this situation the domain sizes of the problems in the recursion might vary drastically. It turns out that, contrary to the analysis of Babai's algorithm, it is not sufficient in this case to apply Condition (B) in order to obtain the desired overall bound on the running time. Hence, the first condition adds a crucial recurrence for a more precise analysis of the Local Certificates Routine, which is required for proving the desired bound on the running time of our algorithm.

6.4 Local Certificates

Having generalized Babai's Unaffected Stabilizers Theorem and described the desired recursion, the next step is to adapt the Local Certificates Routine (see [11]). Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a group and suppose there is an almost $d$-ary structure tree for $\Gamma$. Also let $x, y \colon \Omega \to \Sigma$ be two strings and let $\varphi \colon \Gamma \to S_k$ be a giant representation. The goal of the Local Certificates Routine is essentially to achieve one of the following two options making only a few recursive calls to the String Isomorphism Problem over a significantly smaller domain size.

The first option is to compute isomorphism-invariant relational structures $\mathfrak{A}_i$, $i \in \{1, 2\}$, defined on the set $[k]$ such that the automorphism group $\mathrm{Aut}(\mathfrak{A}_1)$ is significantly smaller than the symmetric group $S_k$. By computing the isomorphisms between $\mathfrak{A}_1$ and $\mathfrak{A}_2$ this allows the algorithm to significantly reduce the size of the group $\Gamma$. However, this is not always possible. Indeed, $(\mathrm{Aut}_{\Gamma}(x_1))^{\varphi} \le \mathrm{Aut}(\mathfrak{A}_1)$ since $\mathfrak{A}_1$ is supposed to be defined in an isomorphism-invariant way. So if $(\mathrm{Aut}_{\Gamma}(x_1))^{\varphi} \ge A_k$ it is impossible to find such structures.

Consequently, the second option is to find many automorphisms $\Delta \le \mathrm{Aut}_{\Gamma}(x)$ such that $\Delta^{\varphi}$ forms a giant on a large subset of $[k]$. This turns the Local Certificates Routine into a powerful subroutine. It can be used to decide whether the set of $\Gamma$-isomorphisms from $x$ to $y$ generates a giant group on the set $[k]$ and moreover, in both the positive and the negative case, it produces a meaningful certificate.

6.4.1 The Algorithm

Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a permutation group and let $x \colon \Omega \to \Sigma$ be a string. Furthermore let $\varphi \colon \Gamma \to S_k$ be a giant representation. For the description of the Local Certificates Routine we extend the notation of set- and point-wise stabilizers for the group $\Gamma$ to the action on the set $[k]$ defined via the giant representation $\varphi$. For a set $T \subseteq [k]$ let $\Gamma_T = \varphi^{-1}((\Gamma^{\varphi})_T)$ and $\Gamma_{(T)} = \varphi^{-1}((\Gamma^{\varphi})_{(T)})$. The basic approach of the Local Certificates Routine is to consider test sets $T \subseteq [k]$ of logarithmic size.

Definition 6.4.1. A test set $T \subseteq [k]$ is full if $(\mathrm{Aut}_{\Gamma_T}(x))^{\varphi}[T] \ge \mathrm{Alt}(T)$. A certificate of fullness is a subgroup $\Delta \le \mathrm{Aut}_{\Gamma_T}(x)$ such that $\Delta^{\varphi}[T] \ge \mathrm{Alt}(T)$. A certificate of non-fullness is a non-giant $\Lambda \le \mathrm{Sym}(T)$ such that $(\mathrm{Aut}_{\Gamma_T}(x))^{\varphi}[T] \le \Lambda$.

The central part of the algorithm is to determine for each test set $T \subseteq [k]$ (of a certain size $t = |T|$ to be determined later) whether $T$ is full and, depending on the outcome, compute a certificate of fullness or a certificate of non-fullness. This is achieved by the following lemma, which crucially depends on our variant of the Unaffected Stabilizers Theorem (Theorem 6.2.7).

Let $W \subseteq \Omega$ be $\Gamma$-invariant and let $y \colon \Omega \to \Sigma$ be a second string. Recall that $\mathrm{Iso}_{\Gamma}^{W}(x, y) = \{\gamma \in \Gamma \mid \forall \alpha \in W \colon x(\alpha) = y(\alpha^{\gamma})\}$ and $\mathrm{Aut}_{\Gamma}^{W}(x) = \mathrm{Iso}_{\Gamma}^{W}(x, x)$.

For $\Delta \le \Gamma$ define $\mathrm{Aff}(\Delta, \varphi) = \{\alpha \in \Omega \mid \Delta_{\alpha}^{\varphi} \not\ge A_k\}$ to be the set of points affected by $\varphi$ for the group $\Delta$. Note that for $\Delta_1 \le \Delta_2 \le \Gamma$ it holds that $\mathrm{Aff}(\Delta_1, \varphi) \supseteq \mathrm{Aff}(\Delta_2, \varphi)$. Finally, remember that $n$ always denotes the size of the permutation domain $\Omega$.

Lemma 6.4.2. Let $x \colon \Omega \to \Sigma$ be a string and $\Gamma \le \mathrm{Sym}(\Omega)$ a group that has an almost $d$-ary structure tree. Furthermore suppose there is a giant representation $\varphi \colon \Gamma \to S_k$ and let $T \subseteq [k]$ be a set of size $|T| =: t > \max\{8, 2 + \log_2 d\}$. Then there are natural numbers $n_1, \dots, n_{\ell} \le n/2$ such that $\sum_{i=1}^{\ell} n_i \le n$ and, for each $i \in [\ell]$ using at most $t!$ recursive calls to the String Isomorphism Problem over domain size $n_i$, and $O(t! \cdot n^c)$ additional computation, one can decide whether $T$ is full and generate a corresponding certificate.
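To make the window-restricted sets $\mathrm{Iso}_{\Gamma}^{W}(x, y)$ and $\mathrm{Aut}_{\Gamma}^{W}(x)$ recalled above concrete, the following brute-force sketch enumerates them for a tiny group. It is only an illustration under stated assumptions: the group is given as an explicit list of permutation tuples (`g[a]` plays the role of $\alpha^{\gamma}$), the function names are hypothetical, and the example group $\langle (0\,1), (2\,3) \rangle$ is chosen so that the window $\{0, 1\}$ is invariant. The actual routine never enumerates groups explicitly; it works with generating sets and recursion.

```python
def window_isomorphisms(group, x, y, window):
    """All g in `group` with x[a] == y[g[a]] for every position a in `window`."""
    return [g for g in group if all(x[a] == y[g[a]] for a in window)]


def window_automorphisms(group, x, window):
    """Iso restricted to `window` from x to itself."""
    return window_isomorphisms(group, x, x, window)


# Illustrative group (assumed for this sketch): generated by (0 1) and (2 3).
KLEIN = [(0, 1, 2, 3), (1, 0, 2, 3), (0, 1, 3, 2), (1, 0, 3, 2)]

x = "abcd"
y = "bacd"

# Only positions 0 and 1 are compared, so the action on {2, 3} is unconstrained.
on_small_window = window_isomorphisms(KLEIN, x, y, [0, 1])
# Comparing all positions additionally forces 2 and 3 to be fixed.
on_full_window = window_isomorphisms(KLEIN, x, y, range(4))
```

Enlarging the window can only shrink the resulting set, which mirrors the chain $\Gamma_0 \ge \Gamma_1 \ge \dots$ maintained by the Local Certificates Routine.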

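Algorithm 5: LocalCertificates

Input: A string $x \colon \Omega \to \Sigma$, a group $\Gamma \le \mathrm{Sym}(\Omega)$ that has an almost $d$-ary structure tree, and a giant representation $\varphi \colon \Gamma \to S_k$ with $k > \max\{8, 2 + \log_2 d\}$.
Output: A non-giant $\Lambda \le S_k$ with $(\mathrm{Aut}_{\Gamma}(x))^{\varphi} \le \Lambda$ or $\Delta \le \mathrm{Aut}_{\Gamma}(x)$ with $\Delta^{\varphi} \ge A_k$.

    $\Gamma_0 := \Gamma$
    $W_0 := \emptyset$
    $i := 0$
    while $\Gamma_i^{\varphi} \ge A_k$ and $W_i \ne \mathrm{Aff}(\Gamma_i, \varphi)$ do
        $W_{i+1} := \mathrm{Aff}(\Gamma_i, \varphi)$
        $W_{i+1}^* := W_{i+1} \setminus W_i$
        if $|W_{i+1}^*| \le \frac{1}{2}|\Omega|$ then
            $\Gamma_{i+1} := \mathrm{Aut}_{\Gamma_i}^{W_{i+1}^*}(x)$
        else
            $\Gamma_{i+1} := \emptyset$
            $N := \{\gamma \in \Gamma_i \mid \gamma^{\varphi} = \mathrm{id}\}$
            for $\delta \in \Gamma_i^{\varphi}$ do
                compute $\gamma \in \Gamma_i$ with $\varphi(\gamma) = \delta$
                $\Gamma_{i+1} := \Gamma_{i+1} \cup \mathrm{Aut}_{N\gamma}^{W_{i+1}^*}(x)$
            end
        end
        $i := i + 1$
    end
    if $\Gamma_i^{\varphi} \not\ge A_k$ then
        return $\Gamma_i^{\varphi}$
    else
        return $(\Gamma_i)_{(\Omega \setminus W_i)}$
    end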

Proof. Without loss of generality suppose $T = [k]$. Otherwise, compute the group $\Gamma_T$ and restrict the image of $\varphi$ to the set $T$.

Consider Algorithm 5. In each iteration it first computes an updated window $W_{i+1} := \mathrm{Aff}(\Gamma_i, \varphi)$ and then updates the group $\Gamma_{i+1} := \mathrm{Aut}_{\Gamma_i}^{W_{i+1}}(x) \le \Gamma_i$. By induction it is easy to prove that $W_{i+1} \supseteq W_i$ and $\Gamma_{i+1} \le \Gamma_i$ for all $i \in \mathbb{Z}_{\ge 0}$. The algorithm terminates when the current group $\Gamma_i^{\varphi}$ is not a giant or the window stops growing.

Let $\ell$ be the value of the variable $i$ at the end of the while-loop. Furthermore let $W := W_{\ell}$. Note that $\{W_j^* \mid 1 \le j \le \ell\}$ forms a partition of the set $W$.

We first argue that the algorithm is correct. For every $0 \le j \le \ell$ it holds that $\mathrm{Aut}_{\Gamma}(x) \le \Gamma_j \le \Gamma$. First suppose $\Gamma_{\ell}^{\varphi} \not\ge A_k$. Then $\Gamma_{\ell}^{\varphi}$ forms a certificate of non-fullness. Otherwise $\Gamma_{\ell}^{\varphi} \ge A_k$ and $W = \mathrm{Aff}(\Gamma_{\ell}, \varphi)$. By Observation 6.1.4 the group $\Gamma_{\ell} \le \Gamma$ has an almost $d$-ary structure tree. Hence, $((\Gamma_{\ell})_{(\Omega \setminus W)})^{\varphi} \ge A_k$ by Theorem 6.2.7. Furthermore, it is easy to check that $\Gamma_{\ell}$ respects the string $x$ on all positions in $W_j$ for all $0 \le j \le \ell$. So $(\Gamma_{\ell})_{(\Omega \setminus W)} \le \mathrm{Aut}_{\Gamma}(x)$ because it respects all positions in the set $W$ and fixes all positions outside of $W$. This means $(\Gamma_{\ell})_{(\Omega \setminus W)}$ forms a certificate of fullness.

It remains to analyze the running time of the algorithm. Let $n_j := |W_j^*|$ for all $j \in [\ell]$ and consider the following two cases.

Case $n_j \le n/2$ for all $j \in [\ell]$: Then $\sum_{j=1}^{\ell} n_j = |W| \le n$. Also, in the $j$-th iteration of the while-loop the procedure makes one recursive call to the String Isomorphism Problem over domain size $|W_j^*| = n_j$. All other steps can be performed in polynomial time using Theorem 5.1.4.

Case $n_j > n/2$ for a unique $j \in [\ell]$: Let $A_1, \dots, A_m$ be the orbits of $N[W_j^*]$. Note that $n_j = \sum_{i \in [m]} |A_i|$. Since each point $\alpha \in W_j^*$ is affected by $\varphi$ with respect to the group $\Gamma_{j-1}$ it holds that $|A_i| \le |W_j|/k \le n/2$ for all $i \in [m]$ by Lemma 5.5.9. Using orbit-by-orbit processing the algorithm computes $\mathrm{Aut}_{N\gamma}^{W_j^*}(x)$ making $m$ recursive calls to the String Isomorphism Problem over domain sizes $|A_1|, \dots, |A_m|$. Since $|\Gamma_{j-1}^{\varphi}| \le k! = t!$ this amounts to, for each $i \in [m]$, making at most $t!$ recursive calls to the String Isomorphism Problem over domain size $|A_i|$. In this case, all other steps can be performed in time $O(t!\, n^c)$ using Theorem 5.1.4. For the other windows $W_i^*$, $i \in [\ell]$, $i \ne j$, the analysis is the same as in the previous case.

6.4.2 Comparing Local Certificates

The algorithm described in the last subsection only works for a single string. However, the input to the String Isomorphism Problem consists of two strings that must be compared. If the algorithm described above returns a certificate of non-fullness, this may actually not be very helpful since it only restricts possible automorphisms of the first string, but it does not restrict the set of candidate isomorphisms to the second string. The following lemma resolves this problem by adapting the Local Certificates Routine accordingly.

Lemma 6.4.3. Let $x_1, x_2 \colon \Omega \to \Sigma$ be two strings and $\Gamma \le \mathrm{Sym}(\Omega)$ a group that has an almost $d$-ary structure tree. Furthermore suppose there is a giant representation $\varphi \colon \Gamma \to S_k$. Let $T_1, T_2 \subseteq [k]$ be sets of equal size $t := |T_1| = |T_2| > \max\{8, 2 + \log_2 d\}$ and suppose $T_1$ is not full with respect to $x_1$. Then there are natural numbers $n_1, \dots, n_{\ell} \le n/2$ such that $\sum_{i=1}^{\ell} n_i \le n$ and, for each $i \in [\ell]$ using $t!$ recursive calls to the String Isomorphism Problem over domain size $n_i$ and $O(t!\, n^c)$ additional computation, one can compute a non-giant group $\Lambda \le \mathrm{Sym}(T_1)$ and a bijection $\lambda \colon T_1 \to T_2$ such that

\[ \left\{ \gamma^{\varphi}|_{T_1} \;\middle|\; \gamma \in \mathrm{Iso}_{\Gamma}(x_1, x_2) \wedge T_1^{(\gamma^{\varphi})} = T_2 \right\} \subseteq \Lambda\lambda. \tag{6.3} \]

Moreover, the set of bijections $\Lambda\lambda$ is isomorphism-invariant for the two test sets with respect to $x_1$, $x_2$, $\Gamma$ and the giant representation $\varphi$.

At this point, it may not be clear what it means for $\Lambda\lambda$ to be isomorphism-invariant. For this work, the following will be sufficient. Let $T_1', T_2' \subseteq [k]$ be another pair of test sets such that $T_i' = T_i^{\varphi(\gamma_i)}$ for some $\gamma_i \in \mathrm{Aut}_{\Gamma}(x_i)$ for both $i \in \{1, 2\}$. Also, let $\Lambda'\lambda'$ be the set of bijections computed by the algorithm for the pair $T_1', T_2'$. Then

\[ \Lambda'\lambda' = \big(\varphi(\gamma_1^{-1})\Lambda\varphi(\gamma_1)\big)\big(\varphi(\gamma_1^{-1})\lambda\varphi(\gamma_2)\big) = \varphi(\gamma_1^{-1})\Lambda\lambda\varphi(\gamma_2) \]

restricting all mappings accordingly.

Proof. The proof of this lemma is very similar to the proof of Lemma 6.4.2. Consider Algorithm 6. First suppose towards a contradiction there is some $i$ such that $W_{i+1} = W_i$. Then $((\Gamma_i)_{(\Omega \setminus W_i)})^{\psi} \ge \mathrm{Alt}(T_1)$ by Theorem 6.2.7. Furthermore $(\Gamma_i)_{(\Omega \setminus W_i)} \le \mathrm{Aut}_{\Gamma}(x_1)$. So together $(\mathrm{Aut}_{\Gamma_{T_1}}(x_1))^{\psi} \ge \mathrm{Alt}(T_1)$, contradicting the fact that $T_1$ is not full with respect to the string $x_1$. So the algorithm terminates and returns a non-giant group $\Lambda \le \mathrm{Sym}(T_1)$ and a bijection $\lambda \colon T_1 \to T_2$ with the desired properties. The complexity analysis is completely analogous to Lemma 6.4.2.

Finally, for the isomorphism-invariance of the set $\Lambda\lambda$ let $T_1', T_2' \subseteq [k]$ be a second pair of test sets such that $T_i' = T_i^{\varphi(\gamma_i)}$ for some $\gamma_i \in \mathrm{Aut}_{\Gamma}(x_i)$ for both $i \in \{1, 2\}$. Let $\Gamma_i'\sigma_i'$ be the sets computed by the algorithm for the test sets $T_1', T_2'$. It suffices to argue that $\Gamma_i'\sigma_i' = \gamma_1^{-1}\Gamma_i\gamma_1\gamma_1^{-1}\sigma_i\gamma_2$. This statement can be proved by induction on $i$. For $i = 0$ this follows immediately from the definitions. For the inductive step first consider the windows $W_{i+1} = \mathrm{Aff}(\Gamma_i, \psi)$ and $W_{i+1}' = \mathrm{Aff}(\Gamma_i', \psi')$. Then $W_{i+1}' = W_{i+1}^{\gamma_1}$. So

\begin{align*}
\Gamma_{i+1}'\sigma_{i+1}' &= \mathrm{Iso}_{\Gamma_i'\sigma_i'}^{W_i'}(x_1, x_2) \\
&= \mathrm{Iso}_{\gamma_1^{-1}\Gamma_i\sigma_i\gamma_2}^{W_i^{\gamma_1}}(x_1, x_2) && \text{I.H.} \\
&= \mathrm{Iso}_{\gamma_1^{-1}\Gamma_i\sigma_i}^{W_i^{\gamma_1}}\big(x_1, x_2^{\gamma_2^{-1}}\big)\gamma_2 && \text{Equation (5.4)} \\
&= \gamma_1^{-1}\,\mathrm{Iso}_{\Gamma_i\sigma_i}^{W_i^{\gamma_1\gamma_1^{-1}}}\big(x_1^{\gamma_1^{-1}}, x_2^{\gamma_2^{-1}}\big)\gamma_2 \\
&= \gamma_1^{-1}\,\mathrm{Iso}_{\Gamma_i\sigma_i}^{W_i}(x_1, x_2)\,\gamma_2 && \gamma_i \in \mathrm{Aut}_{\Gamma}(x_i) \\
&= \gamma_1^{-1}\Gamma_{i+1}\sigma_{i+1}\gamma_2.
\end{align*}

By the induction principle this implies the isomorphism-invariance of the output of the algorithm.

Algorithm 6: CompareLocalCertificates

Input: Strings $x_1, x_2 \colon \Omega \to \Sigma$, a group $\Gamma \le \mathrm{Sym}(\Omega)$ that has an almost $d$-ary structure tree, $\varphi \colon \Gamma \to S_k$, and $T_1, T_2 \subseteq [k]$ of size $t > \max\{8, 2 + \log_2 d\}$ such that $T_1$ is not full with respect to $x_1$.
Output: A non-giant group $\Lambda \le \mathrm{Sym}(T_1)$ and bijection $\lambda \colon T_1 \to T_2$ such that
\[ \left\{ \gamma^{\varphi}|_{T_1} \;\middle|\; \gamma \in \mathrm{Iso}_{\Gamma}(x_1, x_2) \wedge T_1^{(\gamma^{\varphi})} = T_2 \right\} \subseteq \Lambda\lambda. \]

    compute $\sigma_0 \in \Gamma$ such that $T_1^{(\sigma_0^{\varphi})} = T_2$
    $\Gamma_0 := \Gamma_{T_1}$
    $W_0 := \emptyset$
    $i := 0$
    $\psi \colon \Gamma_0 \to \mathrm{Sym}(T_1)$ is the homomorphism obtained from $\varphi$ by restricting the image to the set $T_1$
    while $\Gamma_i^{\psi} \ge \mathrm{Alt}(T_1)$ do
        $W_{i+1} := \mathrm{Aff}(\Gamma_i, \psi)$
        $W_{i+1}^* := W_{i+1} \setminus W_i$
        if $|W_{i+1}^*| \le \frac{1}{2}|\Omega|$ then
            $\Gamma_{i+1}\sigma_{i+1} := \mathrm{Iso}_{\Gamma_i\sigma_i}^{W_{i+1}^*}(x_1, x_2)$
        else
            $\Gamma_{i+1} := \emptyset$
            $N := \{\gamma \in \Gamma_i \mid \gamma^{\psi} = \mathrm{id}\}$
            $\ell := 0$
            for $\gamma \in \Gamma_i^{\psi}$ do
                compute $\bar\gamma \in \Gamma_i$ with $\psi(\bar\gamma) = \gamma$
                $\Delta_{\ell}\delta_{\ell} := \mathrm{Iso}_{N\bar\gamma\sigma_i}^{W_{i+1}^*}(x_1, x_2)$
                $\ell := \ell + 1$
            end
            $\Gamma_{i+1}\sigma_{i+1} := \bigcup_{j \le \ell} \Delta_j\delta_j$
        end
        $i := i + 1$
    end
    return $(\Gamma_i^{\psi}, (\sigma_i^{\varphi})|_{T_1})$

6.4.3 Aggregating Local Certificates

Let $x_1, x_2 \colon \Omega \to \Sigma$ be two strings, $\Gamma \le \mathrm{Sym}(\Omega)$ a group that has an almost $d$-ary structure tree and $\varphi \colon \Gamma \to S_k$ a giant representation. Recall that in this situation the goal is either to find a group $\Delta \le \mathrm{Aut}_{\Gamma}(x_1)$ such that $\Delta^{\varphi}$ is large or to compute (a small set of) isomorphism-invariant relational structures $\mathfrak{A}_i$, $i \in \{1, 2\}$, defined on the set $[k]$ that are far away from being symmetric. To achieve this goal, an algorithm computes local certificates for all test sets and all pairs of test sets of logarithmic size. The certificates of fullness can be combined into a group of automorphisms whereas certificates of non-fullness may be combined into a relational structure that has only few automorphisms.

Up to this point, it is still not clear what exactly it means for a relational structure to be far away from being symmetric. This is formalized in a precise way by the next definition.

Definition 6.4.4 (Symmetry Defect). Let $\Gamma \le \mathrm{Sym}(\Omega)$ be a group. The symmetry defect of $\Gamma$ is the minimal $t \in [n]$ such that there is a set $M \subseteq \Omega$ of size $|M| = n - t$ such that $\mathrm{Alt}(M) \le \Gamma$ (the group $\mathrm{Alt}(M)$ fixes all elements of $\Omega \setminus M$). In this case the relative symmetry defect of $\Gamma$ is $t/n$. For any relational structure $\mathfrak{A}$ we define the (relative) symmetry defect of $\mathfrak{A}$ to be the (relative) symmetry defect of its automorphism group $\mathrm{Aut}(\mathfrak{A})$.

A crucial property is that groups of large symmetry defect are much smaller than the giants.

Theorem 6.4.5 (cf. [45], Theorem 5.2A,B). Let $A_n \le S \le S_n$ and suppose n > 9. Let Γ ≤ S and r < n/2. Suppose that $|S : \Gamma| < \binom{n}{r}$. Then the symmetry defect of Γ is strictly less than r.

Actually, the following corollary turns out to be sufficient for us.

Corollary 6.4.6. Let $A_n \le S \le S_n$ be a giant group and suppose n ≥ 24. Let Γ ≤ S and suppose the relative symmetry defect of Γ is at least 1/4. Then $|S : \Gamma| \ge (4/3)^n$.

Proof. Let $r = \lfloor n/4 \rfloor$. Then the symmetry defect of Γ is at least r. Hence, $|S : \Gamma| \ge \binom{n}{r}$ by Theorem 6.4.5. Moreover,

$$\binom{n}{\lfloor n/4 \rfloor} \ge \left(\frac{n}{\lfloor n/4 \rfloor}\right)^{\lfloor n/4 \rfloor} \ge 4^{(n/4)-1} = \frac{1}{4} \cdot \sqrt{2}^{\,n}.$$

Since n ≥ 24 it holds that $\frac{1}{4} \cdot \sqrt{2}^{\,n} \ge (4/3)^n$.
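The inequality chain in the proof of Corollary 6.4.6 can be sanity-checked numerically. The sketch below uses exact integer arithmetic only; the function names and the tested range of n are illustrative choices, and the rewritten forms of the inequalities (clearing denominators, raising to the fourth power, squaring) are spelled out in the comments.

```python
from math import comb


def chain_holds(n):
    """Check binom(n, r) >= (n/r)^r >= 4^(n/4 - 1) for r = floor(n/4), r >= 1."""
    r = n // 4
    # binom(n, r) >= (n/r)^r  <=>  binom(n, r) * r^r >= n^r
    step1 = comb(n, r) * r**r >= n**r
    # (n/r)^r >= 4^(n/4 - 1)  <=>  n^(4r) * 4^4 >= 4^n * r^(4r)   (4th powers)
    step2 = n**(4 * r) * 4**4 >= 4**n * r**(4 * r)
    return step1 and step2


def final_bound_holds(n):
    """Check (1/4) * sqrt(2)^n >= (4/3)^n, i.e. 18^n >= 16^(n+1) after squaring."""
    return 18**n >= 16**(n + 1)


print(all(chain_holds(n) and final_bound_holds(n) for n in range(24, 200)))
print(final_bound_holds(23))
```

The second check indicates why the hypothesis n ≥ 24 appears in the corollary: the last inequality fails for n = 23.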

Lemma 6.4.7. Let $x_1, x_2\colon \Omega \to \Sigma$ be two strings and Γ ≤ Sym(Ω) a group that has an almost d-ary structure tree. Furthermore suppose there is a giant representation $\varphi\colon \Gamma \to S_k$. Let $\max\{8, 2 + \log_2 d\} < t < k/10$.
Then there are natural numbers $\ell \in \mathbb{N}$ and $n_1, \dots, n_\ell \le n/2$ such that $\sum_{i=1}^{\ell} n_i \le k^{O(t)} n$ and, for each $i \in [\ell]$ using a recursive call to the String Isomorphism Problem over domain size $n_i$, and $k^{O(t)} n^c$ additional computation, one obtains for i = 1, 2 one of the following:

1. a family of $r \le k^6$ many t-ary relational structures $A_{i,j}$, for $j \in [r]$, associated with $x_i$, each with domain $V_{i,j} \subseteq [k]$ of size $|V_{i,j}| \ge \frac{3}{4}k$ and with relative symmetry defect at least $\frac{1}{4}$ such that $\{A_{1,1}, \dots, A_{1,r}\}^{\varphi(\gamma)} = \{A_{2,1}, \dots, A_{2,r}\}$ for every $\gamma \in \mathrm{Iso}_\Gamma(x_1, x_2)$, or

2. a subset $M_i \subseteq [k]$ associated with $x_i$ of size $|M_i| \ge \frac{3}{4}k$ and $\Delta_i \le \mathrm{Aut}_{\Gamma_{M_i}}(x_i)$ such that $(\Delta_i^{\varphi})[M_i] \ge \mathrm{Alt}(M_i)$ and $M_1^{\varphi(\gamma)} = M_2$ for every $\gamma \in \mathrm{Iso}_\Gamma(x_1, x_2)$.

The proof is completely analogous to the proof of [11, Theorem 24] replacing the methods to compute the local certificates. In order to present the proof we first require some additional background.

Definition 6.4.8 (Degree of Transitivity). A permutation group Γ ≤ Sym(Ω) is t-transitive if its natural induced action on the set of n(n−1)···(n−t+1) ordered t-tuples of distinct elements is transitive. The degree of transitivity d(Γ) is the largest t such that Γ is t-transitive.

Theorem 6.4.9 (CFSG). Let Γ ≤ Sym(Ω) be a non-giant group. Then d(Γ) ≤ 5.

A slightly weaker statement, namely d(Γ) ≤ 7 for all non-giant permutation groups, can be shown using only Schreier’s Hypothesis (see [45, Theorem 7.3A]).

A graph G is regular if every vertex has the same degree d. A regular graph on n vertices is non-trivial if 0 < d < n − 1, i.e., the graph G is not the complete graph and contains at least one edge.

Lemma 6.4.10 (cf. [10], Corollary 2.4.13). Let G be a non-trivial regular graph. Then the relative symmetry defect of G is at least 1/2.

Proof of Lemma 6.4.7. For every t-element subset T ⊆ [k] determine whether T is full (with respect to $x_i$) and compute a corresponding certificate using Lemma 6.4.2. Let $\Lambda_i \le \mathrm{Sym}(\Omega)$ be the group generated by the fullness-certificates for all full subsets T ⊆ [k] with respect to string $x_i$. Note that $\Lambda_i \le \mathrm{Aut}_\Gamma(x_i)$ for both i ∈ [2]. Also observe that the group $\Lambda_i$ is defined in an isomorphism-invariant manner, meaning that $\gamma^{-1}\Lambda_1\gamma = \Lambda_2$ for all $\gamma \in \mathrm{Iso}_\Gamma(x_1, x_2)$. Let $S_i \subseteq [k]$ be the support of $\Lambda_i^{\varphi}$, i.e., $S_i := \{\alpha \in [k] \mid |\alpha^{(\Lambda_i^{\varphi})}| \ge 2\}$. Once again, the sets $S_i$ are isomorphism-invariant, meaning that $S_1^{\varphi(\gamma)} = S_2$ for every $\gamma \in \mathrm{Iso}_\Gamma(x_1, x_2)$. In particular, we may assume that $|S_1| = |S_2|$ (otherwise the strings are not Γ-isomorphic and we may output some trivial non-isomorphic relational structures). Now we distinguish between three cases depending on the size of $S_i$.

Case $\frac{1}{4}k \le |S_i| \le \frac{3}{4}k$: This case is simple by setting r := 1, $A_{1,1} := ([k], S_1)$, and $A_{2,1} := ([k], S_2)$. It is easy to verify that this satisfies Option 1 of the Lemma.

Case $|S_i| > \frac{3}{4}k$: We further distinguish between three subcases. First assume every orbit of $\Lambda_i^{\varphi}$ has size at most $\frac{3}{4}k$. Then the partition into the orbits of $\Lambda_i^{\varphi}$ gives a canonical structure $A_{i,1}$ with domain [k] and relative symmetry defect at least $\frac{1}{4}$. More precisely, set r := 1 and define

$$A_{i,1} := (S_i, \{(\alpha, \beta) \mid \alpha \in S_i, \beta \in \alpha^{(\Lambda_i^{\varphi})}\}).$$

So suppose there is a (unique) orbit $M_i \subseteq [k]$ of size $|M_i| \ge \frac{3}{4}k$. If $\Lambda_i^{\varphi}[M_i] \ge \mathrm{Alt}(M_i)$ then the second option of the Lemma is satisfied.

Hence suppose $\Lambda_i^{\varphi}[M_i]$ is not a giant. By Theorem 6.4.9 the degree of transitivity satisfies $d(\Lambda_i^{\varphi}[M_i]) \le 5$. Let $F_i \subseteq M_i$ be an arbitrary set of size $d(\Lambda_i^{\varphi}[M_i]) - 1$ and individualize the elements of $F_i$. Then $(\Lambda_i^{\varphi})_{(F_i)}[M_i']$ is transitive, but not 2-transitive, where $M_i' = M_i \setminus F_i$. Note that the number of possible choices for the set $F_i$ is at most $k^4$. Now let $X_i = (M_i', R_{i,1}, \dots, R_{i,p})$ be the orbital configuration of $(\Lambda_i^{\varphi})_{(F_i)}$ on the set $M_i'$, that is, the relations $R_{i,j}$ are the orbits of $(\Lambda_i^{\varphi})_{(F_i)}$ in its natural action on $M_i' \times M_i'$. Note that p ≥ 3 since $(\Lambda_i^{\varphi})_{(F_i)}[M_i']$ is not 2-transitive. Also observe that the numbering of the $R_{i,j}$, j ∈ [p], is not canonical (isomorphisms may permute the $R_{i,j}$). Without loss of generality suppose that $R_{i,1}$ is the diagonal. Now individualize one of the $R_{i,j}$ for j ≥ 2 at a multiplicative cost of p − 1 ≤ k − 1. If $R_{i,j}$ is undirected (i.e., $R_{i,j} = R_{i,j}^{-1}$) then it defines a non-trivial regular graph. Since the symmetry defect of this graph is at least 1/2 (see Lemma 6.4.10) this gives us the desired structure. Otherwise $R_{i,j}$ is directed. If the out-degree of a vertex is strictly less than $(|M_i'| - 1)/2$ then the undirected graph $G_i = (M_i', R_{i,j} \cup R_{i,j}^{-1})$ is again a non-trivial regular graph. Otherwise, by individualizing one vertex (at a multiplicative cost of $|M_i'| \le k$), one obtains a coloring of symmetry defect at least 1/2 by coloring vertices depending on whether they are an in- or out-neighbor of the individualized vertex.

Case $|S_i| < \frac{1}{4}k$: Let $D_i = [k] \setminus S_i$. Then $|D_1| = |D_2| \ge \frac{3}{4}k$. Observe that every $T \subseteq D_i$ is not full with respect to the string $x_i$. Let $D_i' = D_i \times \{i\}$ (to make the sets disjoint).

Consider the following category L. The objects are the pairs (T, i) where $T \subseteq D_i$ is a t-element subset. The morphisms $(T, i) \to (T', i')$ are the bijections computed in Lemma 6.4.3 for the test sets T and T' along with the corresponding strings. The morphisms define an equivalence relation on the set $(D_1')^{\langle t \rangle} \cup (D_2')^{\langle t \rangle}$ where $(D_i')^{\langle t \rangle}$ denotes the set of all ordered t-tuples with distinct elements over the set $D_i'$. Let $R_1, \dots, R_r$ be the equivalence classes and define $R_j(i) = R_j \cap (D_i')^{\langle t \rangle}$. Then $A_i = (D_i', R_1(i), \dots, R_r(i))$ is a canonical t-ary relational structure. Moreover, the symmetry defect of $A_i$ is at least $|D_i| - t + 1 \ge |D_i|/4$.
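The orbital configuration used in the second case of the proof can be illustrated on a toy group. The following Python sketch computes the orbits of a permutation group on ordered pairs from a list of generators and checks whether an orbital is undirected; the function names and the example group (the dihedral group on 5 points, which is transitive but not 2-transitive) are illustrative assumptions and not part of the algorithm in the text.

```python
def orbitals(generators, points):
    """Orbits of the generated group on points x points.

    Each generator is a dict mapping a point to its image. In a finite group,
    closing under the generators alone already yields the full orbits.
    """
    seen = set()
    result = []
    for start in [(a, b) for a in points for b in points]:
        if start in seen:
            continue
        orbit = {start}
        stack = [start]
        while stack:
            a, b = stack.pop()
            for g in generators:
                image = (g[a], g[b])
                if image not in orbit:
                    orbit.add(image)
                    stack.append(image)
        seen |= orbit
        result.append(orbit)
    return result


def is_undirected(relation):
    """True if the relation equals its inverse."""
    return all((b, a) in relation for (a, b) in relation)


points = range(5)
rotation = {i: (i + 1) % 5 for i in points}
reflection = {i: (-i) % 5 for i in points}

dihedral = orbitals([rotation, reflection], points)
print(len(dihedral), sorted(len(R) for R in dihedral))
```

In this example each non-diagonal orbital is undirected and defines a 5-cycle, which is a non-trivial regular graph in the sense used above. For the cyclic group generated by the rotation alone, the non-diagonal orbitals are directed.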

The aggregation process described in Lemma 6.4.7 either gives a small set of isomorphism-invariant structures with large symmetry defect or a large set of automorphisms. Both outcomes can be used to significantly reduce the size of the group Γ as detailed in the following two lemmas.

Lemma 6.4.11. Suppose Option 1 of Lemma 6.4.7 is satisfied, yielding a number $r \le k^6$ and a set of relational structures $A_{i,j}$ for i ∈ [2], j ∈ [r]. Then there are subgroups $\Lambda_j \le \Gamma$ and elements $\lambda_j \in \mathrm{Sym}(\Omega)$ for j ∈ [r] such that

$$\mathrm{Iso}_\Gamma(x_1, x_2) = \bigcup_{j \in [r]} \mathrm{Iso}_{\Lambda_j\lambda_j}(x_1, x_2), \qquad (6.4)$$

and $|\Gamma^{\varphi} : \Lambda_j^{\varphi}| \ge (4/3)^k$ for all j ∈ [r].
Moreover, given all the relational structures $A_{i,j}$ for i ∈ [2], j ∈ [r], the groups $\Lambda_j$ and elements $\lambda_j$ can be computed in time $k^{O(t^c (\log k)^c)} n^c$ for some constant c.

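Proof. Let $V_{i,j} := V(A_{i,j}) \subseteq [k]$ be the domain of $A_{i,j}$ for all i ∈ [2] and j ∈ [r]. Let $A_1 := A_{1,1}$ and also $V_1 := V_{1,1}$. Now define

$$\Lambda_j\lambda_j := \{\gamma \in \Gamma \mid (V_1)^{(\gamma^{\varphi})} = V_{2,j} \wedge (\gamma^{\varphi})|_{V_1} \in \mathrm{Iso}(A_1, A_{2,j})\}.$$

First observe that the sets $\Lambda_j\lambda_j$ can be computed in time $k^{O(t^c (\log k)^c)} n^c$ for some constant c by Corollary 5.5.3 and Theorem 5.1.4.
Also $A_1^{\varphi(\gamma)} \in \{A_{2,1}, \dots, A_{2,r}\}$ for every $\gamma \in \mathrm{Iso}_\Gamma(x_1, x_2)$. This implies that

$$\mathrm{Iso}_\Gamma(x_1, x_2) = \bigcup_{j \in [r]} \mathrm{Iso}_{\Lambda_j\lambda_j}(x_1, x_2).$$

Finally recall that the relative symmetry defect of $\Lambda_j^{\varphi}$ is at least $\frac{1}{4}$. So $|\Gamma^{\varphi} : \Lambda_j^{\varphi}| \ge (4/3)^k$ by Corollary 6.4.6.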

Remark 6.4.12. The proof of the lemma above is the only place in the entire algorithm which uses Babai’s quasipolynomial time isomorphism test [11] as a black box (via Corollary 5.5.3). I remark at this point that without using Babai’s algorithm the overall running time of the method for testing isomorphism of two graphs of maximum degree d only increases to $d! \cdot n^{\mathrm{polylog}(d)}$. This is still significantly faster than Luks’s algorithm [106]. Indeed, the only difference occurs in the last lemma. Instead of using Babai’s algorithm for testing isomorphism of the relational structures, the algorithm simply performs a brute-force isomorphism test which amounts to a running time of $k! \cdot n^c$.

Note that, while the algorithm presented in this chapter makes heavy use of the group-theoretic techniques introduced in Babai’s algorithm, it only relies on the combinatorial advances via calling Babai’s algorithm as a black box. Indeed, the combinatorial methods form a central ingredient for achieving the quasipolynomial running time bound for arbitrary graphs [11]. Maybe surprisingly, for graphs of bounded degree, the above statements imply that significant improvements are possible without using such methods.

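Lemma 6.4.13. Suppose Option 2 of Lemma 6.4.7 is satisfied, yielding sets $M_i \subseteq [k]$ and groups $\Delta_i \le \mathrm{Aut}_{\Gamma_{M_i}}(x_i)$ for i ∈ {1, 2}. Then there is a number r ∈ {1, 2}, a subgroup Λ ≤ Γ and elements $\lambda_j \in \mathrm{Sym}(\Omega)$ for j ∈ [r] such that

1. $x_1 \cong_\Gamma x_2$ if and only if $x_1 \cong_{\Lambda\lambda_j} x_2$ for some j ∈ [r], and given representations for the sets $\mathrm{Iso}_{\Lambda\lambda_j}(x_1, x_2)$ for all j ∈ [r] and a generating set for $\Delta_1$ one can compute in polynomial time a representation for $\mathrm{Iso}_\Gamma(x_1, x_2)$, and

2. $|\Gamma^{\varphi} : \Lambda^{\varphi}| \ge (4/3)^k$.

Moreover, given the sets $M_i$ for both i ∈ {1, 2}, the group Λ and the elements $\lambda_j$ can be computed in polynomial time.

Proof. Let $\Lambda := \Gamma_{(M_1)}$ (recall that $\Gamma_{(T)} = \varphi^{-1}((\Gamma^{\varphi})_{(T)})$ for T ⊆ [k]). Pick γ ∈ Γ such that $M_1^{\varphi(\gamma)} = M_2$ and $\tau \in \Gamma_{M_1}$ such that $\tau^{\varphi}[M_1]$ is a transposition. Now define $\lambda_1 := \gamma$ and $\lambda_2 := \tau\gamma$. Then $x_1 \cong_\Gamma x_2$ if and only if $x_1 \cong_{\Lambda\lambda_j} x_2$ for some j ∈ {1, 2} since $(\Delta_1^{\varphi})[M_1] \ge \mathrm{Alt}(M_1)$. Moreover, if $\Gamma_j\gamma_j = \mathrm{Iso}_{\Lambda\lambda_j}(x_1, x_2)$ then $\mathrm{Iso}_\Gamma(x_1, x_2) = \bigcup_{j=1,2} \langle \Delta_1, \Gamma_j \rangle \gamma_j$. Finally, $|\Gamma^{\varphi} : \Lambda^{\varphi}| \ge |\mathrm{Alt}(M_1)| \ge (4/3)^k$.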

6.5 String Isomorphism

After adapting the Local Certificates Routine to our setting we are now ready to formalize the main algorithm solving the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups. Recall that we already showed that it suffices to consider permutation groups that are equipped with an almost d-ary structure tree. The basic strategy to tackle the problem for such groups is to follow Luks’s algorithm along the given structure tree. Whenever feasible the algorithm simply performs standard Luks reduction (see Subsection 5.2.3). Otherwise, when the recurrence obtained from the standard Luks reduction does not yield the desired running time, we can find a giant representation allowing us to apply the Local Certificates Routine. This is formalized by the next lemma which is an easy consequence of Theorem 5.5.5 and Lemma 6.1.15.

Lemma 6.5.1. Let $\Gamma \le S_d$ be a primitive group of order $|\Gamma| \ge d^{1+\log d}$ where d is greater than some absolute constant. Then there is a polynomial-time algorithm computing a normal subgroup N ≤ Γ of index |Γ : N| ≤ d, an N-invariant equipartition $\mathcal{B}$, and a giant representation $\varphi\colon N \to S_k$ where k ≥ log d and $\ker(\varphi) = N_{(\mathcal{B})}$.

Lemma 6.5.2. Let Γ ≤ Sym(Ω) be a transitive permutation group and let x, y: Ω → Σ be two strings. Also, suppose there is an almost d-ary structure tree $(T, v_0)$ for Γ such that $\deg^{+}(v_0) \le d$.
Then there are natural numbers $\ell \in \mathbb{N}$ and $n_1, \dots, n_\ell \le n/2$ such that $\sum_{i=1}^{\ell} n_i \le 2^{O((\log d)^3)} n$ and, for each $i \in [\ell]$ making one recursive call to the String Isomorphism Problem over domain size at most $n_i$, and $d^{O((\log d)^c)} n^c$ additional computation, one can compute a representation for $\mathrm{Iso}_\Gamma(x, y)$.

Proof. The structure tree $(T, v_0)$ gives a sequence of Γ-invariant partitions $\{\Omega\} \succ \mathcal{B}_1 \succ \dots \succ \mathcal{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$. Let $\mathcal{B} \succeq \mathcal{B}_1$ be a minimal block system of the group Γ and let $\Delta := \Gamma[\mathcal{B}]$ denote the induced action of Γ on $\mathcal{B}$. Note that $|\mathcal{B}| \le d$. If $|\Delta| \le d^{1+\log d}$ the statement of the lemma follows from applying standard Luks reduction. Otherwise, using Lemma 6.5.1, the algorithm computes a normal subgroup N ≤ ∆ of index |∆ : N| ≤ d, an N-invariant equipartition $\mathcal{C}$, and a giant representation $\psi\colon N \to S_k$ where k ≥ log d and $\ker(\psi) = N_{(\mathcal{C})}$. First observe that k ≤ d since the permutation degree of N is bounded by d. We lift the normal subgroup N and the partition $\mathcal{C}$ from ∆ to Γ obtaining a group $\Gamma' := \{\gamma \in \Gamma \mid \gamma[\mathcal{B}] \in N\}$ and a partition $\mathcal{C}' := \{\bigcup_{B \in C} B \mid C \in \mathcal{C}\}$ (recall that $\mathcal{C}$ forms a partition of the set $\mathcal{B}$). Clearly $\mathcal{C}'$ is Γ'-invariant. Since |Γ : Γ'| ≤ d it suffices to prove the statement for the group Γ' (introducing an additional factor of d for the number of recursive calls). Finally, we can also lift the homomorphism ψ to the group Γ' obtaining a giant representation $\varphi\colon \Gamma' \to S_k\colon \gamma \mapsto (\gamma[\mathcal{B}])^{\psi}$. Note that $\ker(\varphi) = (\Gamma')_{(\mathcal{C}')}$.

Let t := max{9, 3 + log d}. In case k ≤ 10t the statement follows again by standard Luks reduction. (In this case $|\Gamma' : (\Gamma')_{(\mathcal{C}')}| = |\Gamma' : \ker(\varphi)| \le k! \le 2^{O((\log d)^2)}$.) So suppose max{8, 2 + log d} < t < k/10. In this case the requirements of Lemma 6.4.7 are satisfied. Using Lemmas 6.4.7, 6.4.11 and 6.4.13 we can reduce the problem (using additional recursive calls to the String Isomorphism Problem over domain size at most n/2) to at most $k^6$ instances of Λ-isomorphism for groups Λ ≤ Γ' where $|(\Gamma')^{\varphi} : \Lambda^{\varphi}| \ge (4/3)^k$. Applying the same argument to these instances of Λ-isomorphism and repeating the process until we can afford to perform standard Luks reduction gives our desired algorithm.
It remains to analyze its running time, that is, we need to analyze the number of times this process has to be repeated until the algorithm reaches a sufficiently small group to perform standard Luks reduction. Towards this end, we analyze the parameter k of the giant representation and show that it has to be reduced in each round by a certain amount. Recall that the algorithm performs standard Luks reduction as soon as k ≤ 10t.

Consider the recursion tree of the algorithm (ignoring the additional recursive calls to the String Isomorphism Problem over domain size at most n/2 for the moment). Recall that $\mathcal{C}'$ is Γ'-invariant and thus, it is also Λ-invariant. In case Λ is not transitive it is processed orbit by orbit. Note that there is at most one orbit of size greater than n/2 that has to be considered in the current recursion (for the other orbits additional recursive calls to the String Isomorphism Problem over domain size at most n/2 suffice and these recursive calls are ignored for the moment). Let $\varphi'\colon \Lambda' \to S_{k'}$ be the giant representation computed on the next level of the recursion where Λ' is the projection of Λ'' to an invariant subset of the domain for some Λ'' ≤ Λ (if no giant representation is computed then the algorithm performs standard Luks reduction and the node on the next level is a leaf). Observe that $|\Lambda'[\mathcal{C}']| \ge \frac{(k')!}{2}$ because $(\Lambda')^{\varphi'} \ge A_{k'}$ and $\Lambda'_{(\mathcal{C}')} \le \ker(\varphi')$. Also note that $|\Lambda[\mathcal{C}']| \le \frac{k!}{(4/3)^k}$ since $\ker(\varphi) = \Gamma'_{(\mathcal{C}')}$ by Lemma 6.5.1. So

$$\frac{(k')!}{2} \le \frac{k!}{(4/3)^k}.$$

Hence,

$$(4/3)^k \le 2 \cdot 2^{(k-k')\log k} \le (4/3)^{3(k-k')\log k}$$

since k is sufficiently large. So

$$k' \le k - \frac{k}{3\log k}.$$

It follows that the height of the recursion tree is $O((\log d)^2)$. Thus, the number of nodes of the recursion tree is bounded by $d^{O((\log d)^2)} = 2^{O((\log d)^3)}$. By Lemmas 6.4.7, 6.4.11 and 6.4.13 each node of the recursion tree makes recursive calls to String Isomorphism over domain sizes $n_i \le n/2$ where $\sum_i n_i \le 2^{O((\log d)^2)} n$ and uses additional computation $d^{O((\log d)^c)} n^c$ for some constant c. Putting this together, the desired bound follows.

Theorem 6.5.3. There is an algorithm that, given a permutation group Γ ≤ Sym(Ω), two strings x, y: Ω → Σ and an almost d-ary structure tree for Γ, computes a representation for $\mathrm{Iso}_\Gamma(x, y)$ in time $n^{O((\log d)^c)}$, for an absolute constant c.
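The height bound $O((\log d)^2)$ from the running-time analysis in the proof of Lemma 6.5.2 can be explored numerically. The sketch below iterates the worst case of the recurrence $k' \le k - k/(3\log k)$ until the threshold k ≤ 10t is reached. It assumes base-2 logarithms, the starting value k = d (the proof only gives k ≤ d) and equality in the recurrence; the function name is hypothetical.

```python
from math import log2


def rounds_until_luks(d):
    """Rounds of k -> k - k / (3 log2 k), from k = d, until k <= 10 t."""
    t = max(9, 3 + log2(d))
    k = float(d)
    rounds = 0
    while k > 10 * t:
        k = k - k / (3 * log2(k))
        rounds += 1
    return rounds


for exponent in (10, 16, 20):
    d = 2 ** exponent
    print(exponent, rounds_until_luks(d), exponent ** 2)
```

Each round multiplies k by at most $1 - 1/(3\log d)$, so under these assumptions the number of rounds is at most about $3\ln 2 \cdot (\log d)^2$, in line with the bound stated in the proof.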

Algorithm 7: String Isomorphism

Input: Γ ≤ Sym(Ω), x, y: Ω → Σ two strings and an almost d-ary sequence of Γ-invariant partitions $\{\Omega\} \succ \mathcal{B}_1 \succ \dots \succ \mathcal{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$.
Output: $\mathrm{Iso}_\Gamma(x, y)$

 1  if Γ is not transitive then
 2      recursively process group orbit by orbit    /* restrict partitions to orbits */
 3      return $\mathrm{Iso}_\Gamma(x, y)$
 4  else
 5      if $\Gamma[\mathcal{B}_1]$ is semi-regular then
 6          apply standard Luks reduction    /* restrict partitions to orbits of $\Gamma_{(\mathcal{B}_1)}$ */
 7          return $\mathrm{Iso}_\Gamma(x, y)$
 8      else    /* assumptions of Lemma 6.5.2 are satisfied */
 9          apply Lemma 6.5.2
10          return $\mathrm{Iso}_\Gamma(x, y)$
11      end
12  end
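For very small instances, the output of Algorithm 7 can be cross-checked against a brute-force reference. The Python sketch below enumerates the group from its generators and keeps the elements that map x to y. It assumes the convention that γ is a Γ-isomorphism from x to y if y(α^γ) = x(α) for all positions α (this convention is not restated in this section), and all function names are hypothetical. It does not implement Luks reduction or the Local Certificates Routine.

```python
def compose(p, q):
    """Return the permutation 'first p, then q' (tuples: index -> image)."""
    return tuple(q[p[i]] for i in range(len(p)))


def generate_group(generators, n):
    """Enumerate the generated group by closure (exponential, toy sizes only)."""
    identity = tuple(range(n))
    elements = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in elements:
                elements.add(h)
                frontier.append(h)
    return elements


def iso_bruteforce(generators, x, y):
    """All group elements g with y[g[a]] == x[a] for every position a."""
    n = len(x)
    return {g for g in generate_group(generators, n)
            if all(y[g[a]] == x[a] for a in range(n))}


sym3 = [(1, 0, 2), (1, 2, 0)]          # generators of the symmetric group on 3 points
isos = iso_bruteforce(sym3, "aab", "baa")
auts = iso_bruteforce(sym3, "aab", "aab")
print(sorted(isos), sorted(auts))
```

In the example the isomorphism set is a coset of the automorphism group of x, which is what makes a compact representation by generators and one coset representative possible.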

Proof. The pseudo-code is given in Algorithm 7. If the input group Γ is not transitive the group is processed orbit by orbit (see Subsection 5.2.3). If the action of Γ on the block system $\mathcal{B}_1$ is semi-regular, the algorithm applies standard Luks reduction to compute the set $\mathrm{Iso}_\Gamma(x, y)$ (again see Subsection 5.2.3). Otherwise Γ is transitive and $|\mathcal{B}_1| \le d$ (recall that $\{\Omega\} \succ \mathcal{B}_1 \succ \dots \succ \mathcal{B}_m = \{\{\alpha\} \mid \alpha \in \Omega\}$ is an almost d-ary sequence of Γ-invariant partitions). Then Lemma 6.5.2 can be applied to recursively compute $\mathrm{Iso}_\Gamma(x, y)$.

Clearly, the algorithm computes the desired set of isomorphisms. The bound on the running time follows from Lemma 6.3.2. Note that the bottleneck is the type of recursion used in Lemma 6.5.2. Also observe that every group ∆, for which the algorithm performs a recursive call, is the projection of a subgroup of Γ to an invariant subset of the domain. Hence, by restricting the partitions $\mathcal{B}_1, \dots, \mathcal{B}_m$ to the domain of ∆ one obtains a sequence of partitions for the group ∆ with the desired properties (cf. Observation 6.1.4).

Combining Corollary 6.1.18 and Theorem 6.5.3 gives the main technical result of this chapter.

Theorem 6.5.4. The String Isomorphism Problem for $\widehat{\Gamma}_d$-groups can be solved in time $n^{O((\log d)^c)}$ for some constant c.

Remark 6.5.5 (Dependence on the Classification of Finite Simple Groups). The proof of the last theorem is based at several points on the Classification of Finite Simple Groups. Most notably, this is the case for the Classification Theorem of primitive $\widehat{\Gamma}_d$-groups (Theorem 5.4.16) which rests on an analysis of the primitive groups of the different types. Also, the proof of our variant of the Unaffected Stabilizers Theorem rests on the Classification of Finite Simple Groups via Lemma 6.2.3. However, by increasing the polylogarithmic bounds in d, this dependence can be removed [132]. Moreover, the proof of Theorem 5.5.5 also requires the Classification of Finite Simple Groups via Cameron’s Theorem [33], which exactly classifies primitive groups Γ of size $|\Gamma| \ge n^{1+\log n}$. Once again, this dependence can be removed by increasing the polylogarithmic terms in the proofs (see, e.g., [11, Section 5.1]). Finally, the upper bound on the degree of transitivity given in Theorem 6.4.9 also depends on the Classification of Finite Simple Groups. This dependence can also be removed by replacing the bound on the degree of transitivity by a weaker logarithmic bound (see, e.g., [11, Section 5.1]). So overall, by slightly modifying the algorithm, the only dependence on the Classification of Finite Simple Groups that remains lies in the proof of Theorem 5.4.16.

6.6 Applications

With Luks’s algorithm being applied as a subroutine for many algorithms tackling the Graph Isomorphism Problem it is natural to ask for the consequences of Theorem 6.5.4. In this section, we give several immediate consequences whereas a deeper application for isomorphism testing of graphs of bounded tree-width is presented in the next chapter.

6.6.1 Isomorphism for Structures of Bounded Degree

Of course, a direct consequence of the improved algorithm for the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups is a faster isomorphism test for graphs of bounded degree.

Theorem 6.6.1. The Graph Isomorphism Problem for graphs of maximum degree d can be solved in time $n^{O((\log d)^c)}$ for some constant c.

Proof. This follows from Theorems 5.3.4 and 6.5.4.

A related, but to some extent slightly stronger statement, which plays a role in the next chapter, is the following. Recall the definition of a rooted simple acyclic graph $(G, v_0)$ given in Subsection 6.1.2.

Theorem 6.6.2. The Graph Isomorphism Problem for rooted simple acyclic graphs $(G, v_0)$ such that $\deg^{+}(v) \le d$ for all v ∈ V(G) can be solved in time $n^{O((\log d)^c)}$ for some constant c.

Proof. By Remark 5.3.5 the reduction described in Theorem 5.3.4 can also be used to reduce the isomorphism problem for rooted simple acyclic graphs $(G, v_0)$ such that $\deg^{+}(v) \le d$ to the String Isomorphism Problem for $\widehat{\Gamma}_d$-groups. So the statement follows in combination with Theorem 6.5.4.

Using similar arguments there are also severe restrictions on the automorphism group of a rooted simple acyclic graph (G, v0).

Lemma 6.6.3. Let $(G, v_0)$ be a rooted simple acyclic graph such that $\deg^{+}(v) \le d$ for all v ∈ V(G). Then $\mathrm{Aut}(G, v_0) \in \widehat{\Gamma}_d$.

Proof. The proof is analogous to the proof of Theorem 5.3.1.

Actually, instead of restricting to graphs, one can also consider the isomorphism problem for relational structures (cf. Corollary 5.5.3).

Let $A = (V, R_1, \dots, R_k)$ be a relational structure and let $r_i$ denote the arity of the relation $R_i$, that is, $R_i \subseteq V^{r_i}$ for all i ∈ [k]. Recall that the structure A is t-ary if $r_i \le t$ for every i ∈ [k]. For $R \subseteq V^{\ell}$ we define the Gaifman graph G(R) = (V, E(R)) where

$$E(R) := \{vw \mid \exists (v_1, \dots, v_\ell) \in R \;\exists i, j \in [\ell]\colon v = v_i \wedge w = v_j\}.$$

The structure A has maximum degree at most d if $G(R_i)$ has maximum degree at most d for every i ∈ [k].

Theorem 6.6.4. The Isomorphism Problem for t-ary relational structures of maximum degree d can be solved in time $n^{O(t \cdot (\log d)^c)}$ for some constant c.

Proof. Let $A_1 = (V, R_1, \dots, R_k)$ and $A_2 = (V, S_1, \dots, S_k)$ be two relational structures where $R_i, S_i \subseteq V^{r_i}$ for numbers $r_i \le t$. The structure $A_1$ is connected if the graph $(V, E(R_1) \cup \dots \cup E(R_k))$ is connected. By treating the connected components of $A_1$ and $A_2$ independently it suffices to consider the case that $A_1$ and $A_2$ are connected.

Let $v_1 \in V$ be an arbitrary vertex. For each $v_2 \in V$ we check whether there is an isomorphism $\varphi\colon A_1 \cong A_2$ such that $\varphi(v_1) = v_2$. This can be done as follows. Let $G_1 = (V, E(R_1), \dots, E(R_k), v_1)$ be the vertex- and edge-colored graph where the edge color classes are given by $E(R_1), \dots, E(R_k)$ and the vertex $v_1$ is individualized. Similarly, define the graph $G_2 = (V, E(S_1), \dots, E(S_k), v_2)$. Since $A_1$ and $A_2$ have maximum degree d the set of isomorphisms $\mathrm{Iso}(G_1, G_2)$ from $G_1$ to $G_2$ can be computed in time $n^{O((\log d)^c)}$ for some constant c by Theorem 5.3.4, Remark 5.3.5 and Theorem 6.5.4. Moreover, $\mathrm{Iso}(G_1, G_2) = \Gamma\gamma$ for some Γ ≤ Sym(V) and γ ∈ Sym(V) where $\Gamma \in \widehat{\Gamma}_d$ by Theorem 5.3.1 and Remark 5.3.5.

Now consider the set $\Omega := V^{\le t}$ of all tuples over V of length at most t. Note that $|\Omega| = n^{O(t)}$. Also define the strings $x_1\colon \Omega \to 2^{[k]}\colon \bar{v} \mapsto \{i \in [k] \mid \bar{v} \in R_i\}$ and $x_2\colon \Omega \to 2^{[k]}\colon \bar{v} \mapsto \{i \in [k] \mid \bar{v} \in S_i\}$. Then $A_1 \cong A_2$ if and only if $\sigma\colon x_1 \cong x_2$ for some $\sigma \in \Gamma[\Omega]\gamma[\Omega]$ where Γ[Ω] (resp. γ[Ω]) denotes the natural componentwise action of Γ (resp. γ) on the set Ω. Since $\Gamma \in \widehat{\Gamma}_d$ this can be checked in time $n^{O(t \cdot (\log d)^c)}$ by Theorem 6.5.4.

Note that this theorem significantly improves on the algorithm from Corollary 5.5.3 by drastically reducing the impact of the arity of the input structures as well as taking the degree into account.

Maybe surprisingly, beyond implications on problems directly related to the Graph Isomorphism Problem, the last theorem is also applied by van Bergerem [147] in the context of learning concepts definable by first-order formulas with counting.
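The string encoding used in the proof of Theorem 6.6.4 can be sketched in a few lines of Python. The code below builds the string that maps each tuple of length at most t to the set of relations containing it, and checks whether a given vertex bijection is an isomorphism of the two strings under the componentwise action. It assumes 0-based relation indices, omits the empty tuple from the domain, and only verifies a candidate bijection; it does not search for one. All names are hypothetical.

```python
from itertools import product


def tuples_up_to(V, t):
    """All non-empty tuples over V of length at most t."""
    return [tup for r in range(1, t + 1) for tup in product(V, repeat=r)]


def structure_to_string(V, relations, t):
    """Map each tuple to the set of indices i such that the tuple lies in R_i."""
    return {tup: frozenset(i for i, R in enumerate(relations) if tup in R)
            for tup in tuples_up_to(V, t)}


def act(sigma, tup):
    """Componentwise action of a vertex bijection on a tuple."""
    return tuple(sigma[v] for v in tup)


def is_string_isomorphism(sigma, x1, x2):
    """True if sigma maps every position of x1 to an equally labeled position of x2."""
    return all(x2[act(sigma, tup)] == label for tup, label in x1.items())


V = [0, 1, 2]
x1 = structure_to_string(V, [{(0, 1), (1, 2)}, {(0,)}], t=2)
x2 = structure_to_string(V, [{(2, 1), (1, 0)}, {(2,)}], t=2)

reverse = {0: 2, 1: 1, 2: 0}
identity = {0: 0, 1: 1, 2: 2}
print(is_string_isomorphism(reverse, x1, x2), is_string_isomorphism(identity, x1, x2))
```

The domain of the strings has size $n + n^2 + \dots + n^t$, which matches the bound $|\Omega| = n^{O(t)}$ used in the proof.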
Another direct consequence of the last theorem is an isomorphism test for hypergraphs (since every hypergraph can be viewed as a relational structure) which improves on the algorithm of Babai and Codenotti [15].

Corollary 6.6.5. The Isomorphism Problem for hypergraphs of maximum arity t and maximum degree d can be solved in time $n^{O(t \cdot (\log d)^c)}$ for some constant c.
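As an illustration of the maximum-degree condition, the following Python sketch computes Gaifman graphs and the maximum degree of a structure obtained from a hypergraph. The encoding of a hypergraph as a relational structure is an assumption made here for the example (one relation per hyperedge size, containing every ordering of each hyperedge); the text does not fix a particular encoding. Loops are ignored when forming Gaifman edges, and all names are hypothetical.

```python
from itertools import combinations, permutations


def gaifman_edges(relation):
    """Edges vw (v != w) such that v and w occur together in some tuple."""
    return {frozenset((v, w))
            for tup in relation
            for v, w in combinations(set(tup), 2)}


def max_degree(V, relation):
    """Maximum degree of the Gaifman graph of a single relation."""
    degree = {v: 0 for v in V}
    for edge in gaifman_edges(relation):
        for v in edge:
            degree[v] += 1
    return max(degree.values(), default=0)


def hypergraph_as_structure(hyperedges):
    """Assumed encoding: relation of arity r holds all orderings of size-r hyperedges."""
    relations = {}
    for E in hyperedges:
        relations.setdefault(len(E), set()).update(permutations(sorted(E)))
    return relations


def structure_max_degree(V, relations):
    """Maximum over all relations of the Gaifman graph's maximum degree."""
    return max((max_degree(V, R) for R in relations), default=0)


V = range(6)
relations = hypergraph_as_structure([{0, 1, 2}, {2, 3}, {3, 4, 5}])
print(structure_max_degree(V, relations.values()))
```

Note that the degree is measured per relation, so in this example the vertex 2 contributes degree 2 through the ternary relation and degree 1 through the binary one, rather than degree 3.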

6.6.2 Coset-Labeled Hypergraphs

Actually, algorithms for the String Isomorphism Problem of $\widehat{\Gamma}_d$-groups can be applied to an isomorphism problem that, in some sense, is even more general than the ones described above. Intuitively speaking, one way to think about this problem is to consider the isomorphism problem for hypergraphs, but instead of only requiring that hyperedges must be mapped to hyperedges, for each pair of a hyperedge $E_1$ in the first graph and a hyperedge $E_2$ in the second graph, one restricts the possible set of allowed mappings $\sigma\colon E_1 \to E_2$ if indeed $E_1$ is mapped to $E_2$.

This type of isomorphism problem naturally arises when applying decomposition techniques to a graph, which is discussed in the next chapter (see also [70]). Also, Schweitzer and Wiebking [137] consider a generalization of this problem, where each hyperedge may appear multiple times, in order to give a unifying method to tackle isomorphism problems for a large class of objects. In the following, I slightly deviate from the exact definitions given in [70, 137].

A prototype is a group Γ ≤ Sym([n]) for some $n \in \mathbb{N}$. A list of prototypes is a tuple $P = (\Gamma_1, \dots, \Gamma_k)$ of prototypes where $\Gamma_i \le \mathrm{Sym}([n_i])$ for some $n_i \in \mathbb{N}$.

Definition 6.6.6 (Coset-Labeled Hypergraphs). Let $P = (\Gamma_1, \dots, \Gamma_k)$ be a list of prototypes and let $n_i$ be the degree of $\Gamma_i$. A P-labeled hypergraph is a tuple $H = (V, \mathcal{E}, p)$ where V is a finite set of vertices, $\mathcal{E} \subseteq 2^V$ a set of hyperedges, and p is a function that associates with each $E \in \mathcal{E}$ a pair p(E) = (i, θ) where $\theta\colon E \to [n_i]$ is a bijection. In particular, $|E| = n_i$.

Two P-labeled hypergraphs $H_1 = (V_1, \mathcal{E}_1, p_1)$ and $H_2 = (V_2, \mathcal{E}_2, p_2)$ are isomorphic if there is a bijection $\sigma\colon V_1 \to V_2$ such that

1. $E \in \mathcal{E}_1$ if and only if $E^{\sigma} := \{\sigma(v) \mid v \in E\} \in \mathcal{E}_2$ for all $E \in 2^{V_1}$, and

2. $p_1(E) = (i, \theta_1)$, $p_2(E^{\sigma}) = (i, \theta_2)$ and there is some $\gamma \in \Gamma_i$ such that

$$\sigma[E] = \theta_1 \gamma \theta_2^{-1} \qquad (6.5)$$

for all $E \in \mathcal{E}_1$.

In this case σ is an isomorphism from $H_1$ to $H_2$, denoted by $\sigma\colon H_1 \cong H_2$. For $\Gamma \le \mathrm{Sym}(V_1)$ and a bijection $\theta\colon V_1 \to V_2$ let

$$\mathrm{Iso}_{\Gamma\theta}(H_1, H_2) := \{\sigma \in \Gamma\theta \mid \sigma\colon H_1 \cong H_2\}. \qquad (6.6)$$

Note that, for two P-labeled hypergraphs $H_1$ and $H_2$, the set of isomorphisms $\mathrm{Iso}(H_1, H_2)$ forms a coset of $\mathrm{Aut}(H_1)$ and therefore, it admits a compact representation similar to the isomorphism problems for strings and graphs discussed in earlier parts of this thesis. Indeed, this is a crucial feature of the above definition that again allows us to apply the group-theoretic techniques developed above.
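Definition 6.6.6 can be turned into a direct check for a given candidate bijection. The Python sketch below assumes that maps are composed from left to right in Equation (6.5), so that the condition reads θ₂(σ(v)) = γ(θ₁(v)) for all v ∈ E; it also assumes 0-based positions, prototypes given as explicit sets of permutation tuples, and σ given as a bijective dict. The data layout and the function name are hypothetical.

```python
def is_labeled_isomorphism(sigma, H1, H2, prototypes):
    """Check the two conditions of a P-labeled hypergraph isomorphism.

    H1, H2: dict mapping each hyperedge (frozenset) to a pair (i, theta),
            where theta is a dict from the hyperedge onto range(n_i).
    prototypes[i]: set of permutations of range(n_i), given as tuples.
    """
    # Condition 1: sigma maps the hyperedges of H1 exactly onto those of H2.
    images = {frozenset(sigma[v] for v in E) for E in H1}
    if images != set(H2):
        return False
    # Condition 2: the induced map on positions lies in the prototype group.
    for E, (i, theta1) in H1.items():
        E_sigma = frozenset(sigma[v] for v in E)
        j, theta2 = H2[E_sigma]
        if i != j:
            return False
        inverse_theta1 = {pos: v for v, pos in theta1.items()}
        gamma = tuple(theta2[sigma[inverse_theta1[pos]]] for pos in range(len(E)))
        if gamma not in prototypes[i]:
            return False
    return True


# One hyperedge {a, b, c} whose prototype is the cyclic group of order 3.
cyclic3 = {(0, 1, 2), (1, 2, 0), (2, 0, 1)}
E = frozenset("abc")
H = {E: (0, {"a": 0, "b": 1, "c": 2})}

rotation = {"a": "b", "b": "c", "c": "a"}
swap = {"a": "b", "b": "a", "c": "c"}
print(is_labeled_isomorphism(rotation, H, H, [cyclic3]),
      is_labeled_isomorphism(swap, H, H, [cyclic3]))
```

The swap would be an automorphism of the underlying plain hypergraph; it is rejected here only because the induced permutation of positions is not contained in the prototype group.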

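Theorem 6.6.7. Let $P = (\Gamma_1, \dots, \Gamma_k)$ be a list of prototypes such that $\Gamma_i \in \widehat{\Gamma}_d$ for all i ∈ [k]. Let $H_1 = (V, \mathcal{E}_1, p_1)$ and $H_2 = (V, \mathcal{E}_2, p_2)$ be two P-labeled hypergraphs over the same vertex set V. Also let Γ ≤ Sym(V) be a $\widehat{\Gamma}_d$-group and γ ∈ Sym(V). Finally let $\Delta \le \mathrm{Sym}(\mathcal{E}_1)$ be a $\widehat{\Gamma}_d$-group and $\theta\colon \mathcal{E}_1 \to \mathcal{E}_2$ a bijection such that

$$\mathrm{Iso}_{\Gamma\gamma}(H_1, H_2)[\mathcal{E}_1] \subseteq \Delta\theta.$$

Then a representation for the set of isomorphisms $\mathrm{Iso}_{\Gamma\gamma}(H_1, H_2)$ can be computed in time $(n + m)^{O((\log d)^c)}$ where n := |V|, $m := |\mathcal{E}_1|$ and c is an absolute constant.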

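Proof. For i ∈ [2] and $E \in \mathcal{E}_i$ let $p_i(E)_1$ denote the first component of $p_i(E)$. First consider the strings

$$x_i\colon \mathcal{E}_i \to [k]\colon E \mapsto p_i(E)_1$$

and update $\Delta'\theta' := \mathrm{Iso}_{\Delta\theta}(x_1, x_2)$. This can be done in the desired time by Theorem 6.5.4. Since any isomorphism from $H_1$ to $H_2$ has to respect the prototypes of the hyperedges it still holds that

$$\mathrm{Iso}_{\Gamma\gamma}(H_1, H_2)[\mathcal{E}_1] \subseteq \Delta'\theta'.$$

Moreover, $\Delta' \in \widehat{\Gamma}_d$.

Now let $\Omega_i = \bigcup_{E \in \mathcal{E}_i} [|E|] \times \{E\}$ (which may be viewed as a disjoint union of all hyperedges of $H_i$) and consider the group

$$\Gamma' = \Big\{\gamma \in \mathrm{Sym}(\Omega_1) \;\Big|\; \exists \sigma \in \Delta' \;\forall E \in \mathcal{E}_1\colon ([|E|] \times \{E\})^{\gamma} = [|E^{\sigma}|] \times \{E^{\sigma}\} \text{ and } \exists \delta \in \Gamma_{p_1(E)_1} \;\forall i \in [|E|]\colon (i, E)^{\gamma} = (i^{\delta}, E^{\sigma})\Big\}.$$

Then $\Gamma' \in \widehat{\Gamma}_d$ since $\Delta' \in \widehat{\Gamma}_d$ and $\Gamma_i \in \widehat{\Gamma}_d$ for all i ∈ [k] using Lemma 5.1.9. Also let $\theta''\colon \Omega_1 \to \Omega_2$ be the bijection that maps (i, E) to $(i, E)^{\theta''} = (i, E^{\theta'})$. Consider the group $\Gamma \times \Gamma' \le \mathrm{Sym}(V \uplus \Omega_1)$ and the bijection $\gamma \times \theta''\colon V \uplus \Omega_1 \to V \uplus \Omega_2$ obtained from combining γ and θ'' in the natural way. Define the graphs $G_i$, i ∈ [2], with $V(G_i) = V \uplus \Omega_i$ and

$$E(G_i) = \{v\alpha \mid v \in V, \alpha = (j, E) \in \Omega_i, v \in E \text{ and } v^{\tau} = j \text{ where } p_i(E) = (j', \tau)\}.$$

Then

$$\mathrm{Iso}_{\Gamma\gamma}(H_1, H_2) = \{\sigma \in (\Gamma \times \Gamma')(\gamma \times \theta'') \mid \sigma\colon G_1 \cong G_2\}[V].$$

This can again be computed in the desired time by Theorem 6.5.4.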

Considering the result of Miller [117] on the isomorphism problem of hypergraphs for $\widehat{\Gamma}_d$-groups it is notable that Miller’s result, which is based on Luks’s algorithm, does not require the group ∆ that restricts the action on the set of hyperedges. Indeed, it is an interesting open question whether the group ∆ restricting the action on the set of hyperedges is actually required to build an algorithm with the same running time as in the theorem above. However, it is not even clear how to do this for standard hypergraphs in the desired time frame.

Chapter 7

Isomorphism for Bounded Tree-Width Graphs

As a deeper application of the results presented in the last chapter, in the following, we give an improved fixed-parameter tractable¹ isomorphism test for graphs of bounded tree-width. The first polynomial-time algorithm for the isomorphism problem for graphs of bounded tree-width was given by Bodlaender, running in time $n^{O(k)}$ [27]. This roughly matches the running time of an isomorphism test based on the Weisfeiler-Leman algorithm as discussed in Section 3.1. More recently, Schweitzer and Elberfeld presented a logspace isomorphism test for graphs of bounded tree-width [48] and Lokshtanov, Pilipczuk, Pilipczuk and Saurabh proved that the isomorphism problem is fixed-parameter tractable when parameterized by the tree-width of the input graphs [104]. The algorithm of Lokshtanov et al. runs in time $2^{O(k^5 \log k)}\,\mathrm{poly}(n)$ where k is the tree-width of the input graphs and n denotes the number of vertices. In this chapter, we give an improved algorithm running in time $2^{k\,\mathrm{polylog}(k)}\,\mathrm{poly}(n)$. The key ingredient for achieving this improvement is that certain parts of graphs of small tree-width can be described by small-degree structures, enabling us to exploit the methods developed in the previous chapter. The results presented in this chapter can also be found in [70].

7.1 Isomorphism-Invariant Decompositions

7.1.1 Idea

In order to design algorithms that are fixed-parameter tractable in the tree-width of the input graphs, the standard strategy is to perform dynamic programming following a tree decomposition of the input graphs (see, e.g., [46]). However, the main obstacle for designing isomorphism tests is that it is not possible to use off-the-shelf algorithms for computing tree-decompositions of small width. Indeed, in this case, the two input graphs may be decomposed in structurally different ways, which means that, while the input graphs may be isomorphic, an isomorphism may not extend to the tree decompositions and therefore, it is not found by a dynamic programming algorithm. Hence, in order to circumvent this problem, it is imperative to decompose the graphs in an isomorphism-invariant fashion (cf. Example 2.1.4) so as not to compare two graphs that have been decomposed in structurally different ways.

¹ For background on parameterized complexity theory I refer to [46, 53].


A prime example of this strategy is Bodlaender’s isomorphism test [27] for graphs of bounded tree-width. Bodlaender’s algorithm is a dynamic programming algorithm that takes into account all k-tuples of vertices that separate the graph which leads to a running time of O(nk+c) to test isomorphism of graphs of tree-width at most k. A much more refined strategy decomposing the input graphs in an isomorphism-invariant fashion is taken by Lokshtanov, Pilipczuk, Pilipczuk, and Saurabh [104] resulting in the first fpt algorithm for the isomorphism problem parameterized by the tree-width of the input graphs 5 running in time 2O(k log k) poly(n). This algorithm first improves a given input graph G to a graph Gk by adding an edge between every pair of vertices between which more than k pair- wise internally vertex disjoint paths exist. The improved graph Gk isomorphism-invariantly decomposes along clique separators into clique-separator free parts, which are referred to as ba- sic graphs for the remainder of this chapter. The decomposition can in fact be extended to an isomorphism-invariant tree decomposition into basic parts, as was shown in [48] to design a logspace isomorphism test for graphs of bounded tree-width. For the basic graphs, Lokshtanov et al. [104] show that, after fixing a vertex of sufficiently low degree, is it possible to compute an isomorphism-invariant tree decomposition whose bags have a size at most exponential in k and whose adhesion-width is at most O(k3). In this chapter, the framework introduced by Lokshtanov et al. is extended essentially in two directions looking at the decomposition of the basic graphs. First, we provide an improved version of the decomposition of the basic graphs that achieves adhesion-width O(k2). But much more importantly, we also prove that the structure of each bag of the decomposition of the basic graphs can be described by a graph of small degree. 
More precisely, for each bag of the decomposition, we additionally define an isomorphism-invariant structure graph of small degree. This allows us to apply the algorithms developed in the last chapter in order to give a faster algorithm testing isomorphism of the single bags of the decomposition.

For this chapter, let $n$ denote the number of vertices of the input graphs. In order to formalize the above outline some additional notation regarding tree decompositions and graph separators is required. Towards this end, recall the basic definitions provided in Subsection 2.1.2. First, to design dynamic programming algorithms along a given tree decomposition, it is more convenient to work with rooted decompositions which are equipped with a designated root node.

Definition 7.1.1 (Rooted Tree Decomposition). Let $G$ be a graph. A rooted tree decomposition is a tuple $((T, t_0), \beta)$ where $(T, t_0)$ is a rooted tree and $(T, \beta)$ is a tree decomposition of $G$.

Naturally, all properties and notations given for standard tree decompositions translate to rooted tree decompositions. The root bag of a rooted tree decomposition $((T, t_0), \beta)$ is the set $\beta(t_0)$.

Also, the following generalized notations on separators in graphs are required. Let $G$ be a graph. A separation is a pair $(A, B)$ such that $A \cup B = V(G)$, $A \setminus B \neq \emptyset$, $B \setminus A \neq \emptyset$, and there are no edges between the sets $A \setminus B$ and $B \setminus A$. This means $A \cap B$ is a separator of $G$. For sets $L, R \subseteq V(G)$ an $(L, R)$-separation is a separation $(A, B)$ such that $L \subseteq A$ and $R \subseteq B$. In this case the set $A \cap B$ is an $(L, R)$-separator. Finally, a separator $S \subseteq V(G)$ is a clique separator if $G[S]$ is a complete graph.
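The notions of separation, $(L, R)$-separation and clique separator can be made concrete with a few lines of Python. The following is only a minimal sketch: it assumes a graph is stored as a dictionary `adj` mapping each vertex to the set of its neighbours, and the function names are illustrative choices that do not appear elsewhere in this thesis.

```python
from itertools import combinations


def is_separation(adj, A, B):
    """Check whether (A, B) is a separation of the graph `adj`."""
    A, B = set(A), set(B)
    if A | B != set(adj) or not (A - B) or not (B - A):
        return False
    # No edge may join a vertex of A \ B with a vertex of B \ A.
    return all(not (adj[u] & (B - A)) for u in A - B)


def is_lr_separation(adj, A, B, L, R):
    """Check whether (A, B) is an (L, R)-separation."""
    return is_separation(adj, A, B) and set(L) <= set(A) and set(R) <= set(B)


def is_clique(adj, S):
    """Check whether the vertices in S are pairwise adjacent."""
    return all(v in adj[u] for u, v in combinations(S, 2))


# A 4-cycle 0-1-2-3-0 with the chord 1-3.
adj = {0: {1, 3}, 1: {0, 2, 3}, 2: {1, 3}, 3: {0, 1, 2}}
A, B = {0, 1, 3}, {1, 2, 3}
```

In this small example the pair `(A, B)` is a `({0}, {2})`-separation whose separator `A & B = {1, 3}` is a clique, so `{1, 3}` is a clique separator in the sense defined above.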

7.1.2 Clique Separators

The first part of the algorithm, which decomposes the input graphs into the basic parts, follows the algorithm by Lokshtanov et al. [104].

Definition 7.1.2 (Improved and Basic Graphs, Lokshtanov et al. [104]). Let $G$ be a graph. The $k$-improvement of $G$ is the graph $G^k$ obtained from $G$ by adding an edge between every pair of non-adjacent vertices $v, w$ for which there are strictly more than $k$ pairwise internally vertex disjoint paths connecting $v$ and $w$. A graph $G$ is $k$-improved if $G^k = G$. A graph is $k$-basic if it is $k$-improved and does not have any separating cliques.

In particular, a $k$-basic graph is 2-connected. We summarize several structural properties of $G^k$.

Lemma 7.1.3 ([104]). Let $G$ be a graph and $k \in \mathbb{N}$.

1. The $k$-improvement $G^k$ is $k$-improved, i.e., $(G^k)^k = G^k$.
2. Every (rooted) tree decomposition $(T, \beta)$ of $G$ of width at most $k$ is also a (rooted) tree decomposition of $G^k$.
3. There exists an algorithm that, given $G$ and $k$, runs in $O(k^2 n^3)$ time and either correctly concludes that $\mathrm{tw}(G) > k$, or computes $G^k$.

Since the construction of $G^k$ from a graph $G$ is isomorphism-invariant, the concept of the improved graph can be exploited for isomorphism testing. A $k$-basic graph has severe limitations concerning its structure, which are explored in the following sections. In order to reduce to $k$-basic graphs a result of Leimer [99] is exploited stating that every graph has a tree decomposition into clique-separator free parts, and the family of bags is isomorphism-invariant. While it is usually sufficient to work with an isomorphism-invariant set of bags (see [128]), we actually require an isomorphism-invariant decomposition, which can indeed be obtained.

Theorem 7.1.4 ([99],[48]). There is an algorithm that, given a connected graph $G$, computes a rooted tree decomposition $((T, t_0), \beta)$ of $G$, called clique separator decomposition, such that

1. for every $t \in V(T)$ the graph $G[\beta(t)]$ is clique-separator free (and in particular connected),
2. each adhesion set of $(T, \beta)$ is a clique,
3. $|V(T)| \in O(n)$, and
4. for each bag $\beta(t)$ the adhesion sets of the children are all equal to $\beta(t)$ or the adhesion sets of the children are all distinct.

The algorithm runs in polynomial time and the output of the algorithm is isomorphism-invariant with respect to $G$.

In the final algorithm this decomposition is applied to the $k$-improvement of the input graphs, which decomposes the input graphs into $k$-basic graphs.
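To illustrate Definition 7.1.2, the following Python sketch computes the $k$-improvement of a small graph directly from the definition. It is not the algorithm behind Lemma 7.1.3: it makes no attempt to reach the stated running time and it never reports $\mathrm{tw}(G) > k$. The number of internally vertex disjoint paths is obtained by Menger's theorem from a unit-capacity flow in the usual vertex-split network. The adjacency-dictionary representation and the names `disjoint_paths` and `k_improvement` are assumptions made for this example.

```python
from collections import deque
from itertools import combinations


def disjoint_paths(adj, s, t, limit):
    """Number of internally vertex disjoint s-t paths, capped at `limit`.

    Assumes s and t are distinct and non-adjacent.
    """
    cap = {}

    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)  # residual edge

    # Vertex splitting: (v, 0) is the in-copy, (v, 1) the out-copy of v.
    for v in adj:
        add((v, 0), (v, 1), limit if v in (s, t) else 1)
        for u in adj[v]:
            add((v, 1), (u, 0), 1)
    nbrs = {}
    for u, v in cap:
        nbrs.setdefault(u, []).append(v)

    src, dst, flow = (s, 1), (t, 0), 0
    while flow < limit:
        parent = {src: None}
        queue = deque([src])
        while queue and dst not in parent:
            u = queue.popleft()
            for v in nbrs.get(u, []):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if dst not in parent:
            break
        v = dst
        while parent[v] is not None:  # augment by one unit of flow
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
        flow += 1
    return flow


def k_improvement(adj, k):
    """Join every non-adjacent pair connected by more than k disjoint paths."""
    new_edges = [(v, w) for v, w in combinations(sorted(adj), 2)
                 if w not in adj[v] and disjoint_paths(adj, v, w, k + 1) > k]
    improved = {v: set(ns) for v, ns in adj.items()}
    for v, w in new_edges:
        improved[v].add(w)
        improved[w].add(v)
    return improved


# Complete bipartite graph K_{2,3} with sides {0, 1} and {2, 3, 4}.
k23 = {0: {2, 3, 4}, 1: {2, 3, 4}, 2: {0, 1}, 3: {0, 1}, 4: {0, 1}}
```

In $K_{2,3}$ the two vertices on the small side are joined by three internally vertex disjoint paths, so the 2-improvement adds exactly the edge between them, whereas the 3-improvement leaves the graph unchanged.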

7.1.3 Decomposition of Basic Graphs

Having decomposed a graph of tree-width $k$ into $k$-basic graphs in the previous subsection, this subsection deals with finding bounded-width rooted tree decompositions for $k$-basic graphs. Crucially, these decompositions are isomorphism-invariant after fixing one vertex of the graph. The construction presented in this section refines a similar construction of [104]. In order to describe the construction the following three parameters $c_S$, $c_M$, and $c_L$ (small, medium and large) depending on $k$ are defined:

\[ c_S := 6(k+1), \qquad c_M := c_S + c_S(k+1), \qquad c_L := c_M + 2(k+1)\binom{c_M}{k+2}^2. \]

Observe that $c_S = O(k)$, $c_M = O(k^2)$ and $c_L = 2^{O(k \log k)}$. The interpretation of these parameters is that $c_M$ bounds the size of the adhesion sets whereas $c_L$ serves as bound for the bag size. The parameter $c_S$ is used by the algorithm which in certain situations behaves differently depending on sets being larger than $c_S$ or not.

The bound $c_M = O(k^2)$ on the adhesion-width improves the corresponding bound $O(k^3)$ from [104]. However, the more relevant extension of the construction from [104] is that in addition each bag of the tree decomposition is labeled with an isomorphism-invariant structure graph of small forward degree. Similar to structure graphs for permutation groups (see Definition 6.1.7) the intention of these graphs is to capture structural information of the underlying bag. The structure graphs having small degree allows us to apply the methods developed in the previous chapter for isomorphism testing of graphs of bounded degree (and related types of structures).

In order to be able to construct the desired decomposition along with the structure graphs, we first need to prove an auxiliary lemma. Let $G$ be a graph and let $w\colon V(G) \to \mathbb{N}$ be a weight function for the vertices. The weight of a set $S \subseteq V(G)$ with respect to $w$ is $\sum_{v \in S} w(v)$. The weight of a separation $(A, B)$ is the weight of its separator $A \cap B$. For sets $L, R \subseteq V(G)$, among all $(L, R)$-separations $(A, B)$ of minimal weight there exists a unique separation with an inclusion minimal $A$. For this separation the separator $A \cap B$ is called leftmost minimal separator and it is denoted by $S_{L,R}(w)$. Moreover, define $S_{L,R} := S_{L,R}(\mathbf{1})$ where $\mathbf{1}$ denotes the function that maps every vertex to 1. For $U \subseteq V(G)$ let $w_{U,k}\colon V(G) \to \mathbb{N}$ be the weight function such that $w_{U,k}(u) = k$ for all $u \in U$ and $w_{U,k}(v) = 1$ for all $v \in V(G) \setminus U$. Using Menger’s theorem and the Ford-Fulkerson algorithm one can efficiently compute $S_{L,R}(w)$ in time $O((n + m) \cdot w(S_{L,R}(w)))$ for any weight function $w$.
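The computation of a leftmost minimal separator can be sketched in Python as follows. This is a hedged illustration under several assumptions: the graph is an adjacency dictionary, the weights are given as a dictionary `w`, augmenting paths are found by breadth-first search (one particular way to run the Ford-Fulkerson method), and the degenerate situations excluded by the definition of a separation ($A \setminus B = \emptyset$ or $B \setminus A = \emptyset$) are not filtered out. The function name is hypothetical. Each vertex is split into an in-copy and an out-copy joined by an arc of capacity $w(v)$; after a maximum flow has been found, the vertices whose in-copy is reachable from the source in the residual network while their out-copy is not form the minimum cut closest to $L$.

```python
from collections import deque

INF = float("inf")


def leftmost_min_separator(adj, L, R, w):
    """Return the minimum-weight (L, R)-separator closest to L."""
    cap = {}

    def add(u, v, c):
        cap[(u, v)] = cap.get((u, v), 0) + c
        cap.setdefault((v, u), 0)  # residual arc

    for v in adj:
        add((v, 0), (v, 1), w[v])      # cutting vertex v costs w(v)
        for u in adj[v]:
            add((v, 1), (u, 0), INF)   # edges themselves cannot be cut
    for v in L:
        add("s", (v, 0), INF)
    for v in R:
        add((v, 1), "t", INF)
    nbrs = {}
    for u, v in cap:
        nbrs.setdefault(u, []).append(v)

    def residual_search():
        """BFS from the source; returns the predecessor map."""
        parent = {"s": None}
        queue = deque(["s"])
        while queue:
            u = queue.popleft()
            for v in nbrs.get(u, []):
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return parent

    while True:
        parent = residual_search()
        if "t" not in parent:
            break
        path, v = [], "t"
        while parent[v] is not None:
            path.append((parent[v], v))
            v = parent[v]
        delta = min(cap[e] for e in path)
        for u, v in path:
            cap[(u, v)] -= delta
            cap[(v, u)] += delta

    # `parent` now holds exactly the nodes reachable in the residual network.
    return {v for v in adj if (v, 0) in parent and (v, 1) not in parent}


# Path 0-1-2-3-4 with weight 2 on both end vertices (the function w_{U,2}
# for U = {0, 4}).
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
w_path = {0: 2, 1: 1, 2: 1, 3: 1, 4: 2}
```

On the path example every inner vertex is a minimum-weight $(\{0\}, \{4\})$-separator, and the sketch returns `{1}`, the one for which the side $A = \{0, 1\}$ is inclusion minimal.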
The following lemma generalizes [104, Lemma 3.2].

Lemma 7.1.5. Let $G$ be a graph and $S \subseteq V(G)$ be a subset of vertices. Let $t \in \mathbb{Z}_{\geq 0}$, $L_i, R_i \subseteq S$ and $w_i\colon V(G) \to \mathbb{N}$ be weight functions for $i \in [t]$. Also let $T_i \subseteq V(G)$ be a minimum weight $(L_i, R_i)$-separator with respect to $w_i$. Let $w\colon V(G) \to \mathbb{N}$ be another weight function such that

1. $w(v) = w_i(v)$ for all $v \in V(G) \setminus S$, and

2. $w(v) \geq w_i(v)$ for all $v \in V(G)$

for all $i \in [t]$. Let $D := S \cup \bigcup_{i \in [t]} T_i$. Suppose that $Z$ is the vertex set of any connected component of $G - D$. Then $w(N(Z)) \leq w(S)$.

Proof. The proof is done by induction on $t$. For $t = 0$ it holds that $N(Z) \subseteq D = S$, which means the statement is trivial. So assume $t \geq 1$ and define $D' := S \cup \bigcup_{i \in [t-1]} T_i$. Let $Z$ be the vertex set of a connected component of $G - D$, and let $Z' \supseteq Z$ be the vertex set of the connected component of $G - D'$ containing $Z$. Let $(A, B)$ be an $(L_t, R_t)$-separation with separator $A \cap B = T_t$. Without loss of generality suppose that $Z \subseteq A \setminus B$. The three sets $A \setminus T_t$, $T_t$, and $B \setminus T_t$ partition the vertex set $V(G)$. Similarly the three sets $V(G) \setminus (Z' \cup N(Z'))$, $N(Z')$ and $Z'$ partition $V(G)$. We define $Q_{i,j}$ to be the intersection of the $i$-th set of the first triple with the $j$-th set of the second triple. This way the sets $Q_{i,j}$ with $i, j \in [3]$ partition $V(G)$ into 9 parts as visualized in Figure 7.1.

By the induction hypothesis it holds $w(N(Z')) \leq w(S)$. Since $N(Z') = Q_{1,2} \cup Q_{2,2} \cup Q_{3,2}$ and $N(Z) \subseteq Q_{1,2} \cup Q_{2,2} \cup Q_{2,3}$, it suffices to show $w(Q_{2,3}) \leq w(Q_{3,2})$. Observe that $Q_{2,1} \cup Q_{2,2} \cup Q_{3,2}$ is also an $(L_t, R_t)$-separator, because $L_t \subseteq A$ and $R_t \subseteq B$ by the choice of $(A, B)$, and $R_t \subseteq S \subseteq V(G) \setminus Z'$. Since $T_t$ is a minimum weight separator with respect to $w_t$, it follows $w_t(T_t) \leq w_t(Q_{2,1} \cup Q_{2,2} \cup Q_{3,2})$, and as $T_t = Q_{2,1} \cup Q_{2,2} \cup Q_{2,3}$, this implies $w_t(Q_{2,3}) \leq w_t(Q_{3,2})$. Hence,

\[ w(Q_{2,3}) \stackrel{1}{=} w_t(Q_{2,3}) \leq w_t(Q_{3,2}) \stackrel{2}{\leq} w(Q_{3,2}). \]

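Figure 7.1: Visualization of the sets appearing in the proof of Lemma 7.1.5. (Figure not reproduced: a $3 \times 3$ grid of the sets $Q_{i,j}$ with columns $A \setminus T_t$, $T_t$, $B \setminus T_t$ and rows $V \setminus (Z' \cup N(Z'))$, $N(Z')$, $Z'$, in which the sets $S$, $N(Z)$ and $Z$ are marked.)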

This lemma is used to extend a set of vertices $S$ that is not a clique separator to a larger set $D$ in an isomorphism-invariant fashion while controlling the size of the adhesion sets of the components of $G - D$. Additionally, it is possible to construct a graph of small degree that captures structural information of the set $D$. Recall the definition of a rooted simple acyclic graph $(G, v_0)$ (see Subsection 6.1.2).

Let $G$ be a graph and let $D \subseteq V(G)$. A structure graph for $D$ is a rooted simple acyclic graph $(H, v_0)$ such that $L(H) = D$. A structure graph $(H, v_0)$ is $d$-ary if $\deg^+(v) \leq d$ for all $v \in V(H)$. The structure graphs considered in this chapter are all defined in an isomorphism-invariant way with respect to the graph $G$ and the set $D$. This means, for graphs $G_1$ and $G_2$, vertex sets $D_1 \subseteq V(G_1)$ and $D_2 \subseteq V(G_2)$, and structure graphs $(H_1, v_0^1)$ and $(H_2, v_0^2)$, for each isomorphism $\varphi\colon G_1 \cong G_2$ such that $\varphi(D_1) = D_2$ there is an isomorphism $\psi\colon (H_1, v_0^1) \cong (H_2, v_0^2)$ such that $\varphi(v) = \psi(v)$ for all $v \in D_1$ (see Subsection 2.1.4).
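The defining conditions of a $d$-ary structure graph are easy to check mechanically. The Python sketch below does so under an explicit reading of the definitions that are only recalled here: it assumes that a rooted simple acyclic graph is a directed acyclic graph in which every vertex is reachable from the root, that $L(H)$ is the set of vertices without outgoing edges, and that $\deg^+$ is the out-degree. The precise definitions are those of Subsection 6.1.2; the representation `succ` (vertex to set of out-neighbours) and the function name are choices made for this example.

```python
def is_d_ary_structure_graph(succ, root, D, d):
    """Check that (H, root) is a d-ary structure graph for the set D."""
    # Every vertex has to be reachable from the root.
    seen, stack = {root}, [root]
    while stack:
        u = stack.pop()
        for v in succ[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    if seen != set(succ):
        return False

    # Acyclicity: repeatedly remove vertices without incoming edges.
    indeg = {v: 0 for v in succ}
    for u in succ:
        for v in succ[u]:
            indeg[v] += 1
    ready = [v for v in succ if indeg[v] == 0]
    removed = 0
    while ready:
        u = ready.pop()
        removed += 1
        for v in succ[u]:
            indeg[v] -= 1
            if indeg[v] == 0:
                ready.append(v)
    if removed != len(succ):
        return False

    leaves = {v for v in succ if not succ[v]}
    return leaves == set(D) and all(len(succ[v]) <= d for v in succ)


# A star with root "r" whose leaves are the elements of D = {1, 2, 3}.
star = {"r": {1, 2, 3}, 1: set(), 2: set(), 3: set()}
```

The star is a 3-ary structure graph for $\{1, 2, 3\}$ but not a 2-ary one, since its root has three outgoing edges.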

Lemma 7.1.6. Let $k \in \mathbb{N}$. Let $G$ be a graph that is $k$-improved and suppose $\emptyset \subsetneq S \subsetneq V(G)$ such that

1. $|S| \leq c_M$ and $S$ is not a clique,

2. $G - S$ is connected and $S = N(V(G) \setminus S)$.

Then there is an algorithm that either correctly concludes that $\mathrm{tw}(G) > k$ or finds a proper superset $D \supsetneq S$ and a structure graph $(H, v_0)$ for $D$ such that

(A) $|D| \leq c_L$,

(B) if $Z$ is the vertex set of any connected component of $G - D$, then $|N(Z)| \leq c_M$,

(C) $H$ is a $d$-ary structure graph where $d \leq c_M^2 + k + 1$ and $|V(H)| = 2^{O(k \log k)}$.

The running time of the algorithm is $2^{O(k \log k)} n^c$ for some constant $c$, and the pair $(D, H)$ is isomorphism-invariant with respect to the input $(G, S, k)$.

Proof. Consider the following two cases depending on the size of $S$.

Case $|S| \leq c_S$: Let $I := \{(\{x\}, \{y\}) \mid x, y \in S,\ x \neq y,\ xy \notin E(G)\}$ and define

\[ D := S \cup \bigcup_{(L,R) \in I} S_{L,R}(w_{L \cup R, k+1}). \]

For every $xy \notin E(G)$ there is an $(\{x\}, \{y\})$-separator of size at most $k$ disjoint from $\{x, y\}$, because $G$ is $k$-improved. Thus, $|D| \leq |S| + k|S|^2 \leq c_S + k c_S^2 \leq c_L$. Moreover, since $G - S$ is connected and $S = N(V(G) \setminus S)$, for all distinct $x, y \in S$ every minimum weight $(\{x\}, \{y\})$-separator contains a vertex that is not in $S$. This implies $D \supsetneq S$.

To argue that $D$ satisfies Condition (B) set $w := w_{S,k+1}$. Then, for every vertex set $Z$ of a connected component of $G - D$, it holds that

\[ |N(Z)| \leq w(N(Z)) \leq w(S) \leq c_S(k+1) \leq c_M \]

by Lemma 7.1.5.

It remains to define the structure graph $(H, v_0)$ for the set $D$. In this case this is particularly easy since $D$ is small. Let $H = (\{v_0\} \uplus D, \{(v_0, v) \mid v \in D\})$, which is obviously a structure graph for $D$. Also $\deg^+(v_0) \leq |D| \leq c_S + k c_S^2 \leq (c_S(k+1))^2 \leq c_M^2$ and $|V(H)| = 2^{O(k \log k)}$.

Case $c_S < |S| \leq c_M$: Let $I := \{(L, R) \mid L \subseteq S,\ R \subseteq S,\ |L| = |R| \leq k + 2,\ |S_{L,R}| \leq k + 1\}$ and define

\[ D := S \cup \bigcup_{(L,R) \in I} S_{L,R}. \]

Then

\[ |D| \leq c_M + (k+1)|I| \leq c_M + (k+1) \sum_{i=1}^{k+2} \binom{c_M}{i}^2 \leq c_M + 2(k+1)\binom{c_M}{k+2}^2 = c_L. \]

Moreover, for every vertex set $Z$ of a connected component of $G - D$, it holds that $|N(Z)| \leq |S| \leq c_M$ by Lemma 7.1.5. The fact that $I$ is nonempty follows from the existence of a balanced separation (for details see [104]).

Again, it remains to define the structure graph $(H, v_0)$ for $D$. Let

\[ V(H) := \{(L, R) \mid L, R \subseteq S,\ |L| = |R| \leq k + 2\} \cup D \]

and

\[ E(H) := \{((L, R), (L', R')) \mid L \subseteq L',\ R \subseteq R',\ |L| + 1 = |R| + 1 = |L'| = |R'|\} \cup \{((L, R), v) \mid (L, R) \in I,\ v \in S_{L,R}\}. \]

Also let $v_0 := (\emptyset, \emptyset)$ be the root of $H$. Then $|V(H)| \leq c_L + \sum_{\ell=0}^{k+2} \binom{c_M}{\ell} = 2^{O(k \log k)}$. Also, for each pair $(L, R) \in V(H)$ there are at most $|S|^2$ different extensions $(L', R')$ with $|L| + 1 = |R| + 1 = |L'| = |R'|$ and moreover, assuming $(L, R) \in I$, it holds that $|S_{L,R}| \leq k + 1$. It follows that $\deg^+(L, R) \leq |S|^2 + k + 1 \leq c_M^2 + k + 1$. Finally, to turn $H$ into a structure graph, repeatedly delete all leaves $v \in L(H) \setminus D$.

For bounding the running time of the algorithm observe that, in both cases, $|I| = 2^{O(k \log k)}$ and each set $S_{L,R}$ can be computed in polynomial time by the comments before Lemma 7.1.5. Also note that the structure graph can be computed in time polynomial in its size.
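To get a feeling for the magnitude of the bounds in Lemma 7.1.6, the following Python sketch evaluates the three parameters $c_S := 6(k+1)$, $c_M := c_S + c_S(k+1)$ and $c_L := c_M + 2(k+1)\binom{c_M}{k+2}^2$ from the beginning of this subsection, together with the degree bound $c_M^2 + k + 1$ from Condition (C). The function names are illustrative only.

```python
from math import comb


def parameters(k):
    """Return (c_S, c_M, c_L) for tree-width bound k."""
    c_s = 6 * (k + 1)
    c_m = c_s + c_s * (k + 1)
    c_l = c_m + 2 * (k + 1) * comb(c_m, k + 2) ** 2
    return c_s, c_m, c_l


def degree_bound(k):
    """Upper bound c_M^2 + k + 1 on the forward degree d of the structure graphs."""
    _, c_m, _ = parameters(k)
    return c_m ** 2 + k + 1
```

For $k = 1$ this gives $c_S = 12$ and $c_M = 36$, while $c_L$ already exceeds $2 \cdot 10^8$. This is consistent with $c_L$ growing like $2^{O(k \log k)}$ whereas the bound on the adhesion sets stays quadratic in $k$.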

A labeled tree decomposition $((T, t_0), \beta, \eta)$ is a 3-tuple where $((T, t_0), \beta)$ is a rooted tree decomposition and $\eta$ is a function that maps each $t \in V(T)$ to a structure graph $\eta(t)$ for $\beta(t)$. The last lemma can be used as a recursive tool to compute the desired isomorphism-invariant rooted tree decomposition where each bag is equipped with a structure graph. Recall that $n$ denotes the number of vertices of the input graph $G$.

Theorem 7.1.7. Let $k \in \mathbb{N}$. Also let $G$ be a $k$-basic graph and $v \in V(G)$ such that $\deg(v) \leq k$. There is an algorithm that either correctly concludes that $\mathrm{tw}(G) > k$, or computes a labeled tree decomposition $((T, t_0), \beta, \eta)$ such that

(I) the width of $((T, t_0), \beta)$ is bounded by $c_L$,

(II) the adhesion-width of $((T, t_0), \beta)$ is bounded by $c_M$,

(III) the degree of $(T, t_0)$ is bounded by $k c_L^2$ and the number of children of $t$ with common adhesion set is bounded by $k$ for each $t \in V(T)$,

(IV) $|V(T)| \leq 2n$,

(V) for each bag $\beta(t)$ the adhesion sets of the children are all equal to $\beta(t)$ or the adhesion sets of the children are all distinct. In the former case the bag size is bounded by $c_M$, and

(VI) for each $t \in V(T)$ the graph $\eta(t) = (H_t, v_0^t)$ is a $d$-ary structure graph for $\beta(t)$ where $d \leq c_M^2 + k + 1$ and $|V(H_t)| = 2^{O(k \log k)}$.

The algorithm runs in time $2^{O(k \log k)} n^c$ for some constant $c$ and the output $((T, t_0), \beta, \eta)$ of the algorithm is isomorphism-invariant with respect to the input $(G, v, k)$.

Here, the output $((T, t_0), \beta, \eta)$ is isomorphism-invariant with respect to $(G, v, k)$, if the rooted tree decomposition $((T, t_0), \beta)$ is isomorphism-invariant with respect to $(G, v, k)$ and, for each $t \in V(T)$, the structure graph $\eta(t)$ is isomorphism-invariant with respect to $(G, v, k, \beta(t))$.

Proof. The algorithm works recursively getting as input a pair $(G, S)$ where $G$ is a $k$-basic graph and $\emptyset \subsetneq S \subsetneq V(G)$ such that

1. $|S| \leq c_M$,
2. $S$ is not a clique or $|N(S)| \leq k$,
3. $G - S$ is connected, and
4. $S = N(V(G) \setminus S)$.

The algorithm returns a labeled tree decomposition $((T, t_0), \beta, \eta)$ such that $S \subseteq \beta(t_0)$ and Conditions (I) - (VI) are satisfied. Since $G$ has no separating cliques and therefore $v$ is not a cut vertex, the statement of the theorem follows by setting $S = \{v\}$.

The algorithm works as follows. If $S$ is a clique define $D := S \cup N(S)$. In this case let $H := (\{w_0\} \cup D, \{(w_0, v) \mid v \in D\})$. Then $(H, w_0)$ is a structure graph for $D$. In the other case the algorithm applies Lemma 7.1.6 computing a set $D \supsetneq S$ and a structure graph $(H, v_0)$ for $D$. In both cases it holds that

(A) $|D| \leq c_L$,

(B) if $Z$ is the vertex set of any connected component of $G - D$, then $|N(Z)| \leq c_M$,

(C) $H$ is a $d$-ary structure graph where $d \leq c_M^2 + k + 1$ and $|V(H)| = 2^{O(k \log k)}$,

and the set $D$ and the structure graph $H$ are defined in an isomorphism-invariant manner with respect to $(G, S, k)$.

Now let $Z_1, \dots, Z_\ell$ be the connected components of the graph $G - D$. Also let $S_i := N(Z_i)$ and $G_i := G[Z_i \cup S_i]$. Clearly, the pair $(G_i, S_i)$ satisfies Conditions 1, 3 and 4 for all $i \in [\ell]$. It remains to argue that $S_i$ is not a clique. If $S_i$ is a separator of $G$, this immediately follows from the fact that $G$ is clique-separator free. In the other case $S_i = D$ and moreover, $S$ is not a clique since $N(S) \subseteq D$. Hence, $D = S_i$ is not a clique. So the pair $(G_i, S_i)$ also satisfies Condition 2 for every $i \in [\ell]$.

Now the algorithm is applied recursively for all pairs $(G_i, S_i)$, $i \in [\ell]$, resulting in labeled tree decompositions $((T_i, t_0^i), \beta_i, \eta_i)$. By possibly renaming the vertices of the trees it can be assumed that $V(T_i) \cap V(T_j) = \emptyset$ for all distinct $i, j \in [\ell]$. Now let $((T, t_0), \beta, \eta)$ be the labeled tree decomposition obtained from the disjoint union of $((T_i, t_0^i), \beta_i, \eta_i)$, $i \in [\ell]$, by adding the root node $t_0$ which is connected to $t_0^i$ for all $i \in [\ell]$. Also set $\beta(t_0) := D$ and $\eta(t_0) := H$.

The isomorphism-invariance of the output of the algorithm is clear from the above description as all steps are performed in an isomorphism-invariant manner.

For the running time first analyze the number of recursive calls. Each bag $\beta(t)$, $t \in V(T)$, can be associated with the set $A(t) := D \setminus S$. Then $A(t) \neq \emptyset$ and $A(t) \cap A(s) = \emptyset$ for all distinct $s, t \in V(T)$. This implies $|V(T)| \leq n$. Since each recursive call adds at least one more bag to the decomposition the number of recursive calls is bounded by $n$. So the bound on the running time immediately follows from Lemma 7.1.6.

For the correctness of the algorithm Properties (I), (II) and (VI) immediately follow from the description above. For Property (III) recall that $G$ is $k$-improved and therefore each non-edge in $D$ is contained in at most $k$ different $S_i$. Also recall that $S_i$ is not a clique and therefore, each $S_i$ contains at least one non-edge. Consider Property (V). For every bag violating this property an additional bag can be introduced containing all equal adhesion sets. This operation at most doubles the number of bags which implies Property (IV). Also note that all other properties are preserved by this operation.

This theorem also gives some insight into the structure of the automorphism group of a $k$-basic graph after individualizing a single vertex $v$ of degree at most $k$. Since each bag is equipped with a $d$-ary structure graph it follows that the automorphism group restricted to each bag forms a $\widehat{\Gamma}_d$-group. Applying this argument to each bag in a top-down fashion it follows that $\mathrm{Aut}(G, v) \in \widehat{\Gamma}_d$. Actually, using a slightly refined argument, one can prove the following restriction on the automorphism group of $k$-basic graphs.

Proposition 7.1.8 (Grohe, N., Schweitzer, Wiebking [70]). Let $k \in \mathbb{N}$. Also let $G$ be a $k$-basic graph of tree-width at most $k$ and $v \in V(G)$ a vertex of degree at most $k$. Then $\mathrm{Aut}(G, v) \in \widehat{\Gamma}_{k+1}$.

7.2 Isomorphism Testing using Dynamic Programming

Having computed isomorphism-invariant rooted tree decompositions in the last section we now want to compute the set of isomorphisms from one graph to another in a bottom up fashion using dynamic programming. Let $G_1, G_2$ be the two input graphs and suppose we are given isomorphism-invariant rooted tree decompositions $((T_1, t_0^1), \beta_1)$ and $((T_2, t_0^2), \beta_2)$. For a node $t \in V(T_i)$ we let $(G_i)_t$ be the graph induced by the union of all bags contained in the subtree rooted at $t$. The basic idea is to compute for all pairs $t \in V(T_1)$, $t' \in V(T_2)$ the set of isomorphisms from $(G_1)_t$ to $(G_2)_{t'}$ (in addition the isomorphisms shall also respect the underlying tree decomposition) in a bottom up fashion.

The first step is to give an algorithm that solves this problem at a given bag (assuming we have already solved the problem for all pairs of children of $t$ and $t'$). Let us first give some intuition for this task. Suppose we are looking for all bijections from $\beta_1(t)$ to $\beta_2(t')$ that can be extended to an isomorphism from $(G_1)_t$ to $(G_2)_{t'}$. Let $t_1, \dots, t_\ell$ be the children of $t$ and $t'_1, \dots, t'_\ell$ the children of $t'$. Then we essentially have to solve the following two problems. First, we have to respect the edges appearing in the bags $\beta_1(t)$ and $\beta_2(t')$. But also, every adhesion set $\beta_1(t) \cap \beta_1(t_i)$ has to be mapped to another adhesion set $\beta_2(t') \cap \beta_2(t'_j)$ in such a way that the corresponding bijection (between the adhesion sets) extends to an isomorphism from $(G_1)_{t_i}$ to $(G_2)_{t'_j}$.

A simple approach to solve this problem is to use the Weisfeiler-Leman algorithm with a sufficiently large dimension which depends on the adhesion-width of the decomposition. Indeed, in order to encode the isomorphism type of a subgraph $(G_1)_{t_i}$ (resp. $(G_2)_{t'_j}$) it suffices to color all vertex-tuples defined over the adhesion set $\beta_1(t) \cap \beta_1(t_i)$ (resp. $\beta_2(t') \cap \beta_2(t'_j)$) accordingly. Then, the isomorphism problem for the bag can be solved using the Weisfeiler-Leman algorithm employing similar arguments as in Section 3.1. Note that the algorithm itself also colors the vertex tuples of the adhesion set to the parent node accordingly as required for the next step of the dynamic programming procedure (see also [128]).

However, there is also a more clever and efficient approach solving this problem utilizing the structure graphs of small degree defined on each bag. Towards this end, first consider the case in which the adhesion sets are all distinct.
Then the problem described above can be formulated as an isomorphism problem of coset-labeled hypergraphs considered in Subsection 6.6.2. Of course, to be able to apply Theorem 6.6.7, we first need to find $\widehat{\Gamma}_d$-groups that restrict the actions on the vertices and the adhesion sets. For the vertices, this is easy to achieve by computing the automorphism group of the structure graph of the bag computed in Theorem 7.1.7. Hence, it remains to find such a group for the adhesion sets. Observe that there are two types of adhesion sets that need to be considered in the overall dynamic programming algorithm: clique separators that originate from the top-level decomposition into the $k$-basic graphs and the adhesion sets from the second-level decomposition of the $k$-basic graphs. The next lemma extends a structure graph for a set of vertices in a $k$-basic graph to such adhesion sets. Actually, the lemma only computes the corresponding $\widehat{\Gamma}_d$-groups required for the application of Theorem 6.6.7.
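The bookkeeping behind the graphs $(G_i)_t$ and the adhesion sets introduced at the beginning of this section is simple, and the following Python sketch shows one way to organise it. It covers only this bookkeeping and not the isomorphism computation itself. The representation of a rooted tree decomposition by a dictionary `children` (node to list of child nodes) and a dictionary `bags` (node to vertex set), as well as both function names, are assumptions made for this example.

```python
def subtree_vertex_sets(children, bags, root):
    """Map each node t to the union of all bags in the subtree rooted at t."""
    order, stack = [], [root]
    while stack:
        t = stack.pop()
        order.append(t)            # a node is listed before its descendants
        stack.extend(children.get(t, []))
    result = {}
    for t in reversed(order):      # hence children are processed first
        vertices = set(bags[t])
        for c in children.get(t, []):
            vertices |= result[c]
        result[t] = vertices
    return result


def adhesion_sets(children, bags, t):
    """Adhesion sets between the bag of t and the bags of its children."""
    return {c: bags[t] & bags[c] for c in children.get(t, [])}


children = {"r": ["a", "b"], "a": ["c"]}
bags = {"r": {1, 2}, "a": {2, 3}, "b": {1, 4}, "c": {3, 5}}
```

For the small decomposition above, the subtree rooted at `"a"` spans the vertices `{2, 3, 5}`, and the adhesion sets of the root towards its children `"a"` and `"b"` are `{2}` and `{1}`. A dynamic programming procedure in the spirit of this section would visit the nodes in the same bottom-up order.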

Lemma 7.2.1. Let $k \in \mathbb{N}$. Let $G_1, G_2$ be $k$-improved graphs of tree-width at most $k$ and suppose $D_j \subseteq V(G_j)$ for $j \in \{1, 2\}$. Also suppose there is an isomorphism-invariant $d$-ary structure graph $(H_j, v_0^j)$ for $D_j$ where $d \geq k + 1$. Then there is an algorithm computing the following objects:

1. a $\widehat{\Gamma}_d$-group $\Delta \leq \mathrm{Sym}(D_1)$ and $\delta\colon D_1 \to D_2$ such that $\mathrm{Iso}((G_1, D_1), (G_2, D_2))[D_1] \subseteq \Delta\delta$,

2. a $\widehat{\Gamma}_d$-group $\Lambda \leq \mathrm{Sym}(\mathcal{C}_1)$ and $\lambda\colon \mathcal{C}_1 \to \mathcal{C}_2$ such that $\mathrm{Iso}((G_1, D_1), (G_2, D_2))[\mathcal{C}_1] \subseteq \Lambda\lambda$, where $\mathcal{C}_j$ is the set of all cliques $C_j \subseteq D_j$, and

3. a $\widehat{\Gamma}_d$-group $\Theta \leq \mathrm{Sym}(\mathcal{S}_1)$ and $\theta\colon \mathcal{S}_1 \to \mathcal{S}_2$ such that $\mathrm{Iso}((G_1, D_1), (G_2, D_2))[\mathcal{S}_1] \subseteq \Theta\theta$, where $\mathcal{S}_j$ is the set of all $N_{G_j}(Z_j)$ where $Z_j$ is a connected component of $G_j - D_j$.

The algorithm runs in time $(2^k n_H)^{O((\log d)^c)} \cdot n^2$ where $n := \max\{|V(G_1)|, |V(G_2)|\}$ denotes the size of the input graph, $n_H := \max\{|V(H_1)|, |V(H_2)|\}$ the size of the structure graphs and $c$ is an absolute constant.

Proof. Since $(H_j, v_0^j)$ is defined isomorphism-invariantly it follows that

\[ \mathrm{Iso}((G_1, D_1), (G_2, D_2))[D_1] \subseteq \mathrm{Iso}((H_1, v_0^1), (H_2, v_0^2))[D_1]. \]

Let $\Delta \leq \mathrm{Sym}(D_1)$ and $\delta\colon D_1 \to D_2$ such that $\Delta\delta = \mathrm{Iso}((H_1, v_0^1), (H_2, v_0^2))[D_1]$. Then $\Delta \in \widehat{\Gamma}_d$ by Lemma 6.6.3 and 5.1.8. Moreover, both objects can be computed in time $n_H^{O((\log d)^c)}$ for some constant $c$ by Theorem 6.6.2.

For the second item let $G_j^{(0)} := G_j[D_j]$ and for $i > 0$ let $G_j^{(i)} := G_j^{(i-1)}[V_j^{(i)}]$ where $V_j^{(i)} := \{v \in V(G_j^{(i-1)}) \mid \deg_{G_j^{(i-1)}}(v) > k\}$. Since $G_j^{(0)}$ has tree-width at most $k$ it follows that there is some $i^* > 0$ such that $G_j^{(i^*)}$ is the empty graph (cf. Lemma 2.1.3). For every clique $C_j \subseteq D_j$ define $i(C_j)$ to be the maximal $i \in \mathbb{N}$ such that $C_j \subseteq V(G_j^{(i)})$. Moreover, let $a(C_j) := C_j \setminus V(G_j^{(i(C_j)+1)})$.

Observe that $a(C_j) \neq \emptyset$ and for every $v \in a(C_j)$ it holds $\deg_{G_j^{(i(C_j))}}(v) \leq k$ and $C_j \subseteq N_{G_j^{(i(C_j))}}[v]$. For $v \in D_j$ let $i(v)$ be the maximal $i \in \mathbb{N}$ such that $v \in V(G_j^{(i)})$. Let $A(v) := N_{G_j^{(i(v))}}[v]$. Note that $|A(v)| \leq k + 1$.

For a set $S$ we define the rooted simple acyclic graph $(L_S, \emptyset)$ to be the graph associated with the subset lattice of $S$, that is, $L_S = (2^S, E_S, \emptyset)$ where $(A, B) \in E_S$ if $A \subseteq B$ and $|B \setminus A| = 1$. Now we define the rooted simple acyclic graph $(H'_j, v_0^j)$ to be the following extension of $(H_j, v_0^j)$. For every $v \in D_j$ attach a copy of the graph $L_{A(v)}$ to the vertex $v \in V(H_j)$, that is, there is an edge from $v$ to the root node $\emptyset \in V(L_{A(v)})$. Since $d \geq k + 1$ the graph $(H'_j, v_0^j)$ is still a $d$-ary rooted simple acyclic graph. Moreover, $H'_j$ is defined isomorphism-invariantly and $|V(H'_j)| \leq 2^{k+1} n_H$. Observe that $C_j \subseteq A(v)$ for every $v \in a(C_j)$ where $C_j$ is a clique of $G_j[D_j]$. Hence, $\mathcal{C}_j \subseteq V(H'_j)$. Overall, this means

\[ \mathrm{Iso}((G_1, D_1), (G_2, D_2))[\mathcal{C}_1] \subseteq \mathrm{Iso}((H'_1, v_0^1), (H'_2, v_0^2))[\mathcal{C}_1]. \]

For $\Lambda\lambda = \mathrm{Iso}((H'_1, v_0^1), (H'_2, v_0^2))[\mathcal{C}_1]$ it holds that $\Lambda \in \widehat{\Gamma}_d$ by Lemma 6.6.3 and both objects can be computed in time $(2^k n_H)^{O((\log d)^c)}$ for some constant $c$ by Theorem 6.6.2.

It remains to consider the third item. Let $v, w \in D_j$ be distinct elements such that $vw \notin E(G_j)$. Since $G_j$ is $k$-improved, the set $\mathcal{S}_{v,w} := \{S \in \mathcal{S}_j \mid v, w \in S\}$ satisfies $|\mathcal{S}_{v,w}| \leq k$. Using the second item it suffices to consider sets $S \in \mathcal{S}_j$ such that $S$ is not a clique in $G_j$. So without loss of generality $\mathcal{S}_j = \bigcup_{v,w \in D_j \colon vw \notin E(G_j)} \mathcal{S}_{v,w}$.

We define another rooted simple acyclic graph $(H''_j, (v_0^j, v_0^j))$ with

\[ V(H''_j) := \{(v, w, 0) \mid v, w \in V(H_j),\ \mathrm{dist}_{H_j}(v_0^j, v) = \mathrm{dist}_{H_j}(v_0^j, w)\} \cup \{(v, w, 1) \mid v, w \in V(H_j),\ \mathrm{dist}_{H_j}(v_0^j, v) = \mathrm{dist}_{H_j}(v_0^j, w) + 1\} \cup \mathcal{S}_j \]

and

\[ E(H''_j) := \{((v_1, w_1, 0), (v_2, w_1, 1)) \in V(H''_j)^2 \mid (v_1, v_2) \in E(H_j)\} \cup \{((v_1, w_1, 1), (v_1, w_2, 0)) \in V(H''_j)^2 \mid (w_1, w_2) \in E(H_j)\} \cup \{((v, w), S) \mid S \in \mathcal{S}_{v,w}\}. \]

Again, $H''_j$ is defined in an isomorphism-invariant manner with respect to $(G_j, D_j)$. Moreover, $H''_j$ is $d$-ary and $\mathcal{S}_j \subseteq V(H''_j)$. Thus,

\[ \mathrm{Iso}((G_1, D_1), (G_2, D_2))[\mathcal{S}_1] \subseteq \mathrm{Iso}((H''_1, (v_0^1, v_0^1)), (H''_2, (v_0^2, v_0^2)))[\mathcal{S}_1] =: \Theta\theta, \]

and again $\Theta \in \widehat{\Gamma}_d$ by Lemma 6.6.3 and both objects can be computed in time $(k n_H)^{O((\log d)^c)}$ for some constant $c$ by Theorem 6.6.2.
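The peeling process used for the second item of the proof, which repeatedly removes all vertices of degree at most $k$ from $G_j[D_j]$, can be sketched in Python as follows. The sketch returns the level $i(v)$ and the closed neighbourhood $A(v)$ of every vertex. The adjacency-dictionary representation and the name `peel` are assumptions for this example, and the error raised when no vertex of degree at most $k$ exists relies on the fact that every graph of tree-width at most $k$ contains such a vertex.

```python
def peel(adj, D, k):
    """Return (level, A) with level[v] = i(v) and A[v] = N[v] in G^(i(v))."""
    current = set(D)               # vertex set of G^(i), starting with i = 0
    level, A, i = {}, {}, 0
    while current:
        low = {v for v in current if len(adj[v] & current) <= k}
        if not low:
            raise ValueError("minimum degree exceeds k, so tree-width exceeds k")
        for v in low:
            level[v] = i
            A[v] = (adj[v] & current) | {v}   # closed neighbourhood, size <= k + 1
        current -= low             # G^(i+1) keeps the vertices of degree > k
        i += 1
    return level, A


# A triangle 0-1-2 with a pendant vertex 3 attached to vertex 2.
tri = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
```

For $k = 2$ the vertices 0, 1 and 3 are removed in the first round and vertex 2 in the second. The clique $\{0, 1, 2\}$ is contained in $A(0)$ and in $A(1)$, in line with the observation that $C_j \subseteq A(v)$ for every $v \in a(C_j)$.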

Looking at the properties of the tree decompositions computed in Theorem 7.1.4 and 7.1.7 we have for every node $t$ that either the adhesion sets to the children are all equal or they are all distinct. Up to this point we only considered the problem that all adhesion sets are distinct (which is modeled by the isomorphism problem for coset-labeled hypergraphs). Thus, it still remains to discuss the case that all adhesion sets are equal.

Towards this end consider the following modification of the isomorphism problem for coset-labeled hypergraphs. Let $n \in \mathbb{N}$. Let $\mathcal{P} = (\Gamma_1, \dots, \Gamma_\ell)$ be a list of prototypes where each $\Gamma_i$, $i \in [\ell]$, has degree $n$. A $\mathcal{P}$-multi-labeled set of order $t$ is a tuple $\mathcal{H} = (V, p)$ where $V$ is a set of cardinality $|V| = n$ and $p$ is a function that maps each $i \in [t]$ to a pair $p(i) = (j, \theta)$ where $j \in [\ell]$ and $\theta\colon V \to [n]$ is a bijection. Two $\mathcal{P}$-multi-labeled sets $\mathcal{H}_1 = (V_1, p_1)$ and $\mathcal{H}_2 = (V_2, p_2)$ of order $t$ are isomorphic if there are bijections $\sigma\colon V_1 \to V_2$ and $\pi\colon [t] \to [t]$ such that

1. $p_1(i) = (j, \theta_1)$, $p_2(i^\pi) = (j, \theta_2)$ and there is some $\gamma \in \Gamma_j$ such that

\[ \sigma = \theta_1 \gamma \theta_2^{-1} \qquad (7.1) \]

for every $i \in [t]$.

In this case $\sigma$ is an isomorphism from $\mathcal{H}_1$ to $\mathcal{H}_2$, denoted by $\sigma\colon \mathcal{H}_1 \cong \mathcal{H}_2$. For $\Gamma \leq \mathrm{Sym}(V_1)$ and a bijection $\theta\colon V_1 \to V_2$ let

\[ \mathrm{Iso}_{\Gamma\theta}(\mathcal{H}_1, \mathcal{H}_2) := \{\sigma \in \Gamma\theta \mid \sigma\colon \mathcal{H}_1 \cong \mathcal{H}_2\}. \qquad (7.2) \]

Also, $\mathrm{Iso}(\mathcal{H}_1, \mathcal{H}_2)$ denotes the set of all isomorphisms from $\mathcal{H}_1$ to $\mathcal{H}_2$. As usual, an automorphism of $\mathcal{H}_1$ is an isomorphism from $\mathcal{H}_1$ to itself. Note that the set of automorphisms of a $\mathcal{P}$-multi-labeled set forms a group and the set of isomorphisms between two $\mathcal{P}$-multi-labeled sets forms a coset of the automorphism group of the first object. Hence, as in previous chapters, the set $\mathrm{Iso}_{\Gamma\theta}(\mathcal{H}_1, \mathcal{H}_2)$ admits representations of small size (i.e., polynomial in $n$).

Lemma 7.2.2. Let $\mathcal{H}_1 = (V_1, p_1)$ and $\mathcal{H}_2 = (V_2, p_2)$ be two $\mathcal{P}$-multi-labeled sets of order $t$ and suppose $n = |V_1| = |V_2|$. Then a representation for the set $\mathrm{Iso}(\mathcal{H}_1, \mathcal{H}_2)$ can be computed in time $O(n! \cdot (n + t)^c)$ for some constant $c$.

Proof. The algorithm iterates through all possible bijections $\sigma\colon V_1 \to V_2$. For $i_1 \in [t]$ let $A(i_1)$ be the set of all $i_2 \in [t]$ such that, for $p_1(i_1) = (j, \theta_1)$ and $p_2(i_2) = (j, \theta_2)$, there is some $\gamma \in \Gamma_j$ such that $\sigma = \theta_1 \gamma \theta_2^{-1}$. The set $A(i_1)$ can be computed in polynomial time by Theorem 5.1.4. Now consider the graph $G = ([t] \times \{0, 1\}, E)$ where

\[ E := \{((i_1, 0), (i_2, 1)) \mid i_2 \in A(i_1)\}. \]

Then $\sigma\colon \mathcal{H}_1 \cong \mathcal{H}_2$ if and only if $G$ contains a perfect matching. The latter can be checked in polynomial time in the size of $G$ (see, e.g., [93]).
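The matching step at the end of this proof can be illustrated by the following Python sketch. It assumes that the sets $A(i_1)$ have already been computed for a fixed bijection $\sigma$ and are passed as a dictionary with indices running over `range(t)` instead of $[t]$. The sketch uses a simple augmenting-path search, which is one standard polynomial-time method and not necessarily the algorithm of [93]; the function name is illustrative.

```python
def has_perfect_matching(A, t):
    """Decide whether the bipartite graph with edges (i1, i2), i2 in A[i1],
    on two copies of range(t) contains a perfect matching."""
    match = {}  # maps a right-hand index i2 to its current partner i1

    def augment(i1, seen):
        for i2 in A[i1]:
            if i2 in seen:
                continue
            seen.add(i2)
            # Use i2 if it is free or if its partner can be re-matched.
            if i2 not in match or augment(match[i2], seen):
                match[i2] = i1
                return True
        return False

    # If some i1 cannot be matched, no perfect matching exists.
    return all(augment(i1, set()) for i1 in range(t))
```

For example, the sets `{0: {0, 1}, 1: {0}, 2: {2}}` admit a perfect matching, whereas `{0: {0}, 1: {0}, 2: {1, 2}}` do not, because the indices 0 and 1 compete for the same partner.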

Lemma 7.2.3. Let $\mathcal{H}_1 = (V_1, p_1)$ and $\mathcal{H}_2 = (V_2, p_2)$ be two $\mathcal{P}$-multi-labeled sets of order $t$ and suppose $n = |V_1| = |V_2|$. Then a representation for the set $\mathrm{Iso}(\mathcal{H}_1, \mathcal{H}_2)$ can be computed in time $t! \cdot t \cdot 2^{O((\log n)^c)}$ for some constant $c$.

Proof. This time the algorithm iterates through all possible bijections $\pi\colon [t] \to [t]$. For every $i \in [t]$ consider $p_1(i) = (j_1, \theta_1)$ and $p_2(i^\pi) = (j_2, \theta_2)$. If $j_1 \neq j_2$ the bijection $\pi$ cannot be used to construct an isomorphism and can be ignored. So suppose $j := j_1 = j_2$. Let $\Delta_i \leq \mathrm{Sym}(V_1)$ and $\sigma_i\colon V_1 \to V_2$ such that

\[ \Delta_i \sigma_i = \theta_1 \Gamma_j \theta_1^{-1}\, \theta_1 \theta_2^{-1} = \theta_1 \Gamma_j \theta_2^{-1}. \]

Then each $\sigma \in \bigcap_{i \in [t]} \Delta_i \sigma_i$ gives an isomorphism from $\mathcal{H}_1$ to $\mathcal{H}_2$. The intersection of the cosets can be computed within the desired time by Corollary 5.5.4.

With this, we are finally ready to prove the central result of this chapter.

Theorem 7.2.4. The Graph Isomorphism Problem for graphs of tree-width at most $k$ can be solved in time $2^{O(k \cdot (\log k)^c)} n^d$ for some constants $c, d$.

Proof. Let $G_1$ and $G_2$ be two graphs and let $k \in \mathbb{N}$. We give a dynamic programming algorithm that either correctly determines that $\mathrm{tw}(G_j) > k$ for both $j \in \{1, 2\}$ or correctly decides whether $G_1 \cong G_2$. First, the algorithm computes the $k$-improved graphs $G_1^k$ and $G_2^k$ using Lemma 7.1.3. Let $((T_1, t_0^1), \beta_1)$ be the clique separator decomposition of $G_1^k$ and similarly, let $((T_2, t_0^2), \beta_2)$ be the clique separator decomposition of $G_2^k$ (see Theorem 7.1.4). Note that for each $t_j \in V(T_j)$ the graph $G_j^k[\beta_j(t_j)]$ is $k$-basic. Also observe that the decompositions are defined isomorphism-invariantly, that is, every isomorphism $\sigma\colon G_1 \cong G_2$ extends to an isomorphism between the clique separator decompositions.

Now the algorithm works recursively as follows. The input is a pair of tuples

\[ \mathcal{D}_j = (G_j, S_j, (T_j, t_0^j), \beta_j, \eta_j), \quad j \in \{1, 2\}, \]

where

(i) $G_j$ is a graph and $S_j \subseteq V(G_j)$,

(ii) $((T_j, t_0^j), \beta_j)$ is a rooted tree decomposition of $G_j$ such that $S_j \subseteq \beta_j(t_0^j)$, and

(iii) $\eta_j$ is a partial function that assigns nodes $t_j \in V(T_j)$ a structure graph for $\beta_j(t_j)$.

A node $t_j \in V(T_j)$ is unlabeled if $\eta_j(t_j)$ is not defined, otherwise the node is labeled by $\eta_j(t_j)$. Additionally, the input pair $(\mathcal{D}_1, \mathcal{D}_2)$ satisfies the following properties. For each unlabeled node $t_j \in V(T_j)$ it holds that

(U.1) the graph $G_j^k[\beta_j(t_j)]$ is clique-separator free, and

(U.2) each adhesion set of $t_j$ and also $S_j$ is a clique in $G_j^k$.

Moreover, for each labeled node $t_j$ it holds that

(L.1) $|\beta_j(t_j)| \leq c_L$,

(L.2) $|\beta_j(t_j) \cap \beta_j(s_j)| \leq c_M$ for all children $s_j$ of $t_j$,

(L.3) $\eta_j(t_j)$ is a $d$-ary structure graph for $\beta_j(t_j)$ where $d \leq c_M^2 + k + 1$ and $|V(\eta_j(t_j))| = 2^{O(k \log k)}$,

(L.4) for each bag $\beta_j(t_j)$ the adhesion sets of the children are all equal to $\beta_j(t_j)$ or the adhesion sets of the children are all distinct, and

(L.5) if the adhesion sets are all equal and $\beta_j(t_j)$ is no clique in $G_j^k$, then $|\beta_j(t_j)| \leq c_M$ and the number of children of $t_j$ is bounded by $k$.

The output of the recursive algorithm is the set $\mathrm{Iso}(\mathcal{D}_1, \mathcal{D}_2)$ of all isomorphisms $\sigma\colon G_1 \cong G_2$ that extend to isomorphisms between $\mathcal{D}_1$ and $\mathcal{D}_2$ (or the algorithm aborts in case $\mathrm{tw}(G_j) > k$ for both $j \in \{1, 2\}$). Applying the recursive procedure to the initially computed clique separator decompositions gives the set of all isomorphisms between the input graphs (setting $S_j = \emptyset$ and leaving all nodes unlabeled).

The algorithm distinguishes two cases depending on whether the root nodes are labeled. (If one root node is labeled while the other one is not the objects are not isomorphic.)

Case $t_0^j$, $j \in \{1, 2\}$, are both unlabeled: In this case $G_j^k[\beta_j(t_0^j)]$ is a $k$-basic graph for both $j \in \{1, 2\}$ by Condition (U.1). Let $v_1 \in \beta_1(t_0^1)$ be a vertex of degree at most $k$ in the graph $G_1^k[\beta_1(t_0^1)]$ (note that such a vertex exists by Lemma 7.1.3 and 2.1.3). Moreover, let $((T_1', s_0^1), \beta_1', \eta_1')$ be the labeled tree decomposition computed by Theorem 7.1.7 for $(G_1^k[\beta_1(t_0^1)], v_1)$.

For every $v_2 \in \beta_2(t_0^2)$ of degree at most $k$ in $G_2^k[\beta_2(t_0^2)]$ the algorithm performs the following steps. First let $((T_2', s_0^2), \beta_2', \eta_2')$ be the labeled tree decomposition computed by Theorem 7.1.7 for $(G_2^k[\beta_2(t_0^2)], v_2)$. Next, the algorithm computes tuples $D_j''(v_j) = (G_j, S_j, (T_j'', r_0^j), \beta_j'', \eta_j'')$. The rooted tree $(T_j'', r_0^j)$ is obtained from $(T_j, t_0^j)$ by replacing $t_0^j$ with the tree $(T_j', s_0^j)$ and connecting each child $t_j$ of $t_0^j$ to the highest node $s_j \in V(T_j')$ such that $\beta_j(t_j) \cap \beta_j(t_0^j) \subseteq \beta_j'(s_j)$. Note that such a bag exists since $\beta_j(t_j) \cap \beta_j(t_0^j)$ is a clique in $G_j^k$ by Condition (U.2) (cf. Lemma 2.1.2). The root node $r_0^j \in V(T_j')$ is the highest node such that $S_j \subseteq \beta_j'(r_0^j)$. For $t_j \in V(T_j) \setminus \{t_0^j\}$ define $\beta_j''(t_j) := \beta_j(t_j)$ and $\eta_j''(t_j) := \eta_j(t_j)$, and for $s_j \in V(T_j')$ define $\beta_j''(s_j) := \beta_j'(s_j)$ and $\eta_j''(s_j) := \eta_j'(s_j)$.

Clearly, the tuple $D_j''(v_j)$ satisfies (L.1), (L.2), (L.3) and (L.5) by the properties of the decompositions computed by Theorem 7.1.7. Also, by possibly introducing additional bags, the decompositions can be easily modified in order to satisfy Condition (L.4).

Finally, the algorithm recursively computes $\mathrm{Iso}(D_1''(v_1), D_2''(v_2))$ for all $v_2 \in \beta_2(t_0^2)$ of degree at most $k$ in $G_2^k[\beta_2(t_0^2)]$. Then,

$$\mathrm{Iso}(D_1, D_2) = \bigcup_{v_2 \in \beta_2(t_0^2)} \mathrm{Iso}(D_1''(v_1), D_2''(v_2))$$

since all operations are isomorphism-invariant (after choosing $v_1$ and $v_2$).

Case $t_0^j$, $j \in \{1, 2\}$, are both labeled: For $j \in \{1, 2\}$ let $t_1^j, \dots, t_\ell^j$ be the children of $t_0^j$ in the tree $(T_j, t_0^j)$. Also, let $(T_{j,i}, t_i^j)$ be the subtree of $(T_j, t_0^j)$ rooted at $t_i^j$ and let $V_{j,i} := \bigcup_{t \in V(T_{j,i})} \beta_j(t)$. Let $\beta_{j,i}$ be the restriction of $\beta_j$ to the vertex set $V(T_{j,i})$. Note that $((T_{j,i}, t_i^j), \beta_{j,i})$ is a rooted tree decomposition for $G_{j,i} := G_j[V_{j,i}]$. Let $S_{j,i} := V_{j,i} \cap \beta_j(t_0^j)$. Moreover, let $\eta_{j,i}$ be the restriction of $\eta_j$ to the set $V(T_{j,i})$. Finally, define $D_{j,i} := (G_{j,i}, S_{j,i}, (T_{j,i}, t_i^j), \beta_{j,i}, \eta_{j,i})$. Now, using recursion, the algorithm computes the sets

$$\mathrm{Iso}(D_{1,i_1}, D_{2,i_2})$$

for all $i_1, i_2 \in [\ell]$.

First suppose all adhesion sets are pairwise distinct. Let $D_1^*, \dots, D_p^*$ be a list of isomorphism types where $D_i^* = (G_i^*, S_i^*, (T_i^*, r_i^*), \beta_i^*, \eta_i^*)$. Also, without loss of generality, $S_i^*$ is an initial segment of the natural numbers for all $i \in [p]$. Consider a list of prototypes

$$\mathcal{P} := (\mathrm{Aut}(D_1^*)[S_1^*], \dots, \mathrm{Aut}(D_p^*)[S_p^*], \mathrm{Sym}([2]))$$

where the last prototype is used for the original edges of the graph $G$. Now define

$$H_j := (\beta_j(t_0^j), \mathcal{E}_j, p_j)$$

to be the $\mathcal{P}$-labeled hypergraph where

$$\mathcal{E}_j := \{S_{j,i} \mid i \in [\ell]\} \cup E(G_j[\beta_j(t_0^j)])$$

and $p_j$ associates each hyperedge with its prototype in the natural way. Then $D_1 \cong D_2$ if and only if $H_1 \cong H_2$. The latter can be decided by Theorem 6.6.7 using Lemma 7.2.1 in order to construct the required $\widehat{\Gamma}_d$-groups from the structure graphs $\eta_j(t_0^j)$, $j \in \{1, 2\}$. Note that $|E| \le c_M$ for all $E \in \mathcal{E}_j$ by Condition (L.2).

In the other case all adhesion sets are equal. Again, let $D_1^*, \dots, D_p^*$ be a list of isomorphism types where $D_i^* = (G_i^*, S_i^*, (T_i^*, r_i^*), \beta_i^*, \eta_i^*)$. Also, without loss of generality, $S_i^*$ is an initial segment of the natural numbers for all $i \in [p]$. Consider a list of prototypes

$$\mathcal{P} = (\mathrm{Aut}(D_1^*)[S_1^*], \dots, \mathrm{Aut}(D_p^*)[S_p^*], \mathrm{Sym}([|S_{i,j}| - 2]) \times \mathrm{Sym}([2]))$$

The strategy is similar to the previous case. If $\beta_j(t_0^j)$ is a clique then $|\beta_j(t_0^j)| \le k + 1$ (using Lemma 2.1.2) and Lemma 7.2.2 can be applied. In the other case $\ell \le k$ and $|\beta_j(t_0^j)| \le c_M$ using Condition (L.5). Then Lemma 7.2.3 is applied.
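The overall shape of this recursion, namely comparing the two roots and then recursing on pairs of child subtrees while storing each pairwise result, can be illustrated on a much simpler object. The following Python sketch is only a toy analogue under strong assumptions: it decides isomorphism of plain rooted trees (no bags, no structure graphs, no group-theoretic subroutines), and the function name and the dictionary encoding of trees are hypothetical choices made for this illustration, not part of the algorithm in the proof.

```python
from functools import lru_cache


def rooted_tree_isomorphic(children1, root1, children2, root2):
    """Decide whether two rooted trees are isomorphic.

    children1 / children2 map a node to the list of its children
    (leaves may be omitted). This is a toy stand-in for the recursion
    over pairs of decomposition nodes, not the algorithm of the proof.
    """

    @lru_cache(maxsize=None)  # one table entry per pair (u, v)
    def iso(u, v):
        kids_u = children1.get(u, [])
        kids_v = children2.get(v, [])
        if len(kids_u) != len(kids_v):
            return False
        # Isomorphism is an equivalence relation on subtrees, so a greedy
        # pairing of children succeeds whenever some pairing exists.
        unmatched = list(kids_v)
        for a in kids_u:
            for b in unmatched:
                if iso(a, b):
                    unmatched.remove(b)
                    break
            else:
                return False
        return True

    return iso(root1, root2)
```

Because results are cached per pair of nodes, the number of distinct subproblems is at most the product of the two tree sizes, which mirrors the polynomial bound on the number of recursive calls obtained by dynamic programming.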

This completes the description of the algorithm. It remains to analyze the running time. Using dynamic programming the number of recursive calls is polynomial in the number of vertices of $G$. In the case that $t_0^j$, $j \in \{1, 2\}$, are both unlabeled the bound on the running time follows from Theorem 7.1.7. In the other case an application of Lemma 7.2.1 runs in time $2^{O(k \log k (\log d)^c)} n^c$ for some constant $c$ where $d \le c_M^2 + k + 1 = O(k^4)$ using Condition (L.3). Also, applying Theorem 6.6.7 takes $(c_L + c_L^2 (k + 1))^{O((\log d)^c)} = 2^{O(k (\log k)^{c+1})}$. Finally, in case all adhesion sets are equal the bound on the running time follows from Lemma 7.2.2 and 7.2.3 using the bounds given above.

Using similar techniques, together with Grohe, Schweitzer and Wiebking, the author of this thesis also presented a canonization algorithm for graphs of bounded tree-width.

Proposition 7.2.5 (Grohe, N., Schweitzer, Wiebking [70]). There is a graph canonization for graphs of tree-width at most $k$ that can be computed in time $2^{O(k^2 \log k)} n^c$ for some constant $c$.

The slightly worse running time can be explained by the fact that, up to this point, there is no canonization algorithm for graphs of bounded degree matching the running time of the algorithm given in the previous chapter.

Indeed, only very recently, Babai presented a non-trivial modification of his quasipolynomial-time isomorphism test [11] obtaining a graph canonization that can be computed in quasipolynomial time [12]. With these new techniques at hand, it seems reasonable to assume that one can also give a graph canonization for graphs of degree at most $d$ that can be computed in time $n^{\mathrm{polylog}(d)}$. And in turn, this result should imply a graph canonization algorithm for graphs of bounded tree-width whose running time matches the running time of the isomorphism test presented in this chapter. We leave both problems open in this thesis.

Also, I remark that, building on some of the techniques presented in this chapter, Wiebking very recently presented an isomorphism test for graphs of tree-width at most $k$ running in time $n^{\mathrm{polylog}(k)}$ [150]. Note that, in the previous chapter, we proved such a result for the parameter $d$, the maximum degree of the input graphs (cf. Theorem 6.6.1). The key ingredient of Wiebking's result is an algorithm solving the isomorphism problem for multi-labeled sets of order $t$ in time $(n + t)^{\mathrm{polylog}(n)}$ (where $n$ denotes the size of the universe).

Chapter 8

Discussion

In this thesis we have enhanced our understanding of the power of the most important algorithmic approaches to tackle the Graph Isomorphism Problem. In doing so, we obtained new algorithms for solving the isomorphism problem for classes of graphs of bounded maximum degree, tree-width and rank-width. Also, we obtained new lower bounds showing the limits of important approaches to the isomorphism problem, which include the first exponential lower bounds on a large class of algorithms within the I/R framework. We conclude this thesis by discussing further research questions related to the Graph Isomorphism Problem.

The Weisfeiler-Leman Algorithm

Let us start with the Weisfeiler-Leman algorithm. A common way to analyze the power of this algorithm is to determine the Weisfeiler-Leman dimension of single graphs or classes of graphs. This is also what we have done in this thesis, providing upper and lower bounds on the Weisfeiler-Leman dimension of graphs of bounded tree-width and rank-width. Naturally, it would be desirable to further close these gaps. Indeed, while there are recent efforts to improve on existing bounds (see, e.g., [65, 89, 90]), there are very few natural classes of graphs for which the Weisfeiler-Leman dimension is known exactly.

Related to that, there is also the question whether the Weisfeiler-Leman dimension of graphs can be determined algorithmically. In this direction, the graphs identified by the Color Refinement algorithm are characterized in [3, 91], and given any graph $G$, it can be tested in almost linear time whether $G$ is identified by the 1-dimensional Weisfeiler-Leman algorithm. But already for the 2-dimensional Weisfeiler-Leman algorithm this problem becomes far more involved. Indeed, an important stumbling block is already our lack of understanding which strongly regular graphs are uniquely determined by their parameters. A first step towards understanding which graphs are identified by the 2-dimensional Weisfeiler-Leman algorithm is given in [55], solving the problem for graphs of maximum color class size four.

Besides analyzing the Weisfeiler-Leman algorithm on graphs it would also be interesting to consider different types of objects. Indeed, an important example in this direction are groups. The Group Isomorphism Problem asks, given two groups by multiplication tables, whether they are isomorphic. This problem can be trivially solved in time $n^{\log n + c}$ for some constant $c$ where $n$ denotes the size of the group, and no algorithms running in time $n^{o(\log n)}$ are known at this point.
(An algorithm for the Group Isomorphism Problem running in time $n^{\frac{1}{2} \log n + c}$ is for example given in [135].) Indeed, with this problem being reducible to the Graph Isomorphism Problem, it forms an important hurdle for further improving the complexity of the Graph Isomorphism Problem. As for graphs, a natural combinatorial approach to group isomorphism is the Weisfeiler-Leman algorithm and so far, it is unknown whether the $k$-dimensional variant serves as a complete isomorphism test for groups for any fixed number $k$.

Another, closely related candidate in this direction, which actually may be easier to understand, is the String Isomorphism Problem for general linear groups (see also [11, Section 5.2]). Here, the task is, given two colorings $\chi_1, \chi_2 : \mathbb{F}_q^k \to C$ where $\mathbb{F}_q$ denotes the $q$-element field, to determine whether there is a color-preserving linear transformation from $(\mathbb{F}_q^k, \chi_1)$ to $(\mathbb{F}_q^k, \chi_2)$. As before, it is unknown whether the $k$-dimensional Weisfeiler-Leman algorithm yields a complete isomorphism test for any fixed dimension $k$.

Besides investigating the Weisfeiler-Leman dimension of certain objects, it is also worthwhile to consider what guarantees on the output of the $k$-dimensional Weisfeiler-Leman algorithm are achievable in general. Indeed, for his quasipolynomial time algorithm, Babai utilizes the Weisfeiler-Leman algorithm for dimension $O(\log n)$, for example to turn (after individualizing a small number of vertices) a relational structure of arity $O(\log n)$ into a graph in an isomorphism-invariant manner without introducing too many additional symmetries. Moreover, given a graph without too many symmetries, Babai further utilizes the Weisfeiler-Leman algorithm to extract (again, after individualizing a small number of auxiliary objects) a non-trivial coloring of the vertices, a non-trivial partition of the vertices, or a Johnson scheme in an isomorphism-invariant manner.
In a similar direction, Sun and Wilmes [142, 143] proved that, with known exceptions, the 2-dimensional Weisfeiler-Leman algorithm either finds a non-trivial partition of the vertices (i.e., a non-trivial coloring of the vertices or non-trivial components with respect to some edge-color) or one can individualize $O(n^{1/3}\,\mathrm{polylog}(n))$ many vertices such that the Color Refinement algorithm completely splits the graph. Actually, replacing the combinatorial methods in Babai's quasipolynomial time algorithm, this result gives an isomorphism test running in time $2^{O(n^{1/3}\,\mathrm{polylog}(n))}$.
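The Color Refinement algorithm (the 1-dimensional Weisfeiler-Leman algorithm) referred to throughout this section can be sketched in a few lines. The following Python code is a minimal, non-optimized illustration and not the almost-linear-time implementation alluded to above; the function names and the adjacency-dictionary encoding are assumptions made for this sketch. To compare two graphs it refines their disjoint union, so that colors are comparable across both graphs.

```python
from collections import Counter


def color_refinement(adj):
    """Return a stable vertex coloring of the graph given by adj.

    adj maps each vertex to an iterable of its neighbours (undirected graph).
    """
    colors = {v: 0 for v in adj}  # start with the uniform coloring
    while True:
        # New signature: old color plus the multiset of neighbour colors.
        sig = {v: (colors[v], tuple(sorted(colors[w] for w in adj[v])))
               for v in adj}
        # Rename signatures to small integers in a canonical (sorted) order.
        ids = {s: i for i, s in enumerate(sorted(set(sig.values())))}
        new = {v: ids[sig[v]] for v in adj}
        # Classes are only ever split, so an unchanged count means stability.
        if len(set(new.values())) == len(set(colors.values())):
            return new
        colors = new


def distinguishes(adj1, adj2):
    """True if Color Refinement tells the two graphs apart."""
    union = {(1, v): [(1, w) for w in nbrs] for v, nbrs in adj1.items()}
    union.update({(2, v): [(2, w) for w in nbrs] for v, nbrs in adj2.items()})
    colors = color_refinement(union)
    hist1 = Counter(c for (side, _), c in colors.items() if side == 1)
    hist2 = Counter(c for (side, _), c in colors.items() if side == 2)
    return hist1 != hist2
```

For instance, a cycle on six vertices and the disjoint union of two triangles are both 2-regular and are therefore not distinguished by this procedure, which is the standard example of a pair of graphs on which Color Refinement fails.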

FPT Algorithms

Another important research direction stems from the question for which parameters the isomorphism problem becomes fixed-parameter tractable. There are various results in this direction providing fixed-parameter tractable isomorphism tests for example for color multiplicity, eigenvalue multiplicity, feedback vertex set number, etc. I refer to the Diploma Thesis of Fuhlbrück [54] for an overview on some earlier results in this direction whereas the work of Schweitzer and Otachi gives various results related to width parameters [128].

However, for a long time, it remained open whether the isomorphism problem is fixed-parameter tractable for well-studied parameters such as tree-width, genus or maximum degree. In 2014, Lokshtanov, Pilipczuk, Pilipczuk and Saurabh presented the first fixed-parameter tractable isomorphism test parameterized by the tree-width of the input graphs (an improved algorithm for this problem is given in the previous chapter). Only shortly after, Kawarabayashi [88] announced a linear time isomorphism test for all graph classes of bounded genus.

But still, several questions remain open. Probably the most striking open problem is whether the isomorphism problem is fixed-parameter tractable with respect to the maximum degree of the input graphs. Despite the progress made in this thesis on the complexity of the isomorphism problem for bounded degree graphs (see Chapter 6) this question is still wide open and it seems unlikely that the techniques established in this thesis have an impact on this question. Another open question is whether isomorphism testing becomes fixed-parameter tractable when parameterized by the rank-width of the input graphs. Indeed, with improving the complexity of graph isomorphism for graph classes of rank-width at most $k$ from $n^{f(k)}$ for a non-elementary function $f$ to $n^{O(k)}$ (see Section 3.2) the next natural step seems to be the quest for an fpt algorithm.
Besides identifying parameters for which the isomorphism problem becomes fixed-parameter tractable, another question is which dependence on the parameter can be achieved. For standard NP-complete problems one usually strives for fpt algorithms running in time $2^{O(\kappa)}\,\mathrm{poly}(n)$ where $\kappa$ is the parameter under investigation. However, in light of Babai's recent quasipolynomial time algorithm [11], one may hope for algorithms testing isomorphism in time $2^{\mathrm{polylog}(\kappa)}\,\mathrm{poly}(n)$. For which parameters $\kappa$ can the isomorphism problem be solved in this time? Note that a first example is already given by Babai in his breakthrough paper [11] for the color multiplicity of the input graphs. In particular, one can ask whether the isomorphism problem parameterized by tree-width $k$ can be solved in time $2^{\mathrm{polylog}(k)}\,\mathrm{poly}(n)$. As an intermediate step, Wiebking very recently presented an isomorphism test running in time $n^{\mathrm{polylog}(k)}$ for input graphs of tree-width at most $k$ [150].
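To see why a running time of the form $n^{\mathrm{polylog}(k)}$ is only an intermediate step towards $2^{\mathrm{polylog}(k)}\,\mathrm{poly}(n)$, the following short calculation may help. Here $p$ stands for an arbitrary polylogarithmic function and $c$ for a constant; both symbols are introduced only for this illustration.

```latex
% Assumption: n >= 2, so that 2 <= n and hence 2^{p(k)} <= n^{p(k)}.
\[
  2^{p(k)} \cdot n^{c} \;\le\; n^{p(k)} \cdot n^{c} \;=\; n^{p(k) + c}.
\]
% Conversely, n^{p(k)} = 2^{p(k) \log n} is not bounded by f(k) \cdot n^{c}
% for any function f and constant c once p is unbounded: for every k with
% p(k) > c, the ratio n^{p(k)} / n^{c} tends to infinity as n grows.
```

So a bound of $2^{\mathrm{polylog}(k)}\,\mathrm{poly}(n)$ would imply a bound of $n^{\mathrm{polylog}(k)}$, but not the other way around.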

Hardness Results

Finally, a topic that was not covered in this thesis is the quest for lower bounds on the complexity of the Graph Isomorphism Problem. An important question in this direction raised by Jacobo Torán is whether the Graph Isomorphism Problem is hard for the complexity class PTIME under LOGSPACE many-one reductions [4]. Up to this point, the best known result in this direction is that the Graph Isomorphism Problem is hard under $\mathrm{AC}^0$-reductions for the complexity class DET which contains all problems $\mathrm{NC}^1$-reducible to the determinant [145].

Bibliography

[1] Miklós Ajtai. Recursive construction for 3-regular expanders. Combinatorica, 14(4):379–416, 1994.

[2] Stefan Arnborg, Derek G. Corneil, and Andrzej Proskurowski. Complexity of finding embeddings in a k-tree. SIAM J. Algebraic Discrete Methods, 8(2):277–284, April 1987.

[3] Vikraman Arvind, Johannes Köbler, Gaurav Rattan, and Oleg Verbitsky. Graph isomorphism, color refinement, and compactness. Computational Complexity, 26(3):627–685, 2017.

[4] Vikraman Arvind and Jacobo Torán. Isomorphism testing: Perspective and open problems. Bulletin of the EATCS, 86:66–84, 2005.

[5] Albert Atserias and Elitza N. Maneva. Sherali-Adams relaxations and indistinguishability in counting logics. SIAM J. Comput., 42(1):112–137, 2013.

[6] László Babai. On the isomorphism problem. Unpublished manuscript cited in [110], 1977.

[7] László Babai. Lectures on graph isomorphism. University of Toronto, Department of Computer Science, October 1979.

[8] László Babai. Monte Carlo algorithms in graph isomorphism testing. Technical Report 79-10, Université de Montréal, 1979.

[9] László Babai. On the order of uniprimitive permutation groups. Ann. of Math. (2), 113(3):553–568, 1981.

[10] László Babai. Graph isomorphism in quasipolynomial time. CoRR, abs/1512.03547v2, 2015.

[11] László Babai. Graph isomorphism in quasipolynomial time [extended abstract]. In Daniel Wichs and Yishay Mansour, editors, Proceedings of the 48th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2016, Cambridge, MA, USA, June 18-21, 2016, pages 684–697. ACM, 2016.

[12] László Babai. Canonical form for graphs in quasipolynomial time: preliminary report. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 1237–1246. ACM, 2019.

[13] László Babai, Peter J. Cameron, and Péter P. Pálfy. On the orders of primitive groups with restricted nonabelian composition factors. J. Algebra, 79(1):161–168, 1982.


[14] László Babai, Xi Chen, Xiaorui Sun, Shang-Hua Teng, and John Wilmes. Faster canonical forms for strongly regular graphs. In 54th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2013, 26-29 October, 2013, Berkeley, CA, USA, pages 157–166. IEEE Computer Society, 2013.

[15] László Babai and Paolo Codenotti. Isomorphism of hypergraphs of low rank in moderately exponential time. In 49th Annual IEEE Symposium on Foundations of Computer Science, FOCS 2008, October 25-28, 2008, Philadelphia, PA, USA, pages 667–676. IEEE Computer Society, 2008.

[16] László Babai, Anuj Dawar, Pascal Schweitzer, and Jacobo Torán. The graph isomorphism problem (Dagstuhl seminar 15511). Dagstuhl Reports, 5(12):1–17, 2015.

[17] László Babai, Paul Erdös, and Stanley M. Selkow. Random graph isomorphism. SIAM J. Comput., 9(3):628–635, 1980.

[18] László Babai, D. Yu. Grigoryev, and David M. Mount. Isomorphism of graphs with bounded eigenvalue multiplicity. In Harry R. Lewis, Barbara B. Simons, Walter A. Burkhard, and Lawrence H. Landweber, editors, Proceedings of the 14th Annual ACM Symposium on Theory of Computing, May 5-7, 1982, San Francisco, California, USA, pages 310–324. ACM, 1982.

[19] László Babai, William M. Kantor, and Eugene M. Luks. Computational complexity and the classification of finite simple groups. In 24th Annual Symposium on Foundations of Computer Science, Tucson, Arizona, USA, 7-9 November 1983, pages 162–171. IEEE Computer Society, 1983.

[20] László Babai and László Lovász. Permutation groups and almost regular graphs. Studia Sci. Math. Hungar., 8:141–150, 1973.

[21] László Babai and Eugene M. Luks. Canonical labeling of graphs. In David S. Johnson, , Michael L. Fredman, David Harel, Richard M. Karp, Nancy A. Lynch, Christos H. Papadimitriou, Ronald L. Rivest, Walter L. Ruzzo, and Joel I. Seiferas, editors, Proceedings of the 15th Annual ACM Symposium on Theory of Computing, 25-27 April, 1983, Boston, Massachusetts, USA, pages 171–183. ACM, 1983.
[22] László Babai, Eugene M. Luks, and Ákos Seress. Permutation groups in NC. In Alfred V. Aho, editor, Proceedings of the 19th Annual ACM Symposium on Theory of Computing, 1987, New York, New York, USA, pages 409–420. ACM, 1987.

[23] László Babai and John Wilmes. Quasipolynomial-time canonical form for Steiner designs. In , , and Joan Feigenbaum, editors, Symposium on Theory of Computing Conference, STOC'13, Palo Alto, CA, USA, June 1-4, 2013, pages 261–270. ACM, 2013.

[24] Christoph Berkholz, Paul S. Bonsma, and Martin Grohe. Tight lower and upper bounds for the complexity of canonical colour refinement. Theory Comput. Syst., 60(4):581–614, 2017.

[25] Christoph Berkholz and Martin Grohe. Limitations of algebraic approaches to graph isomorphism testing. In Magnús M. Halldórsson, Kazuo Iwama, Naoki Kobayashi, and Bettina Speckmann, editors, Automata, Languages, and Programming - 42nd International Colloquium, ICALP 2015, Kyoto, Japan, July 6-10, 2015, Proceedings, Part I, volume 9134 of Lecture Notes in Computer Science, pages 155–166. Springer, 2015.

[26] Christoph Berkholz and Martin Grohe. Linear diophantine equations, group csps, and graph isomorphism. In Philip N. Klein, editor, Proceedings of the Twenty-Eighth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2017, Barcelona, Spain, Hotel Porta Fira, January 16-19, pages 327–339. SIAM, 2017.

[27] Hans L. Bodlaender. Polynomial algorithms for graph isomorphism and chromatic index on partial k-trees. J. Algorithms, 11(4):631–643, 1990.

[28] Kellogg S. Booth and Charles J. Colbourn. Problems polynomially equivalent to graph isomorphism. Technical report, Computer Science Department, University of Waterloo, 1979.

[29] Ravi B. Boppana, Johan Håstad, and Stathis Zachos. Does co-NP have short interactive proofs? Inf. Process. Lett., 25(2):127–132, 1987.

[30] John N. Bray, Derek F. Holt, and Colva M. Roney-Dougal. The Maximal Subgroups of the Low-Dimensional Finite Classical Groups, volume 407 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 2013.

[31] Jin-yi Cai, Martin Fürer, and . An optimal lower bound on the number of variables for graph identifications. Combinatorica, 12(4):389–410, 1992.

[32] Neil J. Calkin. Dependent sets of constant weight binary vectors. Combinatorics, Probability & Computing, 6(3):263–271, 1997.

[33] Peter J. Cameron. Finite permutation groups and finite simple groups. Bull. London Math. Soc., 13(1):1–22, 1981.

[34] Gang Chen and Ilia N. Ponomarenko. Lectures on coherent configurations. http://www.pdmi.ras.ru/~inp/ccNOTES.pdf, 2019.

[35] Paolo Codenotti, Hadi Katebi, Karem A. Sakallah, and Igor L. Markov. Conflict analysis and branching heuristics in the search for graph automorphisms. In 25th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2013, Herndon, VA, USA, November 4-6, 2013, pages 907–914. IEEE Computer Society, 2013.

[36] Charles J. Colbourn and Kellogg S. Booth. Linear time automorphism algorithms for trees, interval graphs, and planar graphs. SIAM J. Comput., 10(1):203–225, 1981.

[37] J. H. Conway, R. T. Curtis, S. P. Norton, R. A. Parker, and R. A. Wilson. Atlas of finite groups. Oxford University Press, Eynsham, 1985. Maximal Subgroups and Ordinary Characters for Simple Groups, With computational assistance from J. G. Thackray.

[38] Bruce N. Cooperstein. Minimal degree for a permutation representation of a classical group. Israel J. Math., 30(3):213–235, 1978.

[39] Derek G. Corneil and Udi Rotics. On the relationship between clique-width and treewidth. SIAM J. Comput., 34(4):825–847, 2005.

[40] Bruno Courcelle, Johann A. Makowsky, and Udi Rotics. Linear time solvable optimization problems on graphs of bounded clique-width. Theory Comput. Syst., 33(2):125–150, 2000.

[41] Bruno Courcelle and Stephan Olariu. Upper bounds to the clique width of graphs. Discrete Applied Mathematics, 101(1-3):77–114, 2000.

[42] Samir Datta, Nutan Limaye, Prajakta Nimbhorkar, Thomas Thierauf, and Fabian Wagner. Planar graph isomorphism is in log-space. In Proceedings of the 24th Annual IEEE Conference on Computational Complexity, CCC 2009, Paris, France, 15-18 July 2009, pages 203–214. IEEE Computer Society, 2009.

[43] Anuj Dawar and David Richerby. The power of counting logics on restricted classes of finite structures. In Jacques Duparc and Thomas A. Henzinger, editors, Computer Science Logic, 21st International Workshop, CSL 2007, 16th Annual Conference of the EACSL, Lausanne, Switzerland, September 11-15, 2007, Proceedings, volume 4646 of Lecture Notes in Computer Science, pages 84–98. Springer, 2007.

[44] Holger Dell, Martin Grohe, and Gaurav Rattan. Lovász meets Weisfeiler and Leman. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, volume 107 of LIPIcs, pages 40:1–40:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018.

[45] John D. Dixon and Brian Mortimer. Permutation Groups, volume 163 of Graduate Texts in Mathematics. Springer-Verlag, New York, 1996.

[46] Rodney G. Downey and Michael R. Fellows. Fundamentals of Parameterized Complexity. Texts in Computer Science. Springer, 2013.

[47] Michael Elberfeld and Ken-ichi Kawarabayashi. Embedding and canonizing graphs of bounded genus in Logspace. In David B. Shmoys, editor, Symposium on Theory of Computing, STOC 2014, New York, NY, USA, May 31 - June 03, 2014, pages 383–392. ACM, 2014.

[48] Michael Elberfeld and Pascal Schweitzer. Canonizing graphs of bounded tree width in Logspace. TOCT, 9(3):12:1–12:29, 2017.

[49] Wolfgang Espelage, Frank Gurski, and Egon Wanke. How to solve NP-hard graph problems on clique-width bounded graphs in polynomial time. In Andreas Brandstädt and Van Bang Le, editors, Graph-Theoretic Concepts in Computer Science, 27th International Workshop, WG 2001, Boltenhagen, Germany, June 14-16, 2001, Proceedings, volume 2204 of Lecture Notes in Computer Science, pages 117–128. Springer, 2001.

[50] Sergei Evdokimov and Ilia N. Ponomarenko. Permutation group approach to association schemes. Eur. J. Comb., 30(6):1456–1476, 2009.

[51] Sergei Evdokimov, Ilia N. Ponomarenko, and Gottfried Tinhofer. Forestal algebras and algebraic forests (on a new class of weakly compact graphs). Discrete Mathematics, 225(1-3):149–172, 2000.

[52] I. S. Filotti and Jack N. Mayer. A polynomial-time algorithm for determining the isomorphism of graphs of fixed genus (working paper). In Raymond E. Miller, Seymour Ginsburg, Walter A. Burkhard, and Richard J. Lipton, editors, Proceedings of the 12th Annual ACM Symposium on Theory of Computing, April 28-30, 1980, Los Angeles, California, USA, pages 236–243. ACM, 1980.

[53] Jörg Flum and Martin Grohe. Parameterized Complexity Theory. Texts in Theoretical Computer Science. An EATCS Series. Springer, 2006.

[54] Frank Fuhlbrück. Fixed-parameter tractability of the graph isomorphism and canonization problems. Diploma thesis, Humboldt-Universität zu Berlin, 2013.

[55] Frank Fuhlbrück, Johannes Köbler, and Oleg Verbitsky. Identifiability of graphs with small color classes by the Weisfeiler-Leman algorithm. CoRR, abs/1907.02892, 2019.

[56] Martin Fürer. Graph isomorphism testing without numerics for graphs of bounded eigenvalue multiplicity. In Kenneth L. Clarkson, editor, Proceedings of the Sixth Annual ACM-SIAM Symposium on Discrete Algorithms, 22-24 January 1995, San Francisco, California, USA, pages 624–631. ACM/SIAM, 1995.

[57] Merrick L. Furst, John E. Hopcroft, and Eugene M. Luks. Polynomial-time algorithms for permutation groups. In 21st Annual Symposium on Foundations of Computer Science, Syracuse, New York, USA, 13-15 October 1980, pages 36–41. IEEE Computer Society, 1980.

[58] David Gluck, Ákos Seress, and Aner Shalev. Bases for primitive permutation groups and a conjecture of Babai. J. Algebra, 199(2):367–378, 1998.

[59] Oded Goldreich, , and . Proofs that yield nothing but their validity and a methodology of cryptographic protocol design (extended abstract). In 27th Annual Symposium on Foundations of Computer Science, Toronto, Canada, 27-29 October 1986, pages 174–187. IEEE Computer Society, 1986.

[60] Shafi Goldwasser and Michael Sipser. Private coins versus public coins in interactive proof systems. In Juris Hartmanis, editor, Proceedings of the 18th Annual ACM Symposium on Theory of Computing, May 28-30, 1986, Berkeley, California, USA, pages 59–68. ACM, 1986.

[61] Martin Grohe. Fixed-point logics on planar graphs. In Thirteenth Annual IEEE Symposium on Logic in Computer Science, Indianapolis, Indiana, USA, June 21-24, 1998, pages 6–15. IEEE Computer Society, 1998.

[62] Martin Grohe. Isomorphism testing for embeddable graphs through definability. In F. Frances Yao and Eugene M. Luks, editors, Proceedings of the Thirty-Second Annual ACM Symposium on Theory of Computing, May 21-23, 2000, Portland, OR, USA, pages 63–72. ACM, 2000.

[63] Martin Grohe. Fixed-point definability and polynomial time on graphs with excluded minors. J. ACM, 59(5):27:1–27:64, 2012.

[64] Martin Grohe. Descriptive Complexity, Canonisation, and Definable Graph Structure Theory, volume 47 of Lecture Notes in Logic. Association for Symbolic Logic, Ithaca, NY; Cambridge University Press, Cambridge, 2017.

[65] Martin Grohe and Sandra Kiefer. A linear upper bound on the Weisfeiler-Leman dimension of graphs of bounded genus. In Christel Baier, Ioannis Chatzigiannakis, Paola Flocchini, and Stefano Leonardi, editors, 46th International Colloquium on Automata, Languages, and Programming, ICALP 2019, July 9-12, 2019, Patras, Greece, volume 132 of LIPIcs, pages 117:1–117:15. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2019.

[66] Martin Grohe and Julian Mariño. Definability and descriptive complexity on databases of bounded tree-width. In Catriel Beeri and Peter Buneman, editors, Database Theory - ICDT '99, 7th International Conference, Jerusalem, Israel, January 10-12, 1999, Proceedings, volume 1540 of Lecture Notes in Computer Science, pages 70–82. Springer, 1999.

[67] Martin Grohe and Dániel Marx. Structure theorem and isomorphism test for graphs with excluded topological subgraphs. SIAM J. Comput., 44(1):114–159, 2015.

[68] Martin Grohe and Daniel Neuen. Canonisation and definability for graphs of bounded rank width. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, Vancouver, BC, Canada, June 24-27, 2019, pages 1–13. IEEE, 2019.

[69] Martin Grohe, Daniel Neuen, and Pascal Schweitzer. A faster isomorphism test for graphs of small degree. In Mikkel Thorup, editor, 59th IEEE Annual Symposium on Foundations of Computer Science, FOCS 2018, Paris, France, October 7-9, 2018, pages 89–100. IEEE Computer Society, 2018.

[70] Martin Grohe, Daniel Neuen, Pascal Schweitzer, and Daniel Wiebking. An improved isomorphism test for bounded-tree-width graphs. In Ioannis Chatzigiannakis, Christos Kaklamanis, Dániel Marx, and Donald Sannella, editors, 45th International Colloquium on Automata, Languages, and Programming, ICALP 2018, July 9-13, 2018, Prague, Czech Republic, volume 107 of LIPIcs, pages 67:1–67:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2018.

[71] Martin Grohe and Martin Otto. Pebble games and linear equations. J. Symb. Log., 80(3):797–844, 2015.

[72] Martin Grohe and Wied Pakusa. Descriptive complexity of linear equation systems and applications to propositional proof complexity. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pages 1–12. IEEE Computer Society, 2017.

[73] Martin Grohe and Pascal Schweitzer. Isomorphism testing for graphs of bounded rank width. In Venkatesan Guruswami, editor, IEEE 56th Annual Symposium on Foundations of Computer Science, FOCS 2015, Berkeley, CA, USA, 17-20 October, 2015, pages 1010–1029. IEEE Computer Society, 2015.

[74] Martin Grohe and Pascal Schweitzer. Computing with tangles. SIAM J. Discrete Math., 30(2):1213–1247, 2016.

[75] Yuri Gurevich and Saharon Shelah. On finite rigid structures. J. Symb. Log., 61(2):549–562, 1996.

[76] Lauri Hella. Logical hierarchies in PTIME. Inf. Comput., 129(1):1–19, 1996.

[77] Derek F. Holt, Bettina Eick, and Eamonn A. O’Brien. Handbook of Computational Group Theory. Discrete Mathematics and its Applications (Boca Raton). Chapman & Hall/CRC, Boca Raton, FL, 2005.

[78] John E. Hopcroft and Robert Endre Tarjan. A $V^2$ algorithm for determining isomorphism of planar graphs. Inf. Process. Lett., 1(1):32–34, 1971.

[79] John E. Hopcroft and Robert Endre Tarjan. Isomorphism of planar graphs. In Raymond E. Miller and James W. Thatcher, editors, Proceedings of a symposium on the Complexity of Computer Computations, held March 20-22, 1972, at the IBM Thomas J. Watson Research Center, Yorktown Heights, New York, USA, The IBM Research Symposia Series, pages 131–152. Plenum Press, New York, 1972.

[80] John E. Hopcroft and Robert Endre Tarjan. A V log V algorithm for isomorphism of triconnected planar graphs. J. Comput. Syst. Sci., 7(3):323–331, 1973.

[81] John E. Hopcroft and J. K. Wong. Linear time algorithm for isomorphism of planar graphs (preliminary report). In Robert L. Constable, Robert W. Ritchie, Jack W. Carlyle, and Michael A. Harrison, editors, Proceedings of the 6th Annual ACM Symposium on Theory of Computing, April 30 - May 2, 1974, Seattle, Washington, USA, pages 172–184. ACM, 1974.

[82] Neil Immerman and Eric Lander. Describing graphs: A first-order approach to graph canonization. In Alan L. Selman, editor, Complexity Theory Retrospective: In Honor of Juris Hartmanis on the Occasion of His Sixtieth Birthday, July 5, 1988, pages 59–81. Springer New York, New York, NY, 1990.

[83] Russell Impagliazzo and Ramamohan Paturi. On the complexity of k-SAT. J. Comput. Syst. Sci., 62(2):367–375, 2001.

[84] Birgit Jenner, Johannes Köbler, Pierre McKenzie, and Jacobo Torán. Completeness results for graph isomorphism. J. Comput. Syst. Sci., 66(3):549–566, 2003.

[85] Tommi A. Junttila and Petteri Kaski. Engineering an efficient canonical labeling tool for large and sparse graphs. In Proceedings of the Ninth Workshop on Algorithm Engineering and Experiments, ALENEX 2007, New Orleans, Louisiana, USA, January 6, 2007. SIAM, 2007.

[86] Tommi A. Junttila and Petteri Kaski. Conflict propagation and component recursion for canonical labeling. In Alberto Marchetti-Spaccamela and Michael Segal, editors, Theory and Practice of Algorithms in (Computer) Systems - First International ICST Conference, TAPAS 2011, Rome, Italy, April 18-20, 2011. Proceedings, volume 6595 of Lecture Notes in Computer Science, pages 151–162. Springer, 2011.

[87] William M. Kantor and Eugene M. Luks. Computing in quotient groups. In Harriet Ortiz, editor, Proceedings of the 22nd Annual ACM Symposium on Theory of Computing, May 13-17, 1990, Baltimore, Maryland, USA, pages 524–534. ACM, 1990.

[88] Ken-ichi Kawarabayashi. Graph isomorphism for bounded genus graphs in linear time. CoRR, abs/1511.02460, 2015.

[89] Sandra Kiefer and Daniel Neuen. The power of the Weisfeiler-Leman algorithm to decompose graphs. In Peter Rossmanith, Pinar Heggernes, and Joost-Pieter Katoen, editors, 44th International Symposium on Mathematical Foundations of Computer Science, MFCS 2019, August 26-30, 2019, Aachen, Germany, volume 138 of LIPIcs, pages 45:1–45:15. Schloss Dagstuhl - Leibniz-Zentrum für Informatik, 2019.

[90] Sandra Kiefer, Ilia N. Ponomarenko, and Pascal Schweitzer. The Weisfeiler-Leman dimension of planar graphs is at most 3. In 32nd Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2017, Reykjavik, Iceland, June 20-23, 2017, pages 1–12. IEEE Computer Society, 2017.

[91] Sandra Kiefer, Pascal Schweitzer, and Erkal Selman. Graphs identified by logics with counting. In Giuseppe F. Italiano, Giovanni Pighizzini, and Donald Sannella, editors, Mathematical Foundations of Computer Science 2015 - 40th International Symposium, MFCS 2015, Milan, Italy, August 24-28, 2015, Proceedings, Part I, volume 9234 of Lecture Notes in Computer Science, pages 319–330. Springer, 2015.

[92] Peter Kleidman and Martin W. Liebeck. The Subgroup Structure of the Finite Classical Groups, volume 129 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1990.

[93] Jon M. Kleinberg and Éva Tardos. Algorithm Design. Addison-Wesley, 2006.

[94] Ton Kloks. Treewidth, Computations and Approximations, volume 842 of Lecture Notes in Computer Science. Springer, 1994.

[95] Johannes Köbler. On graph isomorphism for restricted graph classes. In Arnold Beckmann, Ulrich Berger, Benedikt Löwe, and John V. Tucker, editors, Logical Approaches to Computational Barriers, Second Conference on Computability in Europe, CiE 2006, Swansea, UK, June 30-July 5, 2006, Proceedings, volume 3988 of Lecture Notes in Computer Science, pages 241–256. Springer, 2006.

[96] Stefan Kratsch and Pascal Schweitzer. Graph isomorphism for graph classes characterized by two forbidden induced subgraphs. Discrete Applied Mathematics, 216:240–253, 2017.

[97] Nils M. Kriege, Fredrik D. Johansson, and Christopher Morris. A survey on graph kernels. CoRR, abs/1903.11835, 2019.

[98] Luděk Kučera. Canonical labeling of regular graphs in linear average time. In 28th Annual Symposium on Foundations of Computer Science, Los Angeles, California, USA, 27-29 October 1987, pages 271–279. IEEE Computer Society, 1987.

[99] Hanns-Georg Leimer. Optimal decomposition by clique separators. Discrete Mathematics, 113(1-3):99–123, 1993.

[100] Martin W. Liebeck. On minimal degrees and base sizes of primitive permutation groups. Arch. Math. (Basel), 43(1):11–15, 1984.

[101] Martin W. Liebeck and Aner Shalev. Simple groups, permutation groups, and probability. J. Amer. Math. Soc., 12(2):497–520, 1999.

[102] Martin W. Liebeck and Aner Shalev. Bases of primitive linear groups. J. Algebra, 252(1):95–113, 2002.

[103] Martin W. Liebeck and Aner Shalev. Bases of primitive linear groups II. J. Algebra, 403:223–228, 2014.

[104] Daniel Lokshtanov, Marcin Pilipczuk, Michal Pilipczuk, and Saket Saurabh. Fixed-parameter tractable canonization and isomorphism test for graphs of bounded treewidth. SIAM J. Comput., 46(1):161–189, 2017.

[105] José Luis López-Presa, Antonio Fernández Anta, and Luis Núñez Chiroque. Conauto-2.0: Fast isomorphism testing and automorphism group computation. CoRR, abs/1108.1060, 2011.

[106] Eugene M. Luks. Isomorphism of graphs of bounded valence can be tested in polynomial time. J. Comput. Syst. Sci., 25(1):42–65, 1982.

[107] Eugene M. Luks. Computing the composition factors of a permutation group in polynomial time. Combinatorica, 7(1):87–99, 1987.

[108] Eugene M. Luks. Permutation groups and polynomial-time computation. In Groups and computation (New Brunswick, NJ, 1991), volume 11 of DIMACS Ser. Discrete Math. Theoret. Comput. Sci., pages 139–175. Amer. Math. Soc., Providence, RI, 1993.

[109] Attila Maróti. On the orders of primitive groups. J. Algebra, 258(2):631–640, 2002.

[110] Rudolf Mathon. A note on the graph isomorphism counting problem. Inf. Process. Lett., 8(3):131–132, 1979.

[111] Annabelle McIver and Peter M. Neumann. Enumerating finite groups. Quart. J. Math. Oxford Ser. (2), 38(152):473–488, 1987.

[112] Brendan D. McKay. Practical graph isomorphism. Congr. Numer., 30:45–87, 1981.

[113] Brendan D. McKay and Adolfo Piperno. Practical graph isomorphism, II. J. Symb. Comput., 60:94–112, 2014.

[114] Ulrich Meierfrankenfeld. Non-finitary locally finite simple groups. In Finite and locally finite groups (Istanbul, 1994), volume 471 of NATO Adv. Sci. Inst. Ser. C Math. Phys. Sci., pages 189–212. Kluwer Acad. Publ., Dordrecht, 1995.

[115] Gary L. Miller. Graph isomorphism, general remarks. J. Comput. Syst. Sci., 18(2):128–142, 1979.

[116] Gary L. Miller. Isomorphism testing for graphs of bounded genus. In Raymond E. Miller, Seymour Ginsburg, Walter A. Burkhard, and Richard J. Lipton, editors, Proceedings of the 12th Annual ACM Symposium on Theory of Computing, April 28-30, 1980, Los Angeles, California, USA, pages 225–235. ACM, 1980.

[117] Gary L. Miller. Isomorphism of graphs which are pairwise k-separable. Information and Control, 56(1/2):21–33, 1983.

[118] Takunari Miyazaki. The complexity of McKay’s canonical labeling algorithm. In Larry Finkelstein and William M. Kantor, editors, Groups and Computation, Proceedings of a DIMACS Workshop, New Brunswick, New Jersey, USA, June 7-10, 1995, volume 28 of DIMACS Series in Discrete Mathematics and Theoretical Computer Science, pages 239–256. DIMACS/AMS, 1995.

[119] Christopher Morris and Petra Mutzel. Towards a practical k-dimensional Weisfeiler-Leman algorithm. CoRR, abs/1904.01543, 2019.

[120] Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. Weisfeiler and Leman go neural: Higher-order graph neural networks. In The Thirty-Third AAAI Conference on Artificial Intelligence, AAAI 2019, The Thirty-First Innovative Applications of Artificial Intelligence Conference, IAAI 2019, The Ninth AAAI Symposium on Educational Advances in Artificial Intelligence, EAAI 2019, Honolulu, Hawaii, USA, January 27 - February 1, 2019, pages 4602–4609. AAAI Press, 2019.

[121] Rajeev Motwani and Prabhakar Raghavan. Randomized Algorithms. Cambridge University Press, 1995.

[122] Daniel Neuen. Graph isomorphism for unit square graphs. In Piotr Sankowski and Christos D. Zaroliagis, editors, 24th Annual European Symposium on Algorithms, ESA 2016, August 22-24, 2016, Aarhus, Denmark, volume 57 of LIPIcs, pages 70:1–70:17. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2016.

[123] Daniel Neuen and Pascal Schweitzer. Benchmark graphs for practical graph isomorphism. In Kirk Pruhs and Christian Sohler, editors, 25th Annual European Symposium on Algorithms, ESA 2017, September 4-6, 2017, Vienna, Austria, volume 87 of LIPIcs, pages 60:1–60:14. Schloss Dagstuhl - Leibniz-Zentrum fuer Informatik, 2017.

[124] Daniel Neuen and Pascal Schweitzer. An exponential lower bound for individualization-refinement algorithms for graph isomorphism. In Ilias Diakonikolas, David Kempe, and Monika Henzinger, editors, Proceedings of the 50th Annual ACM SIGACT Symposium on Theory of Computing, STOC 2018, Los Angeles, CA, USA, June 25-29, 2018, pages 138–150. ACM, 2018.

[125] Giannis Nikolentzos, Giannis Siglidis, and Michalis Vazirgiannis. Graph kernels: A survey. CoRR, abs/1904.12218, 2019.

[126] Ryan O’Donnell, John Wright, Chenggang Wu, and Yuan Zhou. Hardness of robust graph isomorphism, Lasserre gaps, and asymmetry of random graphs. In Chandra Chekuri, editor, Proceedings of the Twenty-Fifth Annual ACM-SIAM Symposium on Discrete Algorithms, SODA 2014, Portland, Oregon, USA, January 5-7, 2014, pages 1659–1677. SIAM, 2014.

[127] Yota Otachi and Pascal Schweitzer. Isomorphism on subgraph-closed graph classes: A complexity dichotomy and intermediate graph classes. In Leizhen Cai, Siu-Wing Cheng, and Tak Wah Lam, editors, Algorithms and Computation - 24th International Symposium, ISAAC 2013, Hong Kong, China, December 16-18, 2013, Proceedings, volume 8283 of Lecture Notes in Computer Science, pages 111–118. Springer, 2013.

[128] Yota Otachi and Pascal Schweitzer. Reduction techniques for graph isomorphism in the context of width parameters. In R. Ravi and Inge Li Gørtz, editors, Algorithm Theory - SWAT 2014 - 14th Scandinavian Symposium and Workshops, Copenhagen, Denmark, July 2-4, 2014. Proceedings, volume 8503 of Lecture Notes in Computer Science, pages 368–379. Springer, 2014.

[129] Sang-il Oum. Rank-width is less than or equal to branch-width. Journal of Graph Theory, 57(3):239–244, 2008.

[130] Sang-il Oum and Paul D. Seymour. Approximating clique-width and branch-width. J. Comb. Theory, Ser. B, 96(4):514–528, 2006.

[131] Ilia N. Ponomarenko. The isomorphism problem for classes of graphs closed under contraction. Journal of Soviet Mathematics, 55(2):1621–1643, Jun 1991.

[132] László Pyber. A CFSG-free analysis of Babai’s quasipolynomial GI-algorithm. CoRR, abs/1605.08266, 2016.

[133] Ronald C. Read and Derek G. Corneil. The graph isomorphism disease. Journal of Graph Theory, 1:339–363, 1977.

[134] Joachim Redies. Defining PTIME problems on planar graphs with few variables. Master’s thesis, RWTH Aachen University, 2014.

[135] David J. Rosenbaum. Bidirectional collision detection and faster deterministic isomorphism testing. CoRR, abs/1304.3935, 2013.

[136] Joseph J. Rotman. An Introduction to the Theory of Groups, volume 148 of Graduate Texts in Mathematics. Springer-Verlag, New York, fourth edition, 1995.

[137] Pascal Schweitzer and Daniel Wiebking. A unifying method for the design of algorithms canonizing combinatorial objects. In Moses Charikar and Edith Cohen, editors, Proceedings of the 51st Annual ACM SIGACT Symposium on Theory of Computing, STOC 2019, Phoenix, AZ, USA, June 23-26, 2019, pages 1247–1258. ACM, 2019.

[138] Ákos Seress. Permutation Group Algorithms, volume 152 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, 2003.

[139] Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. Weisfeiler-Lehman graph kernels. Journal of Machine Learning Research, 12:2539–2561, 2011.

[140] Aaron Snook, Grant Schoenebeck, and Paolo Codenotti. Graph isomorphism and the Lasserre hierarchy. CoRR, abs/1401.0758, 2014.

[141] Daniel A. Spielman. Faster isomorphism testing of strongly regular graphs. In Gary L. Miller, editor, Proceedings of the Twenty-Eighth Annual ACM Symposium on the Theory of Computing, Philadelphia, Pennsylvania, USA, May 22-24, 1996, pages 576–584. ACM, 1996.

[142] Xiaorui Sun and John Wilmes. Faster canonical forms for primitive coherent configurations: Extended abstract. In Rocco A. Servedio and Ronitt Rubinfeld, editors, Proceedings of the Forty-Seventh Annual ACM on Symposium on Theory of Computing, STOC 2015, Portland, OR, USA, June 14-17, 2015, pages 693–702. ACM, 2015.

[143] Xiaorui Sun and John Wilmes. Structure and automorphisms of primitive coherent configurations. CoRR, abs/1510.02195, 2015.

[144] Seinosuke Toda. PP is as hard as the polynomial-time hierarchy. SIAM J. Comput., 20(5):865–877, 1991.

[145] Jacobo Torán. On the hardness of graph isomorphism. SIAM J. Comput., 33(5):1093–1108, 2004.

[146] Salil P. Vadhan. Pseudorandomness. Foundations and Trends in Theoretical Computer Science, 7(1-3):1–336, 2012.

[147] Steffen van Bergerem. Learning concepts definable in first-order logic with counting. In 34th Annual ACM/IEEE Symposium on Logic in Computer Science, LICS 2019, Vancouver, BC, Canada, June 24-27, 2019, pages 1–13. IEEE, 2019.

[148] Boris Weisfeiler. On Construction and Identification of Graphs, volume 558 of Lecture Notes in Mathematics. Springer-Verlag, 1976.

[149] Boris Weisfeiler and Andrei Leman. The reduction of a graph to canonical form and the algebra which appears therein. NTI, Series 2, 1968. English translation by G. Ryabov available at https://www.iti.zcu.cz/wl2018/pdf/wl_paper_translation.pdf.

[150] Daniel Wiebking. Graph isomorphism in quasipolynomial time parameterized by treewidth. CoRR, abs/1911.11257, 2019.

[151] Viktor N. Zemlyachenko, Nikolai M. Korneenko, and Regina I. Tyshkevich. The graph isomorphism problem. Zap. Nauchn. Sem. Leningrad. Otdel. Mat. Inst. Steklov. (LOMI), 118:83–158, 215, 1982. The theory of the complexity of computations, I.