Orbit-Finite Sets
Total Page:16
File Type:pdf, Size:1020Kb
Slightly Infinite Sets Mikołaj Bojanczyk´ September 11, 2019 The latest version can be downloaded from: mimuw.edu.pl/∼bojan/paper/atom-book Contents Preface page v Part I Automata for data words 1 1 Register automata 4 1.1 Nondeterministic register automata 5 1.2 Emptiness and universality for register automata 10 1.3 Alternating register automata 13 1.4 Most models of register automata are inequivalent 22 2 Two variable logic and data automata 25 2.1 Data automata 25 2.2 Two-variable first-order logic on data words 30 Part II Orbit-finite sets 43 3 Sets with atoms and orbit-finiteness 46 3.1 Sets with atoms 47 3.2 Orbit-finiteness 52 4 Representing orbit-finite sets 64 4.1 Set builder expressions 65 4.2 Hereditarily orbit-finite sets 73 5 Case studies 79 5.1 Graph reachability 79 5.2 Orbit-finite automata 83 5.3 Pushdown automata and context-free grammars 91 5.4 Graph homomorphisms 96 iii iv Contents 5.5 Systems of equations 101 6 Least supports 104 6.1 Least supports 104 6.2 Extended example: deterministic automata 109 7 Homogeneous atoms 113 7.1 Homogeneous structures 113 7.2 The Fra¨ısse´ limit 117 7.3 Examples of homogeneous atoms 127 7.3.1 The random graph 127 7.3.2 Bit vectors 130 7.3.3 Trees and forests 136 Part III Computation with atoms 139 8 Computable functions on sets with atoms 142 8.1 While programs with atoms 143 8.2 Computational completeness of while programs 151 9 Fixed dimension polynomial time 160 9.1 Fixed dimension polynomial time on set builder expressions 161 9.2 A semantic version 172 10 Turing machines 180 10.1 Orbit-finite Turing machines 181 10.2 For bit vector atoms, P , NP 190 10.3 For equality atoms, Turing machines cannot be deter- minised 194 Part IV Solutions to the exercises 203 Bibliography 263 Bibliography 263 Author index 269 Subject index 270 Preface This book is about algorithms that run on objects that are infinite, but finite up to certain symmetries. Under a suitably chosen notion of symmetry, such ob- jects – called orbit-finite sets – can be represented, searched and processed just like finite sets. The goal of the book is to explain orbit-finiteness and demon- strate its usefulness. Most of the examples of orbit-finite sets are taken from automata theory, since this is where orbit-finite sets began. v PARTONE AUTOMATA FOR DATA WORDS 3 We begin with an investigation of concrete automata models for words over infinite alphabets. The goal of this part is to build intuitions for the more ab- stract models that will be presented in the later parts. 1 Register automata A data word is a word where each letter carries two pieces of information: a label from a finite set, and a data value from an infinite set. Here is a picture: labels from {a, b} a a b a b b a b b b a a 1 1 2 4 2 5 3 7 1 8 2 9 data values from {1, 2, 3,...} For the rest of Part I, fix a countably infinite set A. Elements of this set, called the atoms, will be used for the data values. Formally, a data word over a finite set of labels Σ is defined to be a word in w 2 Σ × ∗: |{z} |{z}A label data value When describing properties of data words, we will be able to test the labels explicitly by asking questions like does the second letter have a 2 Σ as its label? but we will only test the data values for equality, e.g. ask do the third and fifth letters have the same data value? Later in the book, we will formalise what it means to only test data values for equality, but for now the intuitive understanding should be enough. ∗ Example 1.1. By abuse of notation, we assume that a word in A is also a data word, which uses no labels. Here are examples of languages of data words, in all of these examples we use no labels: 4 1.1 Nondeterministic register automata 5 (1) the first data value is the same as the last data value; (2) some data value appears twice; (3) no data value appears twice; (4) the first data value appears again; (5) every three consecutive data values are pairwise distinct. We will introduce automata models for data words that recognise the above languages. These models use registers to talk about data values. 1.1 Nondeterministic register automata We begin our discussion with some of the simplest automaton models for data words, namely nondeterministic and deterministic register automata1. Definition 1.2 (Nondeterministic register automaton). The syntax of a nonde- terministic register automaton consists of: • a finite set Σ of labels; • a finite set Loc of locations2; • a finite set R of register names; • an initial location `0 2 Loc and a set of accepting locations F ⊆ Loc; • a transition relation register valuations, i.e. partial functions from registers to atoms z }| { R R δ ⊆ Loc × (A [ f?g) × Σ × A × Loc × (A [ f?g) (1.1) | {z } |{z} | {z } states input states subject to an equivariance condition described below. The automaton is used to accept or reject data words with labels Σ, i.e. words where each position is labelled by Σ × A. After processing part of the input, the automaton keeps track of a state, which is defined to be a location plus a register valuation. Initially, the state consists of the initial location and a com- pletely undefined register valuation. For each input letter, the state is updated according to the transition relation δ, and the automaton accepts if at the end 1 Register automata where introduced in Kaminski and Francez (1994), under the name of finite memory automata, together with a decidability proof for the emptiness problems in the deterministic and nondeterministic one-way cases (Theorem 1.7 in this text). The presentation using syntactic and semantics equivariance, in particular Lemma 1.3, is essentially due to Bojanczyk´ (2013); Bojanczyk´ et al. (2014). 2 We use the name location instead of state because the state of the automaton will store additional information, namely the contents of the registers. 6 Register automata of the input word the state is accepting, in the sense that the location belongs to the accepting set. How is the transition relation described? Since the state space is infinite, some restrictions on the transition relation are needed to represent it in a finite way. We choose the following restriction, called equivariance: the transition relation can only compare atoms with respect to equality, and is not allowed to depend on any specific atoms. Equivariance can be formalised in two different ways, as described below. Semantic equivariance. A permutation π : A ! A of the atoms (i.e. a bi- jection from the atoms to themselves) can be applied to states in the natural way, and therefore also to triples in the transition relation δ (the locations and undefined values are not affected, only the atoms). Here is a picture: input letter with red label and atom 1 1 2, ⊥ 1, 2 π(1) = 5 π(2) = 3 state with orange location, π atom 2 in the first register, and undefined second register 5 3, ⊥ 5, 3 We say that δ is semantically equivariant if the set of transitions is invariant under actions of atom permutations, i.e. π(t) 2 δ for every t 2 δ and every permutation π : A ! A: The advantage of semantic equivariance is that the definition is short, and easy to generalise to other models like alternating automata or pushdown automata. The disadvantage is that it is not clear how to represent a semantically equiv- ariant transition relation, e.g. for the input of a nonemptiness algorithm. The converse situation holds for syntactic equivariance, as presented below. Syntactic equivariance. We say that δ is syntactically equivariant if it can be defined by a finite boolean combination of constraints of the following types: (1) the location in the source / target state is ` 2 Loc; (2) the label in the input letter is a 2 Σ; (3) register r 2 R is undefined in the source / target state; (4) the atom in the input letter is the same as in register r 2 R of the source / target state; 1.1 Nondeterministic register automata 7 (5) register r 2 R of the source / target state stores the same atom as register s 2 R of the source / target state. In the above, source / target means that the constraint can be instantiated with either “source” or “target”. For example, in item (5), there are four possibilities regarding the choice of source vs target, since the choice is taken independently for r and s. Lemma 1.3. Semantic and syntactic equivariance are the same. Proof It is not difficult to see that semantically equivariant subsets of the set (1.1) are closed under boolean combinations. Since the bijections of data values do not affect satisfaction of the constraints 1-5 used in the definition of syntactic equivariance, it follows that syntactic equivariance implies semantic equivariance. We now show that semantic implies syntactic. Define an orbit of transitions to be a set of the form fπ(t): π is a permutation of the atomsg for some transition t 2 δ. Here is a picture of an orbit of transitions: 1 2, ⊥ 1, 2 2 3, ⊥ 2, 3 5 1, ⊥ 5, 1 4 3, ⊥ 4, 3 ..