Expressive Power of Graph Languages
Total Page:16
File Type:pdf, Size:1020Kb
Expressive Power of Graph Languages Timos Antonopoulos University of Cambridge Computer Laboratory St. Catharine’s College This dissertation is submitted for the degree of Doctor of Philosophy Abstract Finite model theory is a field of logic that originated from computer science and studies properties of finite logical structures. Algorithmic properties of problems are interpreted over structures and inspected using logical languages, and thus connections are made between the latter and computational models. The expressive power of such logical languages is examined and matched with classes of problems and algorithmic classifications. In this thesis, we look into the expressive power of Graph Logic and also its extension with a recursion operator, that was introduced by Cardelli, Gardner and Ghelli, and later systematically investigated by Dawar, Gardner and Ghelli. Graph Logic was introduced as a query language on labelled directed graphs and is an extension of first-order logic with a spatial connective that allows one to express that a graph can be decomposed into two subgraphs, and reason thereafter about the properties definable over the two subgraphs in isolation from each other. It was known that Graph Logic is contained in monadic second-order logic over graphs and we show that the containment is strict. Furthermore, we investigate the expressive power of the logic over restricted classes of graphs. In addition, the expressive power of Graph Logic with the recursion operator, which over many classes of graphs exceeds the one of monadic second-order logic, is studied and compared to the expressive power of other related languages. 2 Preface First of all, I would like to thank my supervisor Dr. Anuj Dawar, without whose continued support and guidance none of this work would have been possible. I am also very grateful for his encouragement and patience throughout my studies. I am especially thankful to Dr. Jerzy Marcinkowski for a series of valuable discussions and sharing his insight with me. I would also like to thank Prof. Wolfgang Thomas and Prof. Martin Lange for kindly discussing specific aspects relating to my work. Furthermore, I would like to acknowledge Dr. Philippa Gardner for her advice prior to my taking up my PhD studies. Finally, I would like to dedicate my thesis to my father, mother and sister, without whom I could have never accomplished any of this work. This dissertation is the result of my own work and includes nothing which is the outcome of work done in collaboration except where specifically indicated in the text. Parts of Chapter 2 and Chapter 3 are based on [AD09]. This dissertation does not exceed 60,000 words, including footnotes. 3 Contents Abstract 2 Preface 3 1 Introduction 6 1.1 Preliminaries . 8 1.1.1 Graphs . 9 1.1.2 Words . 11 1.1.3 Trees . 12 1.2 Monadic Second-Order Logic . 13 1.2.1 MS1 and MS2 .................................. 17 1.3 Word and Tree Automata . 19 1.4 Fixed-Point Logics . 20 1.5 Graph Logic . 21 1.5.1 GL Games . 24 1.6 Graph Logic with Recursion . 26 1.7 Contributions of this Thesis . 27 2 Graph Logic 29 2.1 On Graphs . 31 2.2 On Words . 32 2.3 On Trees . 37 2.3.1 Chain and Antichain Logic . 42 2.3.2 Tree Walking Automata . 50 3 GL(MSO on Forests 53 3.1 Main Theorem . 53 3.2 FO Interpretations and MSO Transductions . 68 3.3 Separation Logic . 73 4 4 Graph Logic with Recursion 78 4.1 On Graphs . 78 4.2 On Words . 79 4.2.1 PSPACE-complete problems on strings . 79 4.2.2 Conjunctive Grammars and their Boolean Closure . 85 4.3 On Trees . 96 5 Conclusion 104 5.1 GL . 104 5.2 GLµ ...........................................105 Bibliography 106 5 Chapter 1 Introduction A connection between Mathematical Logic and Automata Theory was first established by B¨uchi and Elgot’s result ([B¨uc60],[Elg61]) stating that the Monadic fragment of Second Order Logic (the one where second order variables range over subsets of the universe of the structure) can define exactly the same languages over finite words as the ones recognized by finite word au- tomata. This result was among the first that established a connection between a logic and a class of problems decidable by some theoretical machine model, such as automata or Turing machines, which is one of the fundamental issues of Finite Model Theory, along with the search for the exact expressive power of logical languages. Other similar connections made between a machine model and a logic include Fagin’s famous result showing that ∃SO logic, the existential fragment of Second-Order Logic, describes exactly the problems in NP ([Fag74]), as well as Immerman and Vardi’s result on LFP and P over ordered structures ([Imm86],[Var82]). It has been observed that Monadic Second-Order Logic (MSO), over some classes of graphs which include words and trees, has linear data complexity. In other words, evaluating a fixed MSO formula on some arbitrary structure contained in one of these classes, takes linear time with respect to the size of the structure. This is possible due to a translation of MSO formulae to finite automata. Due to this connection between MSO and finite automata, MSO has been the standard to which many other logics are usually compared, especially when it comes to structures such as words and trees. Graph Logic (GL), introduced by Cardelli et al. in [CGG02], is a spatial logic for querying graphs, with its characteristic operator being a spatial connective that allows for decomposing a graph into two edge-disjoint components, akin to how Separation Logic ([Rey02],[IO01]) is designed to work on memory heaps and other structures used for reasoning in programs with direct memory management through pointer variables. In [CG00], Cardelli et al. introduced Ambient Logic that allow for reasoning on the way the properties of processes change through time as well as spatially. In other words, through the logic, one can express what are the allowable changes on the spatial properties of a process, which are represented as a graph algebra, and for this, Ambient Logic makes use of a composition operator, similar to the one in GL. On the other hand, logics in finite model theory are traditionally used to reason on what is expressible over 6 CHAPTER 1. INTRODUCTION 7 certain classes of structures. The version of GL introduced in [CGG02] allows for expressing properties about graphs, and in addition includes operators allowing for querying graphs, and as a result creating new graphs from input graphs. Furthermore, one can also define transducers through the logic, which are mappings from graphs to graphs. A fragment of GL introduced in [CGG02], which includes only the connectives used for expressing properties of graphs, was further studied in [DGG07] in order to resolve questions regarding the expressiveness of the logic as well as the complexity of the graphs definable in it and the complexity of evaluating the formulae. Two versions where investigated, one equipped with a recursion operator similar to the one in [CGG02], and the other version being without it. The two versions were termed GLµ and GL respectively. The characteristic spatial connective in the version of Graph Logic considered in [DGG07], is a composition operator that allows for reasoning about disjoint subgraphs in a graph. It is essentially a restricted form of second-order quantification overs edges, where in particular, one can express that a graph can be decomposed into two edge disjoint subgraphs, and then reason in the same way about definable properties of the subgraphs in isolation from each other. This spatial connective can be simulated in monadic second-order logic with quantification over edges, as shown in [DGG07], and hence all properties definable in Graph Logic are definable in MSO as well. The opposite direction of the containment is the one of interest. Dawar et al. observed in [DGG07] that GL is of equal expressive power to MSO when restricted to the class of words, and hence captures the regular word languages. Despite this result, it was conjectured that it is strictly less expressive than MSO over graphs in general. In [Mar06], Marcinkowski showed that this is indeed the case for the GL version studied in [DGG07] and a corresponding extension of MSO that allows for quantification over a countable set of labels which label the edges of the structures. The two logics were termed GL+ and MSO+ in [Mar06], to avoid confusion with the version of GL and MSO that lacks this extra quantification over edge labels. The question whether GL is strictly less expressive than MSO was thus left open. Graph Logic with an additional operator for recursion (GLµ) was also introduced in [CGG02] and the expressive power of a version of it was studied further in [DGG07]. The latter operator increases the expressive power of GL, and in some classes of graphs such as trees and words, it is shown that it becomes more expressive than MSO. On the other hand, over graphs in general it is believed that there are MSO definable properties, such as 3-colourability, that are not definable in GLµ. Despite the fact that the expressive power is increased with the use of the recursion operator, and the fact that PSPACE-complete problems are expressible in GLµ, model checking as well as evaluating a fixed formula on an arbitrary graph, both remain in PSPACE. This is interesting since not many logics have data complexity and model-checking complexity that are both complete for the same complexity class. Least Fixed-Point Logic for example can only define problems in P but its model-checking complexity is EXPTIME-complete and it should be noted that model-checking of Graph Logic without the recursion operator is also PSPACE- complete.