1

Finite :

First-Order on the of Finite Models

Anuj Dawar University of Cambridge

Modnet Tutorial, La Roche, 21 April 2008

Anuj Dawar February 2008 2

Finite Model Theory

In the 1980s, the term finite model theory came to be used to describe the study of the expressive power of (from first-order to second-order logic and in between), on the class of all finite structures.

The motivation for the study is that problems in computer science (especially in complexity theory and database theory) are naturally expressed as questions about the expressive power of logics.

And, the structures involved in computation are finite.

Anuj Dawar February 2008 3

Model Theoretic Questions

The kind of questions we are interested in are about the expressive power of logics. Given a formula ϕ, its class of models is the collection of finite relational structures A in which it is true.

Mod(ϕ)= {A | A |= ϕ}

What classes of structures are definable in a given logic L?

How do syntactic restrictions on ϕ relate to semantic restrictions on Mod(ϕ)?

How does the computational complexity of Mod(ϕ) relate to the syntactic complexity of ϕ?

Anuj Dawar February 2008 4

Descriptive Complexity

A class of finite structures is definable in existential second-order logic if, and only if, it is decidable in NP. (Fagin)

A closs of ordered finite structures is definable in least fixed-point logic if, and only if, it is decidable in P. (Immerman; Vardi)

Open Question: Is there a logic that captures P without order?

Can model-theoretic methods cast light on questions of computational complexity?

Anuj Dawar February 2008 5

Compactness

The Compactness fails if we restrict ourselves to finite structures.

Let λn be the first order sentence.

λ = ∃x ... ∃x (x 6= x ) n 1 n ^ i j 1≤i

Then Λ= {λn | n ∈ ω} is a of sentences such that:

• every finite of Λ has a finite model

• Λ does not have a finite model.

Anuj Dawar February 2008 6

Completeness

Abstract Completeness Theorem The set of valid first order sentences is recursively enumerable.

Define the following sets:

Val = {ϕ | ϕ is valid on finite structures} Sat = {ϕ | ϕ is satisfiable in a finite structure}

then, clearly Sat is recursively enumerable, and Val is r.e. if, and only if, Sat is decidable.

Theorem (Trakhtenbrot 1950) The set of finitely satisfiable sentences is not decidable.

Anuj Dawar February 2008 7

Trakhtenbrot’s Theorem

The proof is by a reduction from the Halting problem.

Given a Turing machine M , we construct a first order sentence ϕM such that A |= ϕM if, and only if,

• there is a discrete linear order on the universe of A with minimal and maximal elements

• each of A (along with appropriate relations) encodes a configuration of the machine M

• the minimal element encodes the starting configuration of M on empty input

• for each element a of A the configuration encoded by its successor is the configuration obtained by M in one step starting from the configuration in a

• the configuration encoded by the maximal element of A is a halting configuration.

Anuj Dawar February 2008 8

Preservation

Preservation theorems for first-order logic provide a correspondence between syntactic and semantic restrictions.

A sentence ϕ is equivalent to an existential sentence if, and only if, the models of ϕ are closed under extensions. Łos-Tarski´

A sentence ϕ is equivalent to one that is positive in the relation R if, and only if, it is monotone in the relation R. Lyndon.

Anuj Dawar February 2008 9

Proving Preservation

In each of the cases, it is trivial to see that the syntactic restriction implies the semantic restriction.

The other direction, of expressive completeness, is usually proved using compactness.

For example, if ϕ is closed under extensions:

Take Φ to be the existential consequences of ϕ and show Φ |= ϕ by:

A |= Φ ∪ {ϕ}  A∗ ∩ B |= Φ ∪ {¬ϕ}  B∗

Anuj Dawar February 2008 10

Relativised Preservation

We are interested in relativisations of expressive completeness to classes of structures C:

If ϕ satisfies the semantic condition restricted to C, it is equivalent (on C) to a sentence in the restricted syntactic form.

If C satisfies compactness, then the preservation property necessarily holds in C.

Restricting the class C in this statement weakens both the hypothesis and the conclusion.

Both Łos-Tarski´ and Lyndon are known to fail when C is the class of all finite structures.

Anuj Dawar February 2008 11

Preservation under Extensions in the Finite

(Tait 1959) showed that there is a ϕ preserved under extensions on finite structures, but not equivalent to an existential sentence.

• Either ≤ is not a linear order;

• or R(x, z) for some x, y, z with x

• or R contains a cycle.

For any existential sentence whose finite models include all of the above, we can find a model that does not satisfy these conditions.

Anuj Dawar February 2008 12

Tools for Finite Model Theory

Besides compactness, completeness and preservation theorems, there are also examples showing that the finitary analogues of Craig Interpolation Theorem and the Beth Definability Theorem fail.

It seems that the class of finite structures is not well-behaved for the study of definability.

What tools and methods are available to study the expressive power of logic in the finite?

• Ehrenfeucht-Fra¨‘isse´ Games;

• Locality Theorems.

• Complexity

Anuj Dawar February 2008 13

Elementary Equivalence

On finite structures, the elementary equivalence relation is trivial:

A ≡ B if, and only if, A =∼ B

Given a structure A with n elements, we construct a sentence

ϕA = ∃x ... ∃x ψ ∧∀y y = x 1 n _ i 1≤i≤n

where, ψ(x1,...,xn) is the conjunction of all atomic and negated atomic formulas that hold in A.

Anuj Dawar February 2008 14

Theories vs. Sentences

First order logic can make all the distinctions that are there to be made between finite structures.

Any isomorphism closed class of finite structures S can be defined by a first-order theory: {¬ϕA | A 6∈ S}.

To understand the limits on the expressive power of first-order sentences, we need to consider coarser equivalence relations than ≡.

Anuj Dawar February 2008 15

Quantifier Rank

The quantifier rank of a formula ϕ, written qr(ϕ) is defined inductively as follows:

1. if ϕ is atomic then qr(ϕ) = 0,

2. if ϕ = ¬ψ then qr(ϕ)= qr(ψ),

3. if ϕ = ψ1 ∨ ψ2 or ϕ = ψ1 ∧ ψ2 then

qr(ϕ) = max(qr(ψ1), qr(ψ2)).

4. if ϕ = ∃xψ or ϕ = ∀xψ then qr(ϕ) = qr(ψ) + 1

Note: For the rest of this lecture, we assume that our signature consists only of relation and constant symbols.

With this proviso, it is easily proved that in a finite vocabulary, for each q, there are (up to logical equivalence) only finitely many sentences ϕ with qr(ϕ) ≤ q.

Anuj Dawar February 2008 16

Finitary Elementary Equivalence

For two structures A and B, we say A ≡p B if for any sentence ϕ with qr(ϕ) ≤ p, A |= ϕ if, and only if, B |= ϕ.

Key fact:

a class of structures S is definable by a first order sentence if, and only if,

S is closed under the relation ≡p for some p.

The equivalence relations ≡p can be characterised in terms of sequences of partial isomorphisms (Fra¨ısse´ 1954)

or two player games. (Ehrenfeucht 1961)

Anuj Dawar February 2008 17

Ehrenfeucht-Fra¨ısse´ Game

The p-round Ehrenfeucht game on structures A and B proceeds as follows:

• There are two players called Spoiler and Duplicator.

• At the ith round, Spoiler chooses one of the structures (say B) and one of the

elements of that structure (say bi).

• Duplicator must respond with an element of the other structure (say ai).

• If, after p rounds, the ai 7→ bi is a partial isomorphism, then Duplicator has won the game, otherwise Spoiler has won.

Theorem (Fra¨ısse´ 1954; Ehrenfeucht 1961) Duplicator has a strategy for winning the p-round Ehrenfeucht game on A and B

if, and only if, A ≡p B.

Anuj Dawar February 2008 18

Using Games

To show that a class of structures S is not definable in FO, we find, for every p,a

pair of structures Ap and Bp such that

• Ap ∈ S, Bp ∈ S; and

• Duplicator wins a p round game on Ap and Bp.

Example:

Cn—a cycle of length n.

Duplicator wins the p round game on C2p and C2p+1.

• 2-Colourability is not definable in FO.

• Even cardinality is not definable in FO.

Anuj Dawar February 2008 19

Linear Orders

Example:

Ln—a linear order of length n. for m, n ≥ 2p − 1,

Lm ≡p Ln

Duplicator’s strategy is to maintain the following condition after r rounds of the game:

for 1 ≤ i

• either length(ai, aj )= length(bi,bj)

p−r • or length(ai, aj), length(bi,bj) ≥ 2 − 1.

Evenness is not first order definable, even on linear orders.

The only first order definable sets of linear orders are the finite or co-finite ones.

Anuj Dawar February 2008 20

Connectivity

Consider the signature (E,<). and structures G =(V,E,<) in which E is a graph relation (i.e., an irreflexive, symmetric relation) and < is a linear order.

There is no first order sentence γ in this signature such that

G |= γ if, and only if, (V, E) is connected.

Note: The compactness-based that connectivity is undefinable leaves open the possibility that there is a sentence whose finite models are exactly the connected graphs. The above statement strengthens the argument in two ways.

Anuj Dawar February 2008 21

Connectivity

Suppose there was such a formula γ.

Let γ′ be the formula obtained by replacing every occurrence of E(x, y) in γ by the following y = x + 2∨ (x = max ∧y = min+1)∨ (y = min ∧x = max −1).

Then, ¬γ′ defines evenness on linear orders.

The above formula interprets a graph in the linear order that is connected if, and only if, the order is odd.

Anuj Dawar February 2008 22

Gaifman Graphs and Neighbourhoods

On a structure A, define the binary relation:

E(a1, a2) if, and only if, there is some relation R and some tuple a

containing both a1 and a2 with R(a).

The graph GA =(A, E) is called the Gaifman graph of A.

dist(a, b) — the distance between a and b in the graph (A, E).

A A Nbdr (a) — the substructure of given by the set:

{b | dist(a, b) ≤ r}

Anuj Dawar February 2008 23

Hanf Locality Theorem

We say A and B are Hanf equivalent with radius r and threshold q (A ≃r,q B) if, for every a ∈ A the two sets

′ A ∼ A ′ A ∼ B {a ∈ a | Nbdr (a) = Nbdr (a )} and {b ∈ B | Nbdr (a) = Nbdr (b)}

either have the same size or both have size greater than q;

and, similarly for every b ∈ B.

Theorem (Hanf) For every vocabulary σ and every p there are r ≤ 3p and q ≤ p such that for

any σ-structures A and B: if A ≃r,q B then A ≡p B.

p In other words, if r ≥ 3 , the equivalence relation ≃r,p is a refinement of ≡p.

Anuj Dawar February 2008 24

Hanf Locality

Duplicator’s strategy is to maintain the following condition:

After k moves, if a1,...,ak and b1,...,bk have been selected, then A B Nbd p−k (ai) =∼ Nbd p−k (bi) [ 3 [ 3 i i

If Spoiler plays on a within distance 2 · 3p−k−1 of a previously chosen point, play according to the isomorphism, otherwise, find b such that ∼ Nbd3p−k−1 (a) = Nbd3p−k−1 (b)

and b is not within distance 2 · 3p−k−1 of a previously chosen point.

Such a b is guaranteed by ≃r,p.

Anuj Dawar February 2008 25

Application

Hanf’s Locality Theorem can be used to show that graph connectivity is not definable by any sentence of existential monadic second-order logic.

That is, any sentence

∃S1,...,Smθ

where S1,...,Sm are set variables and θ is a first-order sentence.

Idea: For n sufficiently large, take

• C2n—a cycle of length 2n; and

• Cn ⊕ Cn the disjoint union of two cycles of length n.

For any colouring of C2n, we can find a colouring of Cn ⊕ Cn, so that the resulting coloured graphs are ≃r,p equivalent for arbitrary p.

Anuj Dawar February 2008 26

Gaifman’s Theorem

We write δ(x, y) >d for the formula of FO that says that the distance between x and y is greater than d.

We write ψN (x) to denote the formula obtained from ψ(x) by relativising all quantifiers to the set N .

A basic local sentence is a sentence of the form

 x  ∃x ···∃x δ(x , x ) > 2r ∧ ψNbdr ( i)(x ) 1 s ^ i j ^ i i=6 j i 

Theorem (Gaifman) Every first-order sentence is equivalent to a Boolean combination of basic local sentences.

Anuj Dawar February 2008 27

Complexity of First-Order Logic

Can we put bounds on the computational complexity of the class Mod(ϕ) for a first-order sentence ϕ.

What can we say about the complexity of the decision problem:

Given: a first-order formula ϕ and a structure A Decide: if A |= ϕ

Or, what is the complexity of the satisfaction relation for first-order logic?

This is usually called the model-checking problem for FO.

Anuj Dawar February 2008 28

Na¨ıve Algorithm

The straightforward algorithm proceeds recursively on the structure of ϕ:

• Atomic formulas by direct lookup.

• Boolean connectives are easy.

• If ϕ ≡∃x ψ then for each a ∈ A check whether

(A,c 7→ a) |= ψ[c/x],

where c is a new constant symbol.

This shows that the model-checking problem can be solved in time O(lnm) and O(m log n) space, where n is the size of A, l is the length of ϕ and m is the quantifier rank of ϕ (or by a more careful accounting, the number of distinct variables occurring in ϕ).

Anuj Dawar February 2008 29

Complexity

This shows that the model checking problem is in PSpace and for a fixed sentence ϕ, the problem of deciding membership in the class

Mod(ϕ)= {A | A |= ϕ}

is in logarithmic space and polynomial time.

QBF—satisfiability of quantified Boolean formulas can be easily reduced to the model checking problem with A a fixed two-element structure.

Thus, the problem is PSpace-complete, even for fixed A.

Anuj Dawar February 2008 30

Directions

• Consider richer logics than FO to be able to express more complex classes of structures. Manchester tutorial.

• Consider restricted classes of structures so that first-order satisfaction becomes tractable. Kreutzer talk.

• Is FO better-behaved on restricted classes of structures? Second talk.

Anuj Dawar February 2008