
06-26264 Reasoning
The University of Birmingham, Spring Semester 2019
School of Computer Science
Volker Sorge
5 February, 2019
Handout 4

Summary of this handout: First-Order Logic — Syntax — Terms — Predicates — Quantifiers — Semantics — Substitution — Renaming — Ground Terms — Skolemisation — Clause Normal Form

VII. First-Order Logic

While propositional logic can express some basic facts we want to reason about, it is not rich enough to express many aspects of intelligent discourse or reasoning. We can talk about facts only very broadly, as we have no means of restricting what or whom these facts apply to. We can express “it is raining” with a propositional variable, but we cannot restrict it to a particular place, as in “it is raining in Birmingham” versus “it is not raining in London”. Moreover, we cannot talk about general or abstract facts. A classic example which we cannot yet express is:

Socrates is a man, all men are mortal, therefore Socrates is mortal.

In the following we will build the theoretical foundations for first-order predicate logic (FOL) that gives us a much more powerful instrument to reason.

VII.1 Syntax

Since the main purpose of first-order logic is to allow us to make statements about distinct entities in a particular domain or universe of discourse, we first need some notation to represent this.

29. Constants represent concrete individuals in our universe of discourse. Example: 0, 1, 2, 3 or “Birmingham, London” or “John, Paul, Mary” are constants.

30. Functions map individuals in the universe to other individuals, thereby relating them to each other. Example: father(John) = Paul expresses that Paul is John’s father. Functions have different arities, e.g., the sin function has arity one, + has arity two, and so on.

31. Variables are symbolic representations of an entity that is not (or not yet) determined. They are similar to the variables you know from programming languages or from mathematics, e.g., as in 2x² + x + 1 = 0.

With these three components we now formally define a notion of terms:

Definition 20 (Terms). Let V, C, F be disjoint sets of symbols, representing variables, constants and functions, respectively. Then we define terms inductively as follows:

1. Any variable in V is a term.

2. Any constant symbol in C is a term.

3. If t1, . . . , tn are terms and f ∈ F has arity n, then f(t1, . . . , tn) is a term.

4. Nothing else is a term.
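The inductive definition of terms translates directly into a recursive check. The following Python sketch is only an illustration: the tuple encoding `("var", name)`, `("const", name)`, `("fn", name, args)`, the concrete symbol sets `V`, `C`, `F` and the name `is_term` are assumptions made here and are not part of the handout.

```python
# Assumed encoding: ("var", name), ("const", name), ("fn", name, args).
V = {"x", "y", "z"}          # variable symbols (hypothetical choice)
C = {"c", "d"}               # constant symbols (hypothetical choice)
F = {"f": 1, "g": 2}         # function symbols mapped to their arities

def is_term(t):
    """Check the clauses of the inductive definition of terms."""
    if not isinstance(t, tuple) or not t:
        return False
    if t[0] == "var":                      # clause 1: variables
        return len(t) == 2 and t[1] in V
    if t[0] == "const":                    # clause 2: constants
        return len(t) == 2 and t[1] in C
    if t[0] == "fn":                       # clause 3: f(t1, ..., tn) with matching arity
        return (len(t) == 3 and t[1] in F and len(t[2]) == F[t[1]]
                and all(is_term(a) for a in t[2]))
    return False                           # clause 4: nothing else is a term

good = ("fn", "g", (("var", "x"), ("fn", "f", (("const", "c"),))))  # g(x, f(c))
bad = ("fn", "f", (("var", "x"), ("var", "y")))                     # f used with arity 2

print(is_term(good), is_term(bad))  # True False
```

The arity check in clause 3 is what rejects the second example, since `f` was declared with arity one.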

Often constants, functions and even variables have names that are in some way connected to their semantic meaning (e.g., we use digits for numbers or meaningful function names like sin). However, this is not strictly necessary, and in general we will avoid confusing syntax and semantics by using uniform naming conventions for

Constants to be c, d, e or c0, c1,...

Functions to be f, g, h or f0, f1,...

Variables to be x, y, z or x0, x1,...

32. Predicates In propositional logic the basic components were propositional variables that expressed statements that could be either true or false. In first-order logic this role is played by predicates, which are applied to terms and likewise yield statements that are either true or false. Similar to functions, predicates also have an arity, and we can view propositional variables as predicates of arity 0. For notation we will use capital Roman letters, similar to our propositional variables. Normally, we will use P, Q, R, . . . or P0, P1,.... We can then more formally define atomic formulae.

Definition 21 (Atomic Formulae). Let 𝒫 be a set of symbols representing predicates.

(i) Let P ∈ 𝒫 be a predicate taking n arguments and let t1, t2, . . . , tn be terms. Then P(t1, t2, . . . , tn) is an atomic formula or atom.

(ii) Nothing else is an atomic formula.

Example: We can now formalise that Socrates is mortal by M(s), where s denotes Socrates and M denotes that its argument is mortal.

33. Ground Terms and Atoms In the above definitions, both terms and hence atoms can contain variables. We say that a term is ground if it does not contain any variables; that is, it is generated from C and F only. Similarly, we say an atom (or later a formula) is ground if it contains only ground terms.

34. Free and bound variables So far, variables occur free in our terms and atoms; that is, they serve as placeholders for some term by which they can be substituted. We call a variable bound if it is either substituted by a specific value or restricted to a set of values. For example, the x in sin(x) is free, while in the following expression it is bound to a specific interval:

0 ≤ x ≤ π : sin(x) ≥ 0
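Whether a term is ground can be decided by collecting the variables that occur in it. The sketch below is a minimal illustration under an assumed tuple encoding of terms, `("var", name)`, `("const", name)`, `("fn", name, args)`; the function names `variables` and `is_ground` are hypothetical.

```python
def variables(t):
    """Return the set of variable names occurring in a term."""
    if t[0] == "var":
        return {t[1]}
    if t[0] == "const":
        return set()
    # Function application: collect the variables of all arguments.
    return set().union(*(variables(a) for a in t[2]))

def is_ground(t):
    """A term is ground if it contains no variables at all."""
    return not variables(t)

sin_x = ("fn", "sin", (("var", "x"),))      # sin(x), x occurs free
f_c = ("fn", "f", (("const", "c"),))        # f(c), built from C and F only

print(variables(sin_x), is_ground(sin_x))   # {'x'} False
print(variables(f_c), is_ground(f_c))       # set() True
```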

35. Quantifiers While one can think of different ways to bind variables to sets of values, we are primarily interested in two: binding a variable to all possible values or to at least one value. As an example consider the sentences “for all x, x² ≥ 0” and “there exists x, x² = 1”. More formally we will use the quantifiers ∀ (for all) and ∃ (there exists):

∀x. x² ≥ 0        ∃x. x² = 1

Observe that in the quantifier notation we use the dot as a shorthand for parentheses. That is, we could have also written ∀x(x² ≥ 0) and ∃x(x² = 1). But since in our notation quantifiers will only ever be applied from the left, this notation will save us a lot of writing (and counting) of parentheses.

36. Scope of a quantifier is how far the “influence” of a quantifier reaches in our formula. This is given by the dot and parentheses notation. This point is best illustrated with an example:

∀x. (∃y. P(x, y)) ∨ ∀y. Q(y, x) ∨ ∀z. R(x, z)

• scope of ∀x: (∃y. P(x, y)) ∨ ∀y. Q(y, x) ∨ ∀z. R(x, z)

• scope of ∃y: P(x, y)

• scope of ∀y: Q(y, x) ∨ ∀z. R(x, z)

• scope of ∀z: R(x, z)

We can now finalise the definition of the full syntax of first-order logic.

Definition 22 (Well Formed Formulae). We define the well formed formulae of first-order logic as follows:

(i) ⊤ and ⊥ are well-formed formulae.

(ii) Every atomic formula is a well-formed formula.

(iii) If ϕ and ψ are well-formed formulae, then so are:

¬ϕ, ϕ ∧ ψ, ϕ ∨ ψ, ϕ → ψ, ϕ ↔ ψ

(iv) If ϕ is a well-formed formula and x ∈ V is a variable, then

∃x ϕ and ∀x ϕ

are well-formed formulae.

(v) Nothing else is a well-formed formula.

We will sometimes denote the set of well formed formulae in first-order logic as Wff(FOL), in particular if we do not want to specify the underlying term set in detail.

Example: This finally allows us to write our statement on Socrates formally, where H denotes that its argument is a man, as: (H(s) ∧ ∀x (H(x) → M(x))) → M(s)

VII.2 Semantics

In propositional logic we gave the propositional variables meaning by interpreting them into a set of truth values. We now do something very similar; however, we have to start by interpreting terms first. This is done by mapping them into a universe, which is a non-empty set U. Apart from the non-emptiness requirement, we have no restrictions on U; that is, it can be finite of any size, countably infinite (e.g., the natural numbers) or uncountable (e.g., the real numbers). We formally define an interpretation as follows:

Definition 23 (Interpretation). Let U be a non-empty set called universe. An interpretation I over U is a function that

(i) maps each c ∈ C to an element of U,

(ii) maps each f ∈ F with n arguments to a concrete function f^I : U^n → U, i.e., from the set of n-tuples over U to U,

(iii) maps each P ∈ 𝒫 with n arguments to a concrete function P^I : U^n → {T, F}.

Observe that we use the superscript notation to distinguish between the syntactic and semantic entities, and in particular that I(f) = f^I and I(P) = P^I. The distinction between f and f^I and between P and P^I is important. The symbols f and P are just that: symbols, whereas f^I and P^I denote a concrete function and a concrete relation under an interpretation I, respectively.

Example: Let C = {c0, c1, c2}, F = {s}, 𝒫 = {M}. Assume we have the following universe U = {Socrates, Zeus, Heracles}. We can now construct the following interpretation:

I(c0) = Socrates
I(c1) = Zeus
I(c2) = Heracles
I(s) = Son
I(M) = is mortal

Similar to propositional variables, in an interpretation I an atomic formula is interpreted either as true, T, or as false, F. Also, I is called a model of an atomic formula P(t1, . . . , tn), i.e., I ⊨ P(t1, . . . , tn), if and only if P^I(I(t1), ..., I(tn)) = T for ground terms t1, . . . , tn.

Example: Let C = {c}, F = {f}, and 𝒫 = {P}, and consider the atomic formulae

P(c, c)        P(c, f(c))        P(f(c), c)

One possible interpretation is, for example, into the universe of natural numbers, where U = ℕ₀ and

1. I(c) = 0,

2. I(f) = s, where s is the successor function (for example, ++ in Java or incr in OCaml); then f^I(I(c)) = s(0) = 1.

3. I(P) = ≤, the usual less-or-equal relation, which we normally write in infix notation.

The interpretation of the three formulas is then

I(P(c, c)) = P^I(I(c), I(c)) = (0 ≤ 0) = T
I(P(c, f(c))) = (0 ≤ 1) = T
I(P(f(c), c)) = (1 ≤ 0) = F

Observe that = is not part of the syntax of our language!

37. Variables need to be treated specially and cannot simply be interpreted, as we have to account for all possible interpretations into our universe. We therefore need a special function called a variable assignment, or assignment for short.

Definition 24 (Variable Assignment). Given a universe U and a set of variables V, we call a function A : V → U a variable assignment or assignment. For a concrete variable x ∈ V we will denote its value under A by x^A ∈ U, i.e., A(x) = x^A, analogously to our interpretation.

Example: Let P be a binary predicate and let U = {0, 1} be our universe. With variables x, y, the instances obtained under all possible variable assignments are

• for P(x, y): P(0, 0), P(1, 0), P(0, 1), P(1, 1)

• for P(x, x): P(0, 0), P(1, 1)

The basic idea of an assignment is that we can vary it while keeping the interpretation constant. Consequently, we parameterise an interpretation I by a particular variable assignment A and usually write I_A. With this we can now specify the full semantics of first-order logic starting with the semantics of terms.

Definition 25 (Semantics of Terms). Let U be a non-empty universe, I an interpretation and A a variable assignment over U. Then we have

(i) x ∈ V evaluates to I_A(x) = A(x) = x^A ∈ U,

(ii) c ∈ C evaluates to I_A(c) = I(c) = c^I ∈ U,

(iii) with terms t1, . . . , tn and f ∈ F we evaluate I_A(f(t1, . . . , tn)) recursively as

I(f)(I_A(t1), ..., I_A(tn)) = f^I(I_A(t1), ..., I_A(tn)) ∈ U.
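The three clauses of the term semantics can be read as a recursive evaluator. The following Python sketch assumes a tuple encoding of terms, `("var", name)`, `("const", name)`, `("fn", name, args)`, and reuses the natural-number interpretation from the example above (c is mapped to 0 and f to the successor function); the dictionary `I` and the function `eval_term` are illustrative names only.

```python
# Interpretation from the natural-number example: c ↦ 0, f ↦ successor.
I = {"c": 0, "f": lambda n: n + 1}

def eval_term(t, A):
    """Compute the value of term t under interpretation I and assignment A."""
    kind = t[0]
    if kind == "var":
        return A[t[1]]                      # clause (i): look up the assignment
    if kind == "const":
        return I[t[1]]                      # clause (ii): look up the interpretation
    args = [eval_term(a, A) for a in t[2]]  # clause (iii): evaluate arguments first
    return I[t[1]](*args)

fc = ("fn", "f", (("const", "c"),))                  # f(c)
ffx = ("fn", "f", (("fn", "f", (("var", "x"),)),))   # f(f(x))

print(eval_term(fc, {}))          # 1
print(eval_term(ffx, {"x": 3}))   # 5
print(eval_term(ffx, {"x": 0}))   # 2
```

The last two calls keep the interpretation fixed and vary only the assignment, which is exactly the role assignments play here.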

As we can now evaluate non-ground terms we can also fix the semantics of atomic formulae. In particular we can define our concept of model for first-order logic.

Definition 26 (Model). Let P(t1, . . . , tn) be an atomic formula, let I be an interpretation over a universe U and A a variable assignment over U. Then I_A is called a model for P(t1, . . . , tn) if and only if I(P)(I_A(t1), ..., I_A(tn)) = T. We write I_A ⊨ P(t1, . . . , tn).

The semantics of well formed formulae is then effectively the natural extension of the semantics of propositional logic.

Definition 27 (Semantics of Formulae). Let I be an interpretation over the universe U. Let ϕ, ψ ∈ Wff(FOL). Then we have

• I_A(⊥) = F and I_A(⊤) = T for all assignments A,

• Formulae ¬ϕ, ϕ ∧ ψ, ϕ ∨ ψ, ϕ → ψ, ϕ ↔ ψ are evaluated with respect to I_A, similar to Def. 7.

• I_A ⊨ ∃x ϕ if and only if I_A′ ⊨ ϕ for some assignment A′ that agrees with A on all variables other than x and maps x to some value in U.

• I_A ⊨ ∀x ϕ if and only if I_A′ ⊨ ϕ for every assignment A′ that agrees with A on all variables other than x, i.e., for all possible values of x in U.

Intuitively, the semantics of ∃ expresses that we can find one entity in our universe that, when plugged in for x, satisfies the formula, while the semantics of ∀ is the idea that every entity of our universe has to satisfy the formula. This effectively uses the idea of substituting values for variables in a formula. We will look at that from a purely syntactic point of view in the next section.

38. Some Important Notions With the above machinery we now get the same notions of satisfiability and falsifiability, validity and unsatisfiability as in propositional logic by simply using the semantic interpretation for first-order formulae with a variable assignment I_A. Therefore, we can analogously define tautology, contradiction and contingent as well as a concept of semantic consequence. The next step is obviously to regain a similar idea of syntactic consequence by defining an appropriate calculus for first-order logic. We are already aware of the ND calculus, but we now want to lift the calculus to first-order logic as well.
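For a finite universe, the semantics of formulae in Definition 27 can be executed directly: a quantifier is checked by trying every value of the universe for the bound variable while leaving the rest of the assignment unchanged. The sketch below is an illustration under assumptions: formulae are encoded as nested tuples such as `("forall", "x", phi)`, and `models`, `eval_term` and the two-element universe are hypothetical choices, not part of the handout.

```python
def eval_term(t, I, A):
    """Evaluate a term under interpretation I and assignment A."""
    if t[0] == "var":
        return A[t[1]]
    if t[0] == "const":
        return I[t[1]]
    return I[t[1]](*(eval_term(a, I, A) for a in t[2]))

def models(phi, U, I, A):
    """Decide whether I with assignment A satisfies phi over a finite universe U."""
    op = phi[0]
    if op == "atom":
        return I[phi[1]](*(eval_term(t, I, A) for t in phi[2]))
    if op == "not":
        return not models(phi[1], U, I, A)
    if op == "and":
        return models(phi[1], U, I, A) and models(phi[2], U, I, A)
    if op == "or":
        return models(phi[1], U, I, A) or models(phi[2], U, I, A)
    if op == "imp":
        return (not models(phi[1], U, I, A)) or models(phi[2], U, I, A)
    if op == "exists":   # some value for the bound variable works
        return any(models(phi[2], U, I, {**A, phi[1]: u}) for u in U)
    if op == "forall":   # every value for the bound variable works
        return all(models(phi[2], U, I, {**A, phi[1]: u}) for u in U)
    raise ValueError(f"unknown connective {op!r}")

s, x = ("const", "s"), ("var", "x")
H = lambda t: ("atom", "H", (t,))
M = lambda t: ("atom", "M", (t,))
# (H(s) ∧ ∀x (H(x) → M(x))) → M(s)
socrates = ("imp", ("and", H(s), ("forall", "x", ("imp", H(x), M(x)))), M(s))

U = ["Socrates", "Zeus"]
I = {"s": "Socrates",
     "H": lambda u: u == "Socrates",   # only Socrates is a man
     "M": lambda u: u == "Socrates"}   # only Socrates is mortal

print(models(socrates, U, I, {}))                      # True
print(models(("forall", "x", M(x)), U, I, {}))         # False
```

The expression `{**A, phi[1]: u}` builds an assignment that differs from `A` at most in the value of the bound variable. This brute-force approach only works because `U` is finite; it is not a decision procedure for first-order logic in general.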

VII.3 Normal Forms

Similar to propositional logic we are interested in putting our first-order formulae into some normal form. In particular, we want to regain clause normal form for effective reasoning algorithms. We do this via a sequence of transformations.

VII.3.1 Prenex Normal Form

The idea of the prenex normal form is to move all quantifiers to the front of the formula. For example, the formula ∀x ϕ ∨ ψ is in prenex normal form, while the formula (∀x ϕ) ∨ ψ is not. While we can transform every formula Φ into a formula Φ′ that is in prenex normal form such that Φ ≅ Φ′, care has to be taken with respect to the position and order of the quantifiers.

39. Order of Quantifiers matters in the general case. While ∀x ∀y P(x, y) is logically equivalent to ∀y ∀x P(x, y) (although the two read slightly differently), and similarly for ∃, mixed quantifiers can usually not be commuted. For example, ∀x ∃y P(x, y) ≇ ∃x ∀y P(x, y). We can convince ourselves of that fact very quickly. Take U = {0, 1}. Then the following is a model for ∀x ∃y P(x, y) but not for ∃x ∀y P(x, y):

P^I(0, 0) = T, P^I(0, 1) = F, P^I(1, 0) = F, P^I(1, 1) = T
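This counter-model is small enough to check mechanically. The following Python lines are a quick sanity check, with the table `P` written as a dictionary (an encoding chosen here for illustration):

```python
U = [0, 1]
# The interpretation of P given above, as a truth table over U × U.
P = {(0, 0): True, (0, 1): False, (1, 0): False, (1, 1): True}

forall_exists = all(any(P[x, y] for y in U) for x in U)  # ∀x ∃y P(x, y)
exists_forall = any(all(P[x, y] for y in U) for x in U)  # ∃x ∀y P(x, y)

print(forall_exists, exists_forall)  # True False
```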

40. Prenex normal form can be computed by using the following equivalences. Let ϕ, ψ ∈ Wff(FOL):

¬∀x ϕ ≅ ∃x ¬ϕ                        (All-Neg)
¬∃x ϕ ≅ ∀x ¬ϕ                        (Ex-Neg)

(∀x ϕ) ∨ ψ ≅ ∀x ϕ ∨ ψ                (All-Or-1)
(∃x ϕ) ∨ ψ ≅ ∃x ϕ ∨ ψ                (Ex-Or-1)
(∀x ϕ) ∧ ψ ≅ ∀x ϕ ∧ ψ                (All-And-1)
(∃x ϕ) ∧ ψ ≅ ∃x ϕ ∧ ψ                (Ex-And-1)

(∀x ϕ) → ψ ≅ ∃x ϕ → ψ                (All-Imp-Left)
(∃x ϕ) → ψ ≅ ∀x ϕ → ψ                (Ex-Imp-Left)
ϕ → (∀x ψ) ≅ ∀x ϕ → ψ                (All-Imp-Right)
ϕ → (∃x ψ) ≅ ∃x ϕ → ψ                (Ex-Imp-Right)

(∀x ϕ) ∧ (∀x ψ) ≅ ∀x ϕ ∧ ψ           (All-And-2)
(∃x ϕ) ∨ (∃x ψ) ≅ ∃x ϕ ∨ ψ           (Ex-Or-2)
(∀x ϕ) ∨ (∀x ψ) ≅ ∀x ∀z ϕ ∨ ψ        (All-Or-2)
(∃x ϕ) ∧ (∃x ψ) ≅ ∃x ∃z ϕ ∧ ψ        (Ex-And-2)

(∀x ϕ) ∧ (∃x ψ) ≅ ∀x ∃z ϕ ∧ ψ        (All-Ex-And)
(∃x ϕ) ∧ (∀x ψ) ≅ ∃x ∀z ϕ ∧ ψ        (Ex-All-And)
(∀x ϕ) ∨ (∃x ψ) ≅ ∀x ∃z ϕ ∨ ψ        (All-Ex-Or)
(∃x ϕ) ∨ (∀x ψ) ≅ ∃x ∀z ϕ ∨ ψ        (Ex-All-Or)

In the rules that move a single quantifier, x must not occur free in the formula the quantifier is moved across. In the last six rules, z is a new variable and ψ on the right-hand side stands for ψ with x renamed to z.

For most of these equivalences it is quite straightforward to see that they hold, and we will not go through all of them in detail and show them formally. But as an example, we convince ourselves that (Ex-Or-2) holds: Suppose I_A ⊨ (∃x ϕ) ∨ (∃x ψ); then I_A ⊨ ∃x ϕ or I_A ⊨ ∃x ψ. Without loss of generality we assume the former holds. Then there is an assignment A(x) such that I_{A∪{x^A}} ⊨ ϕ, and therefore also I_{A∪{x^A}} ⊨ ϕ ∨ ψ, which implies I_A ⊨ ∃x ϕ ∨ ψ. The converse direction is analogous: a value for x that satisfies ϕ ∨ ψ satisfies ϕ or ψ, and hence one of the two existential formulae holds.

On the other hand, it is also easy to see that the dual statement for conjunction does not hold, by considering a counterexample. Let 𝒫 = {P, Q}, take the universe U = {0, 1} and the interpretations P^I(0) = T, P^I(1) = F, Q^I(0) = F, Q^I(1) = T. Then the formula (∃x P(x)) ∧ (∃x Q(x)) holds while ∃x P(x) ∧ Q(x) does not.

41. Renaming of variables is important in the last six rules of the above table, in order not to lose soundness when dealing with mixed quantifications. Renaming of a variable x in a formula ϕ is achieved by simply replacing all occurrences of x in ϕ by a new variable name z. Note that we normally assume that V is a countably infinite set of variables, so we can always get a new variable if we need one.

Example: Compute the prenex normal form for ¬∃y (∀x P(x) ∨ ∀x Q(x, y))

¬∃y (∀x P(x) ∨ ∀x Q(x, y))
≅ ∀y ¬(∀x P(x) ∨ ∀x Q(x, y))        by (Ex-Neg)
≅ ∀y ¬∀x ∀z (P(x) ∨ Q(z, y))        by (All-Or-2)
≅ ∀y ∃x ¬∀z (P(x) ∨ Q(z, y))        by (All-Neg)
≅ ∀y ∃x ∃z ¬(P(x) ∨ Q(z, y))        by (All-Neg)
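Some of the rules in the table, (All-Imp-Left) in particular, look surprising at first because the quantifier changes when it is pulled out. A brute-force comparison over a small universe can serve as a sanity check (it is not a proof). The Python sketch below assumes the universe U = {0, 1}, a unary predicate P, and models ψ as a plain truth value q since x must not occur free in it; the function name is hypothetical. It also replays the counterexample showing why the conjunction dual of (Ex-Or-2) fails without renaming.

```python
from itertools import product

U = [0, 1]

def all_imp_left_agrees():
    """Compare (∀x P(x)) → q with ∃x (P(x) → q) for every P over U and every q."""
    for p0, p1, q in product([False, True], repeat=3):
        P = {0: p0, 1: p1}
        lhs = (not all(P[x] for x in U)) or q       # (∀x P(x)) → q
        rhs = any((not P[x]) or q for x in U)       # ∃x (P(x) → q)
        if lhs != rhs:
            return False
    return True

print(all_imp_left_agrees())  # True

# Counterexample: (∃x P(x)) ∧ (∃x Q(x)) versus ∃x (P(x) ∧ Q(x)).
P = {0: True, 1: False}
Q = {0: False, 1: True}
separate = any(P[x] for x in U) and any(Q[x] for x in U)
merged = any(P[x] and Q[x] for x in U)
print(separate, merged)  # True False
```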

VII.3.2 Skolem Normal Form

Now that we have moved our quantifiers to the front we can use our normal equivalences in order to compute the CNF. But that is not enough: we still need to get rid of the quantifiers altogether so we can use our resolution procedure. This is done with a procedure called Skolemisation. It allows us to replace quantified variables with free variables and new functions depending on those. The resulting formula, while not necessarily equivalent to the original one, has the property that it is satisfiable if and only if the original one is satisfiable. We first need a concept of substitution.

42. Substitution Unlike renaming, which only replaces variables by newly named variables, substitutions allow us to replace variables in a formula by terms.

Definition 28 (Substitution). A substitution is a mapping of variables in V to terms over V, C, F. We usually denote substitutions with small Greek letters σ, τ, υ. An individual element of a substitution is denoted by [x ↦ t].

We call a substitution ground if it maps variables to ground terms only. When a substitution is applied to a wff, all free occurrences of the variables involved are replaced by the respective terms the variables are mapped to.

Example: Let ϕ = P(x, y, z) and σ = {[x ↦ c], [y ↦ f(c)]}. Then applying the substitution yields

ϕσ = P (c, f(c), z)
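Applying a substitution to an atom is a simple structural recursion. The sketch below reproduces the example just given; it assumes a tuple encoding of terms and atoms and represents σ as a Python dictionary from variable names to terms. The names `subst_term` and `subst_atom` are illustrative. It deliberately handles atoms only, so bound variables and variable capturing do not arise here.

```python
def subst_term(t, sigma):
    """Replace variables in a term according to sigma (dict: name -> term)."""
    if t[0] == "var":
        return sigma.get(t[1], t)      # unmapped variables stay as they are
    if t[0] == "const":
        return t
    return ("fn", t[1], tuple(subst_term(a, sigma) for a in t[2]))

def subst_atom(atom, sigma):
    """Apply sigma to every argument of an atomic formula."""
    _, pred, args = atom
    return ("atom", pred, tuple(subst_term(a, sigma) for a in args))

c = ("const", "c")
phi = ("atom", "P", (("var", "x"), ("var", "y"), ("var", "z")))   # P(x, y, z)
sigma = {"x": c, "y": ("fn", "f", (c,))}                          # [x ↦ c], [y ↦ f(c)]

result = subst_atom(phi, sigma)   # encodes P(c, f(c), z)
print(result)
```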

43. Variable Capturing Careless application of substitutions can give rise to a problem referred to as variable capturing. We must pay close attention to the names of already bound variables so as not to cause interference. In particular, when substituting a term containing a variable into a quantified expression, and this variable is also the bound variable of the quantified expression, we must rename the bound variable first to avoid the variable being captured. Consider the formula P(x) ∧ ∀y Q(x, y) and the substitution [x ↦ f(y)]. If we carry out the substitution we get P(f(y)) ∧ ∀y Q(f(y), y). The first occurrence of y in Q has now been captured by the universal quantifier. To avoid this we have to first rename y and get P(f(y)) ∧ ∀z Q(f(y), z).

44. Skolemisation is the procedure of removing the quantifiers of the prenex normal form of a formula. The basic idea is to turn all universally quantified variables into free variables, while substituting the existentially quantified variables by new functions that depend on the preceding universally quantified variables. More formally the procedure works as follows: Let ϕ be a formula in prenex normal form over V, F, C, 𝒫. Let S = {}.

1. if ϕ has no more quantifiers, return ϕ.

2. if ϕ = ∀x ϕ′ then

2.1 S = S ∪ {x}
2.2 ϕ = ϕ′
2.3 Go to 1.

3. if ϕ = ∃x ϕ′ then

3.1 if S = {} then ϕ = ϕ′[x ↦ c] with c new in C.
3.2 else ϕ = ϕ′[x ↦ f(x1, . . . , xn)] with f new in F and S = {x1, . . . , xn}.
3.3 Go to 1.

During Skolemisation we effectively create new constants and functions. Functions are created over all universally quantified variables encountered so far. Observe that a formula in Skolem normal form is generally not equivalent to the original formula; however, Skolemisation is satisfiability preserving. Example:

∃u ∀v ∃x ∀y ∃z ¬(P(x, u) ∨ Q(z, y, v))
⇝ ∀v ∃x ∀y ∃z ¬(P(x, c) ∨ Q(z, y, v))
⇝ ∃x ∀y ∃z ¬(P(x, c) ∨ Q(z, y, v))
⇝ ∀y ∃z ¬(P(f(v), c) ∨ Q(z, y, v))
⇝ ∃z ¬(P(f(v), c) ∨ Q(z, y, v))
⇝ ¬(P(f(v), c) ∨ Q(g(v, y), y, v))
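The Skolemisation procedure can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than a reference implementation: the formula is given as a quantifier prefix (a list of pairs) plus a quantifier-free matrix in a tuple encoding, all bound variables are assumed to have distinct names, and the generated symbols `sk0`, `sk1`, ... are assumed not to occur in the input. They play the roles of c, f and g in the example above.

```python
from itertools import count

def subst_term(t, sigma):
    """Replace variables in a term according to sigma (dict: name -> term)."""
    if t[0] == "var":
        return sigma.get(t[1], t)
    if t[0] == "const":
        return t
    return ("fn", t[1], tuple(subst_term(a, sigma) for a in t[2]))

def subst(phi, sigma):
    """Apply sigma to a quantifier-free formula."""
    if phi[0] == "atom":
        return ("atom", phi[1], tuple(subst_term(a, sigma) for a in phi[2]))
    return (phi[0],) + tuple(subst(p, sigma) for p in phi[1:])

def skolemise(prefix, matrix):
    """Drop the prefix, replacing existential variables by Skolem terms."""
    seen = []                 # the set S of universal variables met so far
    fresh = count()           # supplies new symbol names sk0, sk1, ...
    for quantifier, x in prefix:
        if quantifier == "forall":
            seen.append(x)                          # step 2: remember x
        else:
            name = f"sk{next(fresh)}"
            if not seen:
                term = ("const", name)              # step 3.1: new constant
            else:                                   # step 3.2: new function over S
                term = ("fn", name, tuple(("var", v) for v in seen))
            matrix = subst(matrix, {x: term})
    return matrix

def var(n):
    return ("var", n)

prefix = [("exists", "u"), ("forall", "v"), ("exists", "x"),
          ("forall", "y"), ("exists", "z")]
matrix = ("not", ("or", ("atom", "P", (var("x"), var("u"))),
                        ("atom", "Q", (var("z"), var("y"), var("v")))))

result = skolemise(prefix, matrix)
# result encodes ¬(P(sk1(v), sk0) ∨ Q(sk2(v, y), y, v))
print(result)
```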

VII.3.3 Clause Normal Form

Finally we can get back to our original goal of constructing a clause normal form for a first-order logic formula ϕ. This consists of two steps:

1. Put ϕ into its corresponding Skolem normal form ϕ′.

2. Compute the conjunctive normal form of ϕ′.

From the latter we can then again represent our formula as a set of clauses. Observe that one could also first compute a conjunctive normal form and then perform Skolemisation. What is usually preferred is a mixed approach:

1. Eliminate all implications and equivalences by rewriting them in terms of ¬, ∧ and ∨.

2. Put formula into prenex normal form.

3. Skolemise.

4. Compute CNF.

Example: The above formula ∃u ∀v ∃x ∀y ∃z ¬(P(x, u) ∨ Q(z, y, v)) would lead to the clause normal form {{¬P(f(v), c)}, {¬Q(g(v, y), y, v)}}.
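The last step, from a Skolemised quantifier-free formula to a set of clauses, can be sketched as follows. The code assumes that implications and equivalences have already been eliminated, so only `not`, `and` and `or` occur, and it uses a tuple encoding of formulae; `nnf` and `clauses` are hypothetical names. It first pushes negations inwards (negation normal form) and then distributes ∨ over ∧.

```python
def nnf(phi, positive=True):
    """Push negations down to the atoms (negation normal form)."""
    op = phi[0]
    if op == "atom":
        return phi if positive else ("not", phi)
    if op == "not":
        return nnf(phi[1], not positive)
    if op in ("and", "or"):
        # De Morgan: under a negation, "and" and "or" swap roles.
        dual = "or" if op == "and" else "and"
        new_op = op if positive else dual
        return (new_op, nnf(phi[1], positive), nnf(phi[2], positive))
    raise ValueError(f"unexpected connective {op!r}")

def clauses(phi):
    """Clause set (a set of frozensets of literals) of a formula in NNF."""
    op = phi[0]
    if op == "and":
        return clauses(phi[1]) | clauses(phi[2])
    if op == "or":   # distribute: combine every clause of one side with every clause of the other
        return {a | b for a in clauses(phi[1]) for b in clauses(phi[2])}
    return {frozenset([phi])}   # a single literal

v, y, c = ("var", "v"), ("var", "y"), ("const", "c")
P = ("atom", "P", (("fn", "f", (v,)), c))            # P(f(v), c)
Q = ("atom", "Q", (("fn", "g", (v, y)), y, v))       # Q(g(v, y), y, v)

result = clauses(nnf(("not", ("or", P, Q))))
# result is {{¬P(f(v), c)}, {¬Q(g(v, y), y, v)}} in the encoding above
print(result)
```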

Exercises for First-order Logic (unassessed):

The following exercises are unassessed and you do not have to hand in their solution. Nevertheless I strongly suggest that you attempt them anyway!

13. Consider the formula ∀x ∃y P (x, y) ∧ Q(y, x)

Give an interpretation into the universe U = {John, Paul, Mary, hates, loves}, where hates and loves are binary predicates.

14. Show the following equivalence:

(∀x ϕ) ∧ (∀x ψ) ≅ ∀x ϕ ∧ ψ

15. Prove or disprove: (∀x Q(x)) → ∀x P(x) ≅ ∃z ∀y Q(y) ∨ P(z)

16. Show that the following does not hold in general:

(∀x ϕ) ∨ (∀x ψ) ≅ ∀x ϕ ∨ ψ

17. Put the following formula into clause normal form:

(∀x ∃y P (x, y) ∨ Q(x, y)) → ∀x ∀y (∃z P (x, z)) ∨ (∃z Q(z, y) ∧ P (x, z))
