Discovering Logic
Iliano Cervesato
Carnegie Mellon University
2010–2015
© Iliano Cervesato
Contents

1 What is Logic?
  1.1 Inferences
  1.2 Valid Inferences
  1.3 Formal Reasoning
  1.4 Types of Reasoning
  1.5 Logic and Language
  1.6 Which logic?
  1.7 Exercises

I Propositional Logic

2 Propositional logic
  2.1 Statements
  2.2 Propositional Connectives
  2.3 Turning Sentences into Logic
  2.4 Propositional Formulas
  2.5 Valid Inferences
  2.6 Exercises

3 Truth Tables
  3.1 Evaluating Formulas
  3.2 Interpretations and Models
  3.3 Tautologies
  3.4 Equivalences
  3.5 Valid Inferences
  3.6 Exercises

4 Derivations
  4.1 Elementary Derivations

II Predicate Logic

5 Predicate Logic
  5.1 Beyond Propositional Inferences
  5.2 The Structure of Atomic Propositions
  5.3 Quantifiers
  5.4 The Language of Predicate Logic
  5.5 Turning Sentences into Predicate Logic
  5.6 Valid Inferences
  5.7 Exercises

6 First-Order Interpretations
  6.1 Evaluating Elementary Formulas
  6.2 Evaluating Propositional Connectives
  6.3 Evaluating Quantified Formulas
  6.4 The Trouble with Infinity
  6.5 Counterexamples and Witnesses
  6.6 Validating Inferences
  6.7 Model Checking
  6.8 Exercises

8 Function Symbols
  8.1 Indirect References
  8.2 Using Function Symbols
  8.3 The Language of First-Order Logic
  8.4 Interpretations
  8.5 Derivations
  8.6 Soundness and Completeness
  8.7 Exercises

9 Equality
  9.1 When are Two Things the Same?
  9.2 First-order Equality
  9.3 Interpreting Equality
  9.4 Derivations
  9.5 Definitions
  9.6 Exercises

10 Numbers
  10.1 Logic with Natural Numbers
  10.2 Inferences and Definitions
  10.3 Arithmetic Interpretations
  10.4 Derivations
  10.5 Soundness and . . . Completeness?
  10.6 Axiomatization of Arithmetic
  10.7 Exercises

12 Meta-Logic
  12.1 Reasoning about Logic
  12.2 Logic in Logic
  12.3 Meta-Mathematics
  12.4 Gödel’s Incompleteness Theorems
  12.5 Exercises

Bibliography

Index
Preface

This document collects some course notes for a new experimental course introduced on
the Qatar Campus of Carnegie Mellon University in the Spring semester of 2010. Dis-
covering Logic (15-199) is an introduction to logic for computer science majors in their
freshman year. It targets students who have had little or no exposure to logic and has
the objective of preparing them for sophomore classes which require proficiency with
understanding formal statements expressed in English, elementary reasoning skills, and
a sense of mathematical rigor.
These notes currently cover just propositional and predicate logic, and they do so
at a motivational level only: they present some elementary concepts and techniques on
the basis of intuition, but shy away from technicalities. In particular, these notes do not contain proofs, except for a few simple examples introduced for motivational purposes.
The tone is intentionally lightweight and fun, without however compromising on rigor.
I. Cervesato
Doha, 28 April 2010
Chapter 1

What is Logic?
Logic is the study of reasoning. But what is reasoning? We can think about reasoning in two ways, at least:¹

• Reasoning is making sense of the reasons that justify an argument. This is analysis of the known, to make sure it is correct. Then, logic is the study of what counts as good reasons for what we know and learn, and why.

• Reasoning is also producing new knowledge from things that we already know. This is synthesis of the new from the old.
These two meanings are interconnected: the reasons why we can produce new knowl-
edge are what makes it correct. Both are well summarized in one of the earliest attempts
at defining logic, attributed to the first logician ever according to some: Aristotle (384–
322BC). In his (translated) words,
“New” gives the sense of learning, while “necessary” presupposes correctness. This
ancient definition, over 2,300 years old, will be the beginning of our discovery of logic.
1.1 Inferences
Consider the following sentence:

Doha is in Qatar and Sidra is in Doha, so she is in Qatar. (1.1)
This is a convincing argument, isn’t it? Let’s dissect it. It has two parts separated by
the word “so”. This word, like many synonyms such as “therefore”, “consequently”, etc.,
¹ As with all somewhat vague concepts, people have different opinions.
is very important, so important that we will underline it, writing “so”: it separates the
known from the unknown, the old from the new. This word is an indication of reason.
Let’s look at what comes before and after it:
• The part before “so” lists the things we know already, here “Doha is in Qatar”
and “Sidra is in Doha”. They are called the premises. We use words like “and”
to separate them, and will sometimes underline them too.
• The part after “so” is the new thing that we learn when the premises are true,
here “Sidra is in Qatar”. It is called the conclusion.
Going from things that we know to things that we learn is called an inference. So,
“Doha is in Qatar and Sidra is in Doha, so she is in Qatar” is an inference. Inferences
have a number of premises (zero or more) and a single conclusion.
Reasoning is making inferences: learning new things from things that we previ-
ously knew using inferences such as the above. Logic is therefore the study of infer-
ences.
1.2 Valid Inferences

Doha is in Qatar and Sidra is in Doha, so she went for a walk. (1.2)
This inference does not ring right. What does Sidra’s walk have to do with her being
in Doha or Doha being in Qatar? In (1.1), the conclusion was a necessary consequence
of the premises: given that “Doha is in Qatar” and that “Sidra is in Doha”, there is no
way she can be in any country besides Qatar. Here it is not necessary: Sidra could very
well be in Doha, which is in Qatar, and watching television on a sofa — no walking
involved. This inference is not valid.
What is the difference between (1.1) and (1.2)? In the first case, the premises
were true and the conclusion had to be true: the conclusion was necessary given the
premises (ah! Aristotle again!). In the second example, premises and conclusions were
independent: the conclusion could be false even when the premises were true. Its truth
was not necessary given true premises. This gives us a basis for a tentative definition
of valid inference:
An inference is valid if it is not possible that its premises are true and
its conclusion is false.
Now, what happens when the premises are not true? Consider another inference:

Doha is in France and Sidra is in Doha, so she is in France. (1.3)

This is clearly wrong . . . or is it? Well, if Doha were in France, then Sidra being in
Doha would mean that she is in France. The first premise, “Doha is in France”, is
clearly false (to the displeasure of the French, who could do with some hydrocarbon reserves on their own soil), but if it were true, the conclusion would be true. So the
inference is valid.
This means that an inference can be valid even if one (or more) of its premises is
false. What matters for validity is that it is impossible that all the premises be true and
yet the conclusion be false (as in our last example). So our definition of valid inference
does hold water: we will keep it as what it means for an inference to be valid.
Logic is the study of valid inferences, specifically of what makes them valid and
what we can do with them.
1.3 Formal Reasoning

Consider now the following inference:

Mikulat is in Kiribati and Sidra is in Mikulat, so she is in Kiribati. (1.4)

Chances are that you have no idea whether Mikulat and Kiribati are real places, let alone where they are located.² Yet, this inference sounds alright, even if you have
never heard about these places, and therefore do not know anything about whether the
premises and conclusion are true or false. The reason it sounds alright is that it has
exactly the same shape as inferences (1.1) and (1.3), with just the names of the places
replaced in a consistent manner.
What we are discovering here is that we can tell whether an inference is valid even
if we don’t fully understand what it is about. Let’s take an even more extreme example:

東京 is in 日本 and Sidra is in 東京, so she is in 日本. (1.5)

We have no idea what 東京 and 日本 are, yet this sentence has again the same structure, which makes it valid.³
We can indeed generalize the above inferences by abstracting away the name of the
places used in these sentences — logic is big on generalization and abstraction. We
obtain the following inference pattern:

X is in Y and Sidra is in X, so she is in Y.

which is valid no matter what we write for X and Y. Whatever we choose X and Y to
be, the conclusion cannot be false if the premises are true.
² As a matter of fact, they do exist: I used Google Maps to pick the smallest village I could find in
Tanzania — that’s Mikulat, and then I went to Wikipedia to see if there was any country in the world that I
had never heard about — there were plenty and the island nation of Kiribati in the Southern Pacific was one
of the choices there.
³ In case you are wondering (but maybe you are not, which is good), 東京 is Japanese for Tokyo and 日本 for Japan.
The above inferences are not valid because of what they say (which we can’t always
understand), but because of the way they look. What matters is the structure of these
sentences, not their meaning. Indeed, logic starts with the study of linguistic patterns
of valid reasoning, and abstracts them into mathematical symbols. Why? Because
the result is less ambiguous than natural language, ultimately simpler, and possibly
mechanizable. The word “logic” is often qualified to make this clear. We speak of

• formal logic to indicate that we are interested in the form of an argument, not in its contents. Going back to Sidra’s travels above, we pounded on that argument so much that what it actually said didn’t matter any more, and in the end we just looked at the form of the sentence.

• symbolic logic to stress that language about facts and ideas is replaced with symbols. Symbols represent facts and ideas, but logic manipulates just the symbols.
1.4 Types of Reasoning

All the inferences we have examined so far are of the same kind: their conclusion is a necessary consequence of their premises. Here is one more:

If the burglar had broken through the kitchen window, there would be foot- (1.7)
prints outside, but there are no footprints outside, so the burglar didn’t
break in through the kitchen window.

This form of reasoning, where true premises force the conclusion to be true, is called deductive reasoning, and it is the main subject of this book. Compare it now with the following argument:
Tom coughs a lot and has yellow stains on his fingers, so he is a smoker. (1.8)
This makes sense too! Note however that now the relationship between the premise
and the conclusion is quite different from our earlier examples. Here, the conclusion
is not so much a consequence of the premises (Tom may have a cold and touched
turmeric) but a probable explanation of the premises. This form of reasoning is called
inductive reasoning. Doctors, mechanics and lawyers use it all the time, and students
too, sometimes, when they are working on an assignment. However widespread it may be, this book will not be concerned with inductive reasoning.
1.5 Logic and Language

If logic abstracts patterns found in language, how do the two compare?

• First, natural language is often ambiguous: the same English sentence can frequently be understood in several different ways, while a logical formula has a single meaning.

• Second, a same argument can be expressed in many different ways even in En-
glish. There are many sentences that say the same thing. Logic is more orderly.
• As any linguist will confirm, logic is a lot simpler than any natural language.
• Logic is not natural: not only are you taking a course on it, but your little brother
can do a lot of reasoning without having a clue of all the symbols we will be
introducing shortly.
By and large, logic has a place in the world, especially the world of Mathematics and
Computer Science. It will never replace language, not even to make inferences, but it
can help quite a bit.
1.6 Which logic?

Consider a simple sentence such as “The sun shines”. This is the description of a basic fact that can be either true or false. No sweat here.
Well, let’s extend it a bit:
This introduces a spatial element. This means that this sentence can now participate in
reasoning about locations. This requires isolating linguistic patterns to deal with space,
which leads to a spatial logic. Let’s extend it further:
Now we have also a temporal element. To make inferences using this sentence involves
devising a logic that allows reasoning about time, a temporal logic. But we can do even
more:
Now, this sentence forces us to reason about people’s point of view: a sentence is not
universally true or false, but it depends on what each individual believes. Instilling
reasoning mechanisms to deal with beliefs yields what is called an epistemic logic.
This is already very complicated, and we have barely scratched the surface. Rea-
soning has many dimensions, and many many linguistic patterns can be isolated into a
logic. Which do we pick?
Rather than trying to study all possible linguistic patterns at once, logic tends to
proceed using a divide-and-conquer strategy:⁴
Therefore, there is not one logic, but many. Not only this, but logic as a field is very
alive, with dozens of conferences on it every year.
In this book, we will look at two such logics: propositional logic and predicate
logic. Both belong to the first category above, the one that looks into “very simple,
very common linguistic patterns”. Indeed,
• they are very well understood (but remember that logic is alive: many logicians
are working on understanding them even better as you read this sentence!),

• they are powerful enough that most other logics can be expressed in terms of them.⁵

⁴ This is a bit of a simplification, but not too far from the way logicians actually do things.
This last justification is puzzling to say the least: if predicate logic is so powerful,
why bother studying all those more complicated spatial, temporal, epistemic, whatever
logics? Great questions! Here are a couple of answers:
1. For the same reason we don’t normally program in assembly language. Predicate
logic is very low level and it is often convenient to work at a higher level of
abstraction.
2. To get better automation. Predicate logic at large is not easy to automate (we
will see what this means), while more specialized logics behave better.
3. To look for more fundamental concepts. If we compare reasoning to physics,
predicate logic gets us to the level of the atoms: this is great because we can use
it to describe all the objects around us. However, atoms are themselves built out of smaller entities, like electrons, protons and neutrons, the latter two being in turn composed of quarks and other particles. The same happens with reasoning: predicate logic can
itself be understood on the basis of smaller logical particles, which also explain
a lot of other forms of reasoning.
1.7 Exercises
1. Is the following a valid inference?
To get an A in this class I need to get at least 90% overall, I can get
90% in each homework and 95% in every other thing that is graded.
Therefore I will get an A in this class.
What are the premises and conclusions? What kind of dimension do you need to
work with on top of a core logic about truth and falsehood to determine if it is
valid?
2. We defined an inference as having zero or more premises and one conclusion.
What can you say of an inference with zero premises? Can you think of an
example?
3. For this exercise, you will need to read the book “Logicomix: an Epic Search for
Truth” [4] (yes, the entire book) and write a letter to a friend about it (or, if you prefer, a review for Amazon.com).
The first part of your letter should describe the book to your friend. Tell him/her
what the story is about, how the book is structured, what the main characters,
the main themes, the historical context are, etc. You are encouraged to expand
on this list — the more creative you are the better! For example, does some
character remind you of somebody you know?
⁵ Specifically, the form of predicate logic known as second-order logic. It is really cool, but a bit beyond the scope of this course.
In the second part of the letter, share your personal impressions with your friend.
What parts did you like best? Why? Which did you dislike? Why? Which
aroused your curiosity? Why? Make sure to motivate your arguments: writing
just “I liked it” won’t cut it.
In the third part, formulate at least three questions that you would like to see
answered in this course. They can be about anything you expect the course to be
about or anything in the book that aroused your curiosity. The suggested length
of your letter is between 3 and 5 pages. It’s about quality, not quantity!
Here are some evaluation criteria you should aim toward as you write your letter:
• Form: Your letter is expected to be grammatical and to develop a sensible
argument. Your sentences and paragraphs should flow into each other with-
out any rough edges. Your ideas should be developed progressively. You
are welcome to use quotes and citations as long as you reference them.
• Completeness: Your letter should do a good job at describing the book to
your friend so that he/she can form a well-informed opinion about whether
he/she wants to read it. There are a lot of aspects that are clearly important:
you are expected to cover them all.
• Creativity: Do not just state the obvious. Your letter should provoke think-
ing and curiosity, and it should be pleasant to read.
Part I

Propositional Logic
Chapter 2
Propositional logic
Having established that logic isolates common linguistic patterns used in reasoning, we
begin our investigation with some of the most elementary patterns, and in a sense the
most profound. It took logicians several thousands of years to properly identify them
and understand how they work. This is the language of propositional logic. In this
chapter, we introduce it and see how we can use it to express sentences and inferences.
We will see how to make sure those inferences are valid in the next chapters.
2.1 Statements
Recall what we are trying to do: reason about when inferences are valid. We have
defined an inference to consist of zero or more premises and one conclusion, and for
an inference to be valid, it must not be possible that all premises be true and yet the
conclusion be false. This suggests that we need to look at phrases that can be either true
or false. Sentences that can be either true or false are called statements or propositions.
When we speak or write, we use statements all the time. Note however that we also
say things that are not statements. For example, a question is neither true nor false: it
is just a question. The following table shows some examples in each category.
Statements:                     Non-statements:
“Sidra is in Doha”              “Is Sidra in Doha?”
“Doha is in France”
“The sun shines”
“10 < 7”
If we look carefully at the left column, we see that some statements are simpler
than others.
• Take “Sidra is in Doha”. Clearly, it can be either true or false, but no part
of it taken by itself is a sentence that is true or false. Indeed, removing any
word yields an ungrammatical phrase. Statements like this are called atomic or
elementary. Other atomic propositions in our list are “Doha is in France”, “The
sun shines” and “10 < 7”.
The last two examples are also composite statements, but in a different way:
they don’t simply combine smaller statements into bigger statements, but they
generalize them somehow.
2.2 Propositional Connectives

In language, we have ways to combine basic statements that are true or false into bigger
statements that are also true or false. There are a lot of ways to do so. Propositional
logic looks at a couple of very simple ways statements are formed from atomic propo-
sitions that come up all the time. It focuses on simple connectives like “and”. Some
common forms of generalization will be handled by predicate logic in Chapter 5.
• “and” (also encountered as “but” and similar words). This connective, which combines two formulas (it is binary), is called conjunction and is written infix as ∧.

• “or” (as in “either . . . or . . . ”). This binary connective is called disjunction and is written infix as ∨.

• “not” (also encountered as “it is not the case that . . . ” and “. . . is false”). This connective, which applies to a single formula (it is unary), is called negation and is typically written prefix as ¬.
• “implies” and its synonyms like “if . . . , then . . . ”, “therefore”, “only if”, “. . . is
a necessary condition for . . . ”, “. . . is a sufficient condition for . . . ”, and many
many others. This binary connective is known to logicians the world around as
implication and they write it infix as → (some write ⊃, but we won’t use this
notation).
All these synonyms for “implies” are confusing, aren’t they? Take “only if”:
does it mean “implies”? does it mean “is implied by”? And what about “if”
in the middle of a sentence? Not to say anything about those necessary and
sufficient conditions math books are so fond of. Let’s make sense of all this.
– Consider the phrase “The sun shines if it is noon”. What it says is that
whenever it is noon, the sun is out there shining, which is the same as “If
it is noon, then the sun shines” or “It is noon implies that the sun shines”.
Another way to look at it is that if we know that it is noon, then the sun
must be shining: it being noon is a sufficient condition for the sun to shine.
Note that these phrases do not exclude that the sun shines at other times,
for example at 2pm: all they say is that at noon we are guaranteed that it
shines.
– Next, take the phrase “The sun shines only if it is noon”. This time, the
meaning is that it’d better be noon for the sun to shine, i.e., the sun won’t
be shining unless it’s noon. We can express this as “If the sun shines, then
it is noon” and “The sun shines implies it is noon”. That’s the opposite
of the previous example — the difference one little word can make! Note
that this phrase does not guarantee that the sun will be shining at noon: it being noon is only a necessary condition for the sun to shine, not a sufficient one.
• “if and only if” is known as biimplication and is written infix using the symbol
↔. Biimplication looks like the combination of an “if” and an “only if”. In fact,
the phrase “The sun shines if and only if it is noon” means that the sun shines
exactly when it is noon, always. It being noon is a necessary and sufficient
condition for the sun to shine.
Although not properly connectives, it is useful to introduce notation for the statement
that is always true and the statement that is always false:

• ⊤ (read “top”) is the statement that is always true;

• ⊥ (read “bottom”) is the statement that is always false.
2.3 Turning Sentences into Logic

Let us now use these connectives to expose the propositional structure of English sentences. Consider for example:

Sidra is in class and she is paying attention.

We immediately recognize “and” as a linguistic pattern for conjunction, while the two sentences on either side are just atomic. We can then rewrite it as follows in propositional logic:

Sidra is in class ∧ she is paying attention
As we do this, we will often expand pronouns, rewriting the contents of the second box
as “Sidra is paying attention”, for example.
Let’s do the same not with a sentence, but with a whole inference. Consider again
inference (1.7) from last chapter:
If the burglar had broken through the kitchen window, there would be foot-
prints outside, but there are no footprints outside, so the burglar didn’t
break in through the kitchen window.
Here, “If . . . then . . . ” is a linguistic pattern for implication, and there are a couple of
negations. This gets us the following translation in propositional logic:

the burglar had broken in through the kitchen window → there would be footprints outside and ¬ there were footprints outside , so ¬ the burglar broke in through the kitchen window .
What about the word “and”? Isn’t it a conjunction? Here we have a choice: we can see
it either as a way to separate the various premises of an inference (that’s what we did
just now) or we can see it as a connective, in which case we would get the following
inference:
( the burglar had broken in through the kitchen window
→ there would be footprints outside ) ∧ ¬ there were footprints outside ,
so ¬ the burglar broke in through the kitchen window .
As we will see, the two translations are equivalent in a sense that we will make precise
in chapters to come.
So far, we have replaced the linguistic patterns for conjunction, disjunction, etc,
with propositional connectives, leaving the atomic statements as English sentences.
This quickly becomes a pain: they are pretty long and, as you can notice, they contain
small linguistic variations among sentences meant to represent the same thing (e.g.,
“there would be footprints outside” and “there were footprints outside”). Logicians,
being the practical types they are (sometimes), have introduced abbreviations. Rather
than using those long sentences, they like to represent statements with short strings,
often a single letter. In the first example, they could define
C = “Sidra is in class”
A = “Sidra is paying attention”
This is sometimes called a dictionary or a reading key, and C and A are called propo-
sitional letters. Then, the sentence in the first example becomes
C ∧ A
Notice that, by introducing propositional letters as abbreviations, we have completely
eliminated the contents of the formula, focusing uniquely on its form. This becomes
even more useful when looking at full inferences. Consider our second example with
the following dictionary:
B = “the burglar broke in through the kitchen window”
F = “there were footprints outside”
Then (our second reading of) that inference becomes
(B → F ) ∧ ¬F so ¬B
This is so much shorter!!! This is the kind of abstract inferences we will look at from
now on.
2.4 Propositional Formulas

It is time to define precisely the language of propositional logic, its syntax (what formulas look like, not what they mean). We will write a generic propositional formula using the Greek letters ϕ (“phi”, pronounced “fAi”) and ψ (“psi”, pronounced “sAi”), possibly annotated with subscripts and primes — that will help us keep things straight.
The first widely adopted convention gives connectives different precedences:

Connectives: 1. ¬   2. ∧, ∨   3. →   4. ↔
This means that ¬ has the highest precedence and ↔ the lowest, so that A → A ∨ B
is understood as (A → (A ∨ B)) while A → A ↔ B stands for ((A → A) ↔ B).
This allows us to simplify the fully parenthesized formula ((A ∨ B) ↔ ¬((¬A) ∧ (¬B))) as
A ∨ B ↔ ¬(¬A ∧ ¬B)
Now, only the parentheses around the conjunction are needed, otherwise the ¬ before
them would cling to the formula ¬A.
The second widely adopted convention is to consider conjunction and disjunction
to be associative, so that we can write A ∨ B ∨ C instead of either ((A ∨ B) ∨ C) or
(A ∨ (B ∨ C)).
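These conventions concern how we write formulas; in a program, a formula is simply a tree with a node for each connective. As an illustration, here is a minimal sketch in Java (all names are mine, not from these notes) of propositional formulas as a data structure:

interface Formula {}
record Letter(String name) implements Formula {}
record Not(Formula body) implements Formula {}
record And(Formula left, Formula right) implements Formula {}
record Or(Formula left, Formula right) implements Formula {}
record Implies(Formula left, Formula right) implements Formula {}
record Iff(Formula left, Formula right) implements Formula {}

class FormulaDemo {
    public static void main(String[] args) {
        // A ∨ B ↔ ¬(¬A ∧ ¬B): the precedence conventions above let us
        // write it with few parentheses, but the tree itself is unambiguous.
        Formula a = new Letter("A"), b = new Letter("B");
        Formula phi = new Iff(new Or(a, b),
                              new Not(new And(new Not(a), new Not(b))));
        System.out.println(phi);
    }
}

Representing formulas as trees rather than strings is what makes the truth table computations of Chapter 3 straightforward to program.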
Let’s practice turning statements into formulas. This time, we will skip the boxes
and also consider slightly more complicated examples. As a warm up, consider the
following inference:
If chicken is on the menu, you shouldn’t order fish, but you should have (2.1)
either fish or salad, so if chicken is on the menu you should have salad.
Its premise and conclusion are composite statements. Let us define a dictionary that
associates propositional letters with the elementary statements therein:
C = “chicken is on the menu”
F = “you should order fish”
S = “you should order salad”
Then, the above inference translates to the following simple form:
(C → ¬F ) ∧ (F ∨ S) so C → S
As in the burglar’s example, we could also have interpreted the word “but” as the
separator between two premises rather than as a conjunction.
With all we know, we can finally write Sidra’s travel inference logically. Here it is
again:
Doha is in Qatar and Sidra is in Doha, so she is in Qatar. (2.2)
To express it in propositional logic, it is convenient to rewrite it slightly into:
If Sidra is in Doha then she is in Qatar and Sidra is in Doha, (2.3)
so she is in Qatar.
Now, the implication is clearly visible. Then, it doesn’t take much of a dictionary to
produce the propositional inference
D → Q and D so Q
As our last example, let’s try to make logical sense of a simple (and not very useful)
computer program. Here we use the C/Java syntax, where && and || indicate Boolean conjunction and disjunction, respectively.
if (x < 7 || (1 <= y && y <= 5)) {
    if (x < 7)
        y >= 6;   // a bare Boolean expression: read "y >= 6 holds here"
} else
    y > 0;        // read "y > 0 holds here"
Observe that, if y is an integer, 1 <= y says the same thing as y > 0, while y >= 6 is the opposite of y <= 5. Then, we can take our dictionary to be

x7 = x < 7
y0 = y > 0
y5 = y <= 5
Translating “if . . . then . . . ” is easy, but what about the “else” part? Well, the “else”
part should apply exactly when the condition is false, so that “if c then X else Y ” has
the exact same meaning as “if c then X, and if not c then Y ”. This gives us the key to
translating the above program into logic. We obtain
(x7 ∨ (y0 ∧ y5 ) → (x7 → ¬y5 ))
∧ (¬(x7 ∨ (y0 ∧ y5 )) → y0 )
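If we trust neither our reading of the program nor our translation, we can test that they agree. The sketch below (helper names are mine; implication is coded as !p || q, which has the same truth table as →) compares the two for many small values of x and y:

class TranslationCheck {
    static boolean implies(boolean p, boolean q) { return !p || q; }

    public static void main(String[] args) {
        for (int x = 0; x < 15; x++)
            for (int y = -5; y < 12; y++) {
                boolean x7 = x < 7, y0 = y > 0, y5 = y <= 5;
                boolean cond = x7 || (y0 && y5);
                // Direct reading of the program: the branch that applies
                boolean program = cond ? implies(x < 7, y >= 6) : y > 0;
                // The propositional translation above
                boolean formula = implies(cond, implies(x7, !y5))
                               && implies(!cond, y0);
                if (program != formula)
                    System.out.println("mismatch at x=" + x + ", y=" + y);
            }
        System.out.println("done: the translation agrees with the program");
    }
}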
2.5 Valid Inferences

How do we tell which propositional inferences are valid? One naive idea would be to compile a dictionary of all the valid inferences and simply look up any candidate inference in it. This is hopeless, for at least two reasons:

• There are infinitely many valid inferences. For example, if you think about it, the inference “A₁ ∧ . . . ∧ Aₙ so Aᵢ” is valid for any positive value of n and for every 1 ≤ i ≤ n. So, the very idea of building this dictionary is hopeless.
• Even if it were somehow possible, this does not address the issue of how we
convince ourselves whether an inference is valid or not. How do we decide
which inferences to insert in the dictionary?
1. If we have a way to determine whether formulas are true or false based on the
truth or falsehood of the propositional letters in them, then the above definition
gives us an easy way to check whether an inference is valid: that’s exactly when
it cannot be that all the premises are true but the conclusion is false. This is what
the truth tables will allow us to do in Chapter 3.
2.6 Exercises
1. Consider the following passage:
If it got dark because the sun was setting, it would be getting cool, but
instead it is still hot even though there was no more light.
• Identify the atomic statements [Hint: there are just 3 of them] and define a
dictionary that allows you to abbreviate them into propositional letters.
• Identify the connectives and, using your dictionary, express the passage in
propositional logic.
• Is this an inference? Is it valid?
(b) The knight will win only if the horse is fresh and the armor is strong.
(c) Either a fresh horse or a strong armor is a necessary condition for the knight
to win.
(d) The knight will win if and only if the armor is strong but the horse is not
fresh.
(e) Neither the horse being fresh nor the armor being strong alone is a sufficient
condition for the knight to win, but if both the horse is fresh and the armor
is strong, then the knight shall win.
5. Same exercise again, but this time the sentences are in Italian. All you need to
know is that “e” means “and”, that “o” means “or”, that “se” means “if”, that
“allora” means “then”, that “solo se” means “only if”, and finally that “non”
means “not”.
(a) Se Anita vince le elezioni, allora le tasse verranno ridotte.
(b) Le tasse verranno ridotte solo se Anita vince le elezioni e l’economia ri-
mane forte.
(c) Le tasse verranno ridotte se Anita non vince le elezioni.
(d) L’economia rimane forte se e solo se Anita vince le elezioni o le tasse ver-
ranno ridotte.
(e) Se l’economia non rimane forte e le tasse non verranno ridotte, allora Anita
vince le elezioni.
6. Let’s translate some more sentences in propositional logic. This time the sen-
tences will be in Chinese. Chinese??? All you need to know to do this exercise
is that “和” means “and”, that “或” means “or”, that “如果” means “if”, that “那么” means “then”, that “只有” means “only if”, and finally that “不” means “not”.
(a)
(b)
(c)
(d)
Write the formula corresponding to each of these sentences. You don’t need to
say which character strings correspond to what propositional letter: just write
the result. If you really want to show the mapping, use cut-and-paste to write the
Chinese characters (unless you can type in Chinese, that is).
7. Let A, B and C be the following atomic propositions:
• A = “Roses are red”
• B = “Violets are blue”
• C = “Sugar is sweet”
Rewrite the following atomic formulas in English:
8. Consider the following programming problem: you are given two arrays arrA
and arrB, their elements are sorted in ascending order, and you want to merge
them into a third array, arrC, whose elements are also ordered. How would you
complete the condition of the if statement in the code snippet below to get the
correct result? The atomic conditions you can work with are “i<arrA.length”,
“j<arrB.length” and “arrA[i]<arrB[j]”. You can assume that arrC
has already been created for you and that arrA and arrB do not have common
elements.
int i=0;
int j=0;
int k=0;
for (; (i<arrA.length)||(j<arrB.length);) {
if (...) {
arrC[k] = arrA[i];
i++;
} else {
arrC[k] = arrB[j];
j++;
}
k++;
}
If this is easier for you, feel free to define an auxiliary variable for your test
condition and use it in the “if”. (Macho programmers do not define auxiliary
variables, but nobody is able to read their code — good programmers do a lot
of things that macho programmers don’t!) If you find a better way to write this
loop, feel free to include it in your answer.
9. One daily activity that allows you to use propositional logic is searching the
Internet. Surprised? Look it up! Go to your favorite search engine and look for
how it supports using Boolean connectives to guide the search. Write a short
essay that lists what connectives are available in that search engine, gives a few
examples of queries that use them, and compare the results. Does this engine
support other mechanisms besides Boolean operators to guide the search? As
usual with essays, go beyond the obvious and help me learn something I didn’t
know. For example, some of you may want to write about a search engine that is
not Google, or about Google searches that are not about web pages. Your essay
Chapter 3

Truth Tables

3.1 Evaluating Formulas

Consider again a sentence such as “Sidra is in class and the sun is shining”, in symbols A ∧ B. Being a statement, it is either true or false, and so are the atomic statements “Sidra is
in class” (A) and “the sun is shining” (B) that appear in it. Intuitively, the truth of the
composite sentence depends on the truth of its elementary propositions. Let’s see:
• The overall sentence, A ∧ B, is certainly true if Sidra is sitting on her seat and
sun rays are basking the floor of the classroom, that is A ∧ B is true if A is true
and at the same time B is true.
• If she is in her seat and it rains outside, A is true but B is false. At the same time,
the overall sentence A ∧ B is false: it is not the case that “Sidra is in class and
the sun is shining”.
• The overall sentence is similarly false when Sidra is sick and it is sunny, that is
A is false but B is true.
• Finally, A ∧ B is undoubtedly false on those rainy days where Sidra is sick, i.e.,
when both A and B are false.
In all cases, we could determine the truth or falsehood of A ∧ B on the basis of the
truth or falsehood of A and of B. If we think about “true” and “false” as values, we
have determined the value of A ∧ B in pretty much the same way as we determine the
value of x + y knowing the value of x and the value of y. Just like +, ∧ is an operator.
Given values for its arguments, it allows us to calculate the value of the conjunction.
Conjunctions and the other propositional connectives operate on values “true” and
“false”, which we will abbreviate as T and F , respectively. They are called truth values
for obvious reasons. They differ from the numerical values in that there are just two
of them, rather than infinitely many. This makes it easy to describe how each Boolean
connective operates: for example we can build a “conjunction table” similar to the
addition table that we have learned in elementary school:
B\A | T   F
----+-------
 T  | T   F
 F  | F   F
It is however convenient to format such tables slightly differently: we use one row
for each of the four possible combinations of the truth values for A and B, and list
the corresponding value of their conjunction next to it on that row. What we obtain in
this way is what is known as the truth table of conjunction. While we are at it, it is
convenient to add another column for disjunction, another for implication, and so on
for all other propositional connectives. We obtain the following composite truth table
for all connectives:

A B | ¬A | A ∧ B | A ∨ B | A → B | A ↔ B
T T |  F |   T   |   T   |   T   |   T
T F |  F |   F   |   T   |   F   |   F
F T |  T |   F   |   T   |   T   |   F
F F |  T |   F   |   F   |   T   |   T
All are pretty intuitive, except for →. Why should it be true when the antecedent is
false? We will see this in a minute. We have not displayed the trivial columns for ⊤ and ⊥, which are always T and always F, respectively. Truth tables, pretty much in the
format just shown, were devised by the logician Ludwig Wittgenstein (1889–1951).
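These tables should look familiar to programmers: they are exactly what the Boolean operators of C or Java compute. Java has no built-in operators for → and ↔, but !a || b and a == b produce precisely those two columns. The following sketch (mine, not from the notes) prints the composite table:

class Connectives {
    public static void main(String[] args) {
        boolean[] vals = { true, false };
        System.out.println(" A      B      ¬A     A∧B    A∨B    A→B    A↔B");
        for (boolean a : vals)
            for (boolean b : vals)
                System.out.printf("%-7b%-7b%-7b%-7b%-7b%-7b%-7b%n",
                        a, b, !a, a && b, a || b, !a || b, a == b);
    }
}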
In mathematics, once we know how to do additions and multiplications, it is easy
to calculate the value of a polynomial, say −2x² + 3x − 1, for any given value of
the variable x: for instance, for x = 5 this polynomial has value −36. In the same
way, since truth tables give us a way to “do” conjunctions, disjunction, etc, we can
calculate the truth value of a complex propositional formula for any truth value of its
propositional letters. For example, take the formula ¬(A ∨ ¬B) and let’s calculate its
truth value when A is T and B is F .
¬B = ¬F = T
A ∨ ¬B = T ∨ T = T
¬(A ∨ ¬B) = ¬T = F
The truth table for ¬ tells us that ¬B evaluates to T when B is F . Then, the truth
table for ∨ gives us a way to calculate that the truth value of A ∨ ¬B is T since both
arguments are T . Finally, the truth table for ¬ gives us the final result, F , because its
argument has value T .
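This bottom-up calculation is easy to mechanize. Here is a sketch (mine) that evaluates ¬(A ∨ ¬B) in a given interpretation, one connective at a time, exactly as we just did by hand:

import java.util.Map;

class Evaluate {
    // The formula ¬(A ∨ ¬B), evaluated bottom-up
    static boolean phi(Map<String, Boolean> world) {
        boolean a = world.get("A"), b = world.get("B");
        boolean notB = !b;            // truth table for ¬
        boolean aOrNotB = a || notB;  // truth table for ∨
        return !aOrNotB;              // the outermost ¬
    }

    public static void main(String[] args) {
        System.out.println(phi(Map.of("A", true, "B", false)));  // false, as computed above
        System.out.println(phi(Map.of("A", false, "B", true)));  // true
    }
}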
Plotting the values of the polynomial −2x² + 3x − 1 on a graph as the value of x
varies gives us an overall picture. We can do the same with any formula ϕ: determine
its truth value for all the possible combinations of the truth values of its propositional
letters. That’s the truth table of ϕ. For example, take again the formula ¬(A ∨ ¬B)
and let’s determine its truth value for all possible truth values of the atomic propositions
A and B. We obtain the following table, with the overall result displayed in bold:
A B ¬ (A ∨ ¬ B)
T T F T T F T
T F F T T T F
F T T F F F T
F F F F T T F
0 0 4 1 3 2 1
This is the truth table of the formula ¬(A ∨ ¬B). To understand the way we built this
table, follow the indices at the bottom of every column:
0. We started by just listing all possible combinations of values for A and B under
the two leftmost columns.
1. Then, we copied the A-column under every occurrence of A in the formula, and
similarly for B.
2. The only subformula we could calculate knowing just the values of A and of B
is ¬B. We used the truth table for ¬ to compute it and wrote the result under that
instance of ¬.
3. Knowing the truth values of A and ¬B, we used again the truth tables, this time
for disjunction, to determine the truth value of A ∨ ¬B, and wrote the result
under ∨.
4. Finally, we used again the truth table for negation and computed the possible
truth values of ¬(A ∨ ¬B) on the basis of the truth values for A ∨ ¬B we just
obtained.
This table corresponds to the graph of the polynomial since it gives us an overall view
of the behavior of this propositional formula as A and B vary.
Clearly, we can apply this technique to any formula. If it has just two propositional
letters, we proceed pretty much as we just did, simply applying the truth table for the
appropriate connective at each step. If it has three, say A, B and C, then we will
need 8 = 2³ rows to account for all the possible combinations of their truth values. In general, if we have n letters we get 2ⁿ rows.
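Generating those 2ⁿ rows is also mechanical: the bits of each integer between 0 and 2ⁿ − 1 encode one combination of truth values. A sketch (mine; bit value 0 stands for T, so the rows come out in the T-first order used above):

class TruthTable {
    public static void main(String[] args) {
        String[] letters = { "A", "B" };
        int n = letters.length;
        for (int row = 0; row < (1 << n); row++) {
            boolean[] v = new boolean[n];
            for (int i = 0; i < n; i++)
                v[i] = ((row >> (n - 1 - i)) & 1) == 0;
            boolean a = v[0], b = v[1];
            boolean value = !(a || !b);   // the formula ¬(A ∨ ¬B)
            System.out.println((a ? "T" : "F") + " " + (b ? "T" : "F")
                    + " | " + (value ? "T" : "F"));
        }
    }
}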
3.2 Interpretations and Models

Each row of a truth table corresponds to one possible state of the world: it assigns a truth value to every propositional letter in a formula ϕ. Such an assignment M is called an interpretation, or a world. We write

M |= ϕ

if ϕ has value T in the row of the truth table corresponding to M. Then, this world is called a model for ϕ. For example, consider the world M = {A ↦ F, B ↦ T}, then
M |= ¬(A ∨ ¬B) because in the third row of the above table this formula is true. We
sometimes write M ⊭ ϕ for a world M that is not a model for ϕ, i.e., if ϕ has value F in the row corresponding to M. Consider, for example, the second row in the above table: it says that {A ↦ T, B ↦ F} ⊭ ¬(A ∨ ¬B). We can indeed summarize the truth table for ¬(A ∨ ¬B) as follows: its one and only model is {A ↦ F, B ↦ T}; every other interpretation makes it false.
Notice that each row of the truth table for a formula is independent from the other rows:
to check whether a given world is a model for a formula, we do not need to generate
the whole table — the row corresponding to that world is enough. This is good when
we know exactly what world we are in: even if the formula is very big, this gives us a
fast and easy way to calculate the truth value of the formula.
Before going back to valid inferences, let’s look at implication again. What do we
mean by “A implies B” (or “if A, then B”)? Well, we mean that it should never be
the case that A is true and B is false. But isn’t that what the formula ¬(A ∧ ¬B)
expresses?
Indeed, A → B and ¬(A ∧ ¬B) should be equivalent: for every given values of
A and B (i.e, for every interpretation), they should have the same truth value. That is,
they should have the same truth table. Let’s build the truth table of ¬(A ∧ ¬B):
A B ¬ (A ∧ ¬ B)
T T T T F F T
T F F T T T F
F T T F F F T
F F T F F T F
And indeed! That’s exactly the truth table of A → B. This explains why implication
has this strange truth table. In truth, there are other explanations of why things work
this way.
3.3 Tautologies
One more thing. Some formulas are always true no matter what the truth values of the
propositional letters in them. For example here is a very simple one:
A A ∨ ¬ A
T T T F T
F F T T F
The column for A ∨ ¬A contains only T’s: this formula is true no matter what value A takes. Formulas that are always true are called tautologies. Symmetrically, formulas that are always false, such as A ∧ ¬A, are called contradictions. Most formulas are neither tautologies nor contradictions, the simplest being the propositional letter A by itself.
To check that a formula is a tautology (or a contradiction), we have to build its
entire truth table — all the rows! That can be very time consuming: if the formula
contains n propositional letters, its truth table will have 2ⁿ rows. On the other hand, the truth value of each row can be calculated independently from the other rows, so the work of checking that a formula is a tautology could be done in parallel: we could recruit 2ⁿ people, give each of them one of the interpretations, ask each to compute
the truth value of the formula for his/her interpretation, and finally check whether all
values they came up with are T .
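This “2ⁿ people” idea maps directly onto parallel programming. The sketch below (mine) checks that A ∨ ¬A is a tautology by examining the rows of its truth table in parallel; since each row is independent, no coordination is needed:

import java.util.stream.IntStream;

class TautologyCheck {
    public static void main(String[] args) {
        int n = 1;  // one propositional letter, A
        boolean tautology = IntStream.range(0, 1 << n)
            .parallel()                        // one "person" per row
            .allMatch(row -> {
                boolean a = (row & 1) == 0;    // decode the interpretation
                return a || !a;                // the formula A ∨ ¬A
            });
        System.out.println(tautology);         // prints true
    }
}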
3.4 Equivalences
One particularly useful type of tautology is the one with ↔ as its main connective. When ϕ ↔ ψ is a tautology, ϕ and ψ always have the same truth value: they
are interchangeable. The formulas ϕ and ψ are equivalent in the sense of Section 3.2.
For example, consider the formula A ∨ B ↔ ¬(¬A ∧ ¬B). It has the following
truth table:
A B A ∨ B ↔ ¬ (¬ A ∧ ¬ B)
T T T T T T T F T F F T
T F T T F T T F T F T F
F T F T T T T T F F F T
F F F F F T F T F T T F
Since the column under ↔ contains only T’s, this formula is a tautology: A ∨ B and ¬(¬A ∧ ¬B) are indeed equivalent. Here are a few more important equivalences:

• [Unit] A ∧ ⊤ ↔ A
• [Zero] A ∧ ⊥ ↔ ⊥
• [Commutativity] A ∧ B ↔ B ∧ A
• [Associativity] A ∧ (B ∧ C) ↔ (A ∧ B) ∧ C
• [Distributivity] A ∧ (B ∨ C) ↔ (A ∧ B) ∨ (A ∧ C)
• [Absorption] A ∧ (A ∨ B) ↔ A
• [Idempotence] A ∧ A ↔ A
The first says that ∧ and ⊤ behave a little bit like multiplication and 1: 1 is the identity element of multiplication, so that x × 1 = x for any value of x. The second says that ⊥ is the zero element of ∧, and it reminds us of x × 0 = 0. The next three are self-explanatory, while the last two have no counterparts in arithmetic.
You may want to verify that these formulas are indeed tautologies by writing their
truth table. An even more interesting fact is that if we take the exact same formulas,
but switch ∧ and ∨ on the one hand, and > and ⊥ on the other, we obtain another
set of equivalences. This is an instance of a phenomenon called duality. Of course,
there are many other equivalences. One that is particularly important is that negating a
proposition twice is the same as doing nothing:

¬¬A ↔ A
The first mathematician to notice that the propositional connectives had algebraic
properties similar to (but not quite the same as) arithmetic was George Boole (1815–
1864). Algebras whose operators have these same properties are called Boolean alge-
bras. Other examples include sets and their operations, which was noticed diagram-
matically by John Venn (1834–1923), and digital circuits, whose gates were shown to
behave like connectives by Claude Shannon (1916–2001)¹.
Notice that we have stated all these equivalences based on propositional letters (A,
B and C). They remain equivalences whatever formula we use instead of these letters
(as long as every occurrence of A is replaced by the same propositional formula, and
so on). For example, the commutativity of ∧ is generalized to the equivalence
ϕ∧ψ↔ψ∧ϕ
where ϕ and ψ stand for any formula we wish. This is a general property of tautologies:
we can substitute arbitrary formulas for the propositional letters appearing in them and
they remain tautologies.
As a final note, observe that, to determine that two formulas are equivalent, one has
to build the entire truth table of both. That’s again a lot of work, although it can be
done in parallel.
¹ In his master’s thesis, written at MIT in 1937. There, he demonstrated that electrical circuits built out of gates that behave like the Boolean con-
nectives could represent and carry out all common logical and numerical computation. It has been claimed
to be the most important master’s thesis of all times.
3.5 Valid Inferences

Truth tables finally give us a mechanical way to check that an inference is valid: write the truth table for all the premises and the conclusion, and check that on every row where all premises are true, the conclusion is also true.
Here’s an example: we want to verify that Sidra’s inference is valid, that is that “A
and A → B so B” is valid. Here is the combined truth table, where we have removed
the explicit columns for A and B on the left (they are there anyway under the various
occurrences of A and B) and highlighted the rows where all premises are true.
  A    A → B    B
* T    T T T    T   ✓
  T    T F F    F
  F    F T T    T
  F    F T F    F
There is exactly one row where all premises are true, the first row, and on this row, the
conclusion is also true. Therefore we can conclude that this inference is valid. All the
other rows have some premise that is false, and therefore we don’t need to care about
what the truth value of the conclusion is: all that matters is what happens when all the
premises are true. Observe that the rows of the table still correspond to all possible
interpretations with respect to A and B: as we are building the above table, we are
examining all possible situations. In particular, the interpretation of each propositional
letter remains constant on each row.
Logicians have a special notation for denoting inferences that are valid in the sense
we just examined. Rather than writing “ϕ₁ and . . . and ϕₙ so ψ” is a valid inference, they write

ϕ₁, . . . , ϕₙ |= ψ
Here, we have that A, A → B |= B.²
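This validity test is easy to program. Here is a sketch (mine) that checks A, A → B |= B by scanning all four interpretations and looking for one where the premises hold but the conclusion fails:

class ValidityCheck {
    static boolean implies(boolean p, boolean q) { return !p || q; }

    public static void main(String[] args) {
        boolean valid = true;
        boolean[] vals = { true, false };
        for (boolean a : vals)
            for (boolean b : vals) {
                boolean premises = a && implies(a, b);  // A and A → B
                boolean conclusion = b;
                if (premises && !conclusion) {          // a row that refutes validity
                    valid = false;
                    System.out.println("counterexample: A=" + a + ", B=" + b);
                }
            }
        System.out.println(valid ? "valid" : "not valid");  // prints "valid"
    }
}

Swapping in the premises ¬A and A → B and the conclusion ¬B makes the same loop print the counterexample discussed next.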
Now, it is interesting to look at what happens when an inference is not valid. Take
for example “¬A and A → B so ¬B” (isn’t it actually valid?) and let’s go through the
same process:
  ¬ A    A → B    ¬ B
  F T    T T T    F T
  F T    T F F    T F
* T F    F T T    F T   !!!
* T F    F T F    T F   ✓
This time, there are two rows where all premises are true: the third and the fourth are
highlighted. In the fourth, the conclusion is true, which satisfies our criterion. Instead,
the third row features a conclusion that is false. This row is called a counterexample to
² This notation originated in another way of reading our definition of validity: for every interpretation M, if M is a model of every premise (that is, if M |= ϕ₁ and . . . and M |= ϕₙ), then M must also be a model of the conclusion (that is, M |= ψ).
the validity of this inference. It pinpoints a precise interpretation where it fails: for A false and B true, the premises are true but the conclusion is false. By our definition,
this means that the inference as a whole is not valid (even if many people think it is).
Using the notation we just introduced, ¬A, A → B ⊭ ¬B.
Establishing that an inference is valid requires building the truth table of each
premise and of the conclusion. As noted earlier, that can be expensive for formulas
containing a lot of propositional letters.
It is important at this point to look back and reflect on the road we have traveled.
In Chapter 1, when we first encountered Sidra’s inference, “Doha is in Qatar and Sidra
is in Doha, so she is in Qatar”, we intuitively knew it was valid, but we had no way
to convince somebody else that it was. Actually, we said it was valid because it was
universal: anybody in the world with basic mental capabilities would accept it.
By contrast, we have devised a method to mechanically show that the inference “A
and A → B so B” is valid: simply check that for all four interpretations of A and B,
each one that is a model of the premises is also a model of the conclusion. Intuition is
gone. In particular, this mechanical approach is independent of the particular meaning
we assign to the propositional letters A and B: only the structure of the involved logical
formulas (their form) matters. In particular, any English inference that translates to “A
and A → B so B” is automatically valid: that’s because of its form, not its meaning.
Is “If Paul is hungry he stops making sense and Paul is hungry so he stops making
sense” valid? Sure it is: it has the form “A and A → B so B”. Is “If 18 is prime,
then it has exactly two divisors and 18 is prime so 18 has exactly two divisors” valid?
Absolutely, and for the same reason. By expressing inferences in logic and studying
their validity there, we have abstracted from the specific domain they are about (Sidra’s
whereabouts, Paul’s physiology and numbers), from whether specific premises may be
true or false (Sidra was in Qatar when I first used that example, 18 is certainly not
prime, and we don’t know whether Paul is hungry — whoever Paul is), and even from
the language in which those sentences were spoken.
3.6 Exercises
1. Determine whether the following inferences are valid by using the truth table
method:
• B ∨ A and C ∨ ¬A, so B ∧ C
• ¬A and B, so B ∨ ¬A
• A ∧ ¬A and C, so ¬C
• ¬A → B, so A ∨ ¬B
• A ∧ B → C and A → B, so A → C
2. Using the truth table method, determine which of the following formulas are tautologies:

• A → B → A ∧ B
• (A → B) ∨ (B → A)
• A ∧ (B ∨ C) ↔ (A ∨ B) ∧ (A ∨ C)
• ¬(A ↔ ¬A)
• ((A → B) → A) → A
3. Four machines, A, B, C and D, are connected to a computer network. It is
feared that a computer virus may have infected the network. Your security team
runs its diagnostic tools and informs you of the following state of affairs:
(a) If D is infected, then so is C.
(b) If C is infected, then A is infected too.
(c) If D is clean, then B is clean but C is infected.
(d) If A is infected, then either B is infected or C is clean.
You trust your security team and have confidence that all these statements are
true. What can you conclude about the individual machines A, B, C and D?
Use the truth table method to figure out their possible state of infection.
4. The CEO of a software company makes the following declaration during the
shareholders meeting:
If security is a problem, then regulation will increase. If security is
not a problem, then business on the Web will grow. Therefore, if reg-
ulation is not increased, then business on the Web will grow.
At the same meeting, the CFO of the company declares the following:
The sad reality was that if the government did not give us a tax break,
we would not meet our 3rd quarter target. Fortunately however, the
government just gave us our tax break. So it is my pleasure to an-
nounce that we will meet our 3rd quarter target!
Write the component statements of each declaration as propositional formulas
and express the overall declarations as inferences. Is each inference valid? If so,
use the truth table method to show it. If not, show a model that violates it.
5. Disjunction as we defined it is inclusive: A ∨ B is true also when both A and
B are true. The connective ⊕, pronounced “exclusive or” or “xor”, disallows this
possibility: A ⊕ B is true when either A or B is true, but not both.
• Write the truth table for ⊕.
• Find one (or more) formulas that are equivalent to A ⊕ B, i.e., that have
the same truth table.
6. In this exercise, we will be interested in what happens when we take a formula
ϕ which contains a propositional letter A and replace A with another formula ψ
everywhere in ϕ. To avoid long-winded sentences, we will write [ψ/A]ϕ for the
resulting formula.
7. A Boolean function takes truth values as inputs and produces a truth value as its output. An example is the Boolean function G of inputs A and B described by the following table:

A B | G
T T | F
T F | T
F T | T
F F | F
Now, we can express any Boolean function by means of a Boolean formula that
uses just ∧, ∨ and ¬.
To start with, for each row j, build the conjunction Rj as follows:
• if input Ai has value T , add the conjunct Ai to Rj
• if input Ai has value F , add the conjunct ¬Ai to Rj
Then, combine the formulas Rj you just obtained for each row j into the dis-
junctive Boolean formula Pos built as follows:
• if the output of G has value T on row j, add the disjunct Rj to Pos
• (if no output of G has value T , Pos is just ⊥)
For instance, in the example above, we get

Pos G = (A ∧ ¬B) ∨ (¬A ∧ B)
Another way to express the Boolean function G is to form the conjunctive Boolean
formula Neg from the row formulas Rj as follows:
• if the output of G has value F on row j, add the conjunct ¬Rj to Neg
• (if no output of G has value F, Neg is just ⊤)
For instance, in the above example, we get

Neg G = ¬(A ∧ B) ∧ ¬(¬A ∧ ¬B)
(a) Check that this works by verifying that the formula Pos G we just obtained
has the same truth table as G above. Then, explain in your own words why
the construction for Pos works.
(b) Check that this works by verifying that the formula Neg G too has the same
truth table as G above. Then, explain in your own words why the construc-
tion for Neg works.
(c) Check that also Pos G ∧ Neg G has the same truth table as G and explain
why. Notice that this allows us to omit the special cases where Pos is ⊥ and where Neg is ⊤.
(d) Notice that the construction of Neg only uses ¬ and ∧ (and possibly ⊤). By
relying on the de Morgan equivalences, modify the construction for Pos so
that it only uses ¬ and ∨.
(e) The above discussion implies that every Boolean function can be expressed
by a Boolean formula that only uses ¬ and ∧, or ¬ and ∨. Show that it is possible to do so also with only ¬ and →.
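The Pos construction in this exercise is completely mechanical, which a short program makes apparent. This sketch (mine; it assembles the formula as a string) builds Pos for the function G given above:

class PosConstruction {
    public static void main(String[] args) {
        String[] letters = { "A", "B" };
        // the truth table of G, rows in the order TT, TF, FT, FF
        boolean[][] inputs = { {true,true}, {true,false}, {false,true}, {false,false} };
        boolean[] output = { false, true, true, false };

        StringBuilder pos = new StringBuilder();
        for (int j = 0; j < inputs.length; j++) {
            if (!output[j]) continue;              // keep only rows where G is T
            StringBuilder r = new StringBuilder("(");
            for (int i = 0; i < letters.length; i++) {
                if (i > 0) r.append(" ∧ ");
                r.append(inputs[j][i] ? letters[i] : "¬" + letters[i]);
            }
            r.append(")");
            if (pos.length() > 0) pos.append(" ∨ ");
            pos.append(r);
        }
        System.out.println(pos.length() > 0 ? pos.toString() : "⊥");
        // prints (A ∧ ¬B) ∨ (¬A ∧ B)
    }
}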
8. We saw in Section 3.4 that the de Morgan law, A ∨ B ↔ ¬(¬A ∧ ¬B), is a
tautological equivalence that allows us to define disjunction in terms of negation
and conjunction. This means that any formula that contains disjunctions can
be rewritten into an equivalent formula without it. Said differently, disjunction
is not necessary. In this exercise, we will discover what other connectives are
definable in this way. We will also look for the smallest set of connectives that
can define all others.
(a) Consider the standard set of connectives: {¬, ∧, ∨, →, ↔}. Find equiva-
lences that define
• ↔ in terms of {¬, ∧, ∨, →},
• → in terms of {¬, ∧, ∨},
• ∧ in terms of {¬, ∨}.
For each of these equivalences, show that they are tautologies by displaying
their truth tables.
(b) By the previous exercise, {¬, ∨} seems to be a good candidate as a “small-
est set of connectives” for {¬, ∧, ∨, →, ↔}. Make use of this property to
rewrite the following formulas into equivalent formulas that contain just ¬
and ∨. Do so in steps and justify each step by writing which equivalence
you are using.
• A ∨ B ↔ ¬(¬A ∧ ¬B)
• A→B→A∧B
• (A → C) ∧ (B → C) → (A ∨ B → C)
• (A → (B → C)) → (A → B) → (A → C)
• ((A → B) ↔ (A → ¬B)) → ¬A
We used the standard precedence conventions to limit parentheses prolifer-
ation.
(c) Given a set of connectives C, a subset C′ of C is minimal if every connective in C can be defined in terms of just the connectives in C′, but this is not true anymore if we remove any element of C′.
We have just seen that any formula that uses {¬, ∧, ∨, →, ↔} is equivalent
to a formula over just {¬, ∨}. Now, show that {¬, ∨} is minimal, i.e., that
there are formulas that we cannot express using ¬ alone or ∨ alone.
(d) The set {¬, ∨} is not the only minimal set of connectives with respect to
{¬, ∧, ∨, →, ↔}. Give at least two other 2-element sets of connectives that
are minimal with respect to {¬, ∧, ∨, →, ↔}.
(e) Let us define a new connective: A ↑ B = ¬(A ∧ B). This new connective is called nand, and A ↑ B is read "A nand B".
• Write the truth table of A ↑ B.
• Show that nand is sufficient to define all connectives in {¬, ∧, ∨, →,
↔}. Therefore, {↑} is a minimal set of connectives with respect to
{↑, ¬, ∧, ∨, →, ↔}.
• Define another connective with the same property.
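For the various parts of this exercise, a small program can double-check your truth tables. Here is a minimal Python sketch of a truth table printer (the helper is ours, not part of the exercise):

    from itertools import product

    def truth_table(f, letters):
        # Print the truth table of the Boolean function f over the given letters.
        print(" ".join(letters) + " | f")
        for values in product([True, False], repeat=len(letters)):
            row = " ".join("T" if v else "F" for v in values)
            print(row + " | " + ("T" if f(*values) else "F"))

    # Example: the de Morgan definition of ∧ from {¬, ∨}, as a biconditional;
    # the last column is all T exactly when the equivalence is a tautology.
    truth_table(lambda a, b: (a and b) == (not ((not a) or (not b))), ["A", "B"])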
9. When using truth tables to describe propositional logic, atomic symbols and complex propositions can have one of two values: T or F . This makes this a 2-valued logic. In this exercise, we explore a 3-valued logic, where a formula can additionally take the value U — think of it as "unknown". The connectives are then given by the following 3-valued truth table:

    A  B     ¬A    A ∧ B    A ∨ B
    T  T     F       T        T
    T  U     F       U        T
    T  F     F       F        T
    U  T     U       U        T
    U  U     U       U        U
    U  F     U       F        U
    F  T     T       F        T
    F  U     T       F        U
    F  F     T       F        F
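As an aside, this table matches what is known as Kleene's (strong) 3-valued logic, and there is a compact way to compute it: order the values F < U < T; then ∧ takes the smaller value, ∨ takes the larger, and ¬ flips the order. A minimal Python sketch of this reading (the encoding is ours):

    # Encode F < U < T as 0 < 1 < 2 (the encoding is ours).
    F, U, T = 0, 1, 2

    def neg3(a):
        return 2 - a            # flips the order: ¬T = F, ¬U = U, ¬F = T

    def and3(a, b):
        return min(a, b)        # A ∧ B: the smaller of the two values

    def or3(a, b):
        return max(a, b)        # A ∨ B: the larger of the two values

    # Spot-check a few rows of the table above:
    print(and3(T, U) == U, or3(F, U) == U, neg3(U) == U)   # True True True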
(a) If all we care about are interpretations where the propositional symbols
can only have value T or F (U is not allowed), can we use this 3-valued
truth table instead of our original 2-valued truth table? Explain why. If
the answer is “yes”, the 3-valued logic is said to be conservative over the
2-valued logic.
(b) Viewing U as "unknown", explain why it is reasonable to define ¬U = U and ⊤ ∧ U = U and also ⊥ ∨ U = U .
(c) What is the 3-valued truth table of implication if we still want A → B to
be equivalent to ¬(A ∧ ¬B)? By the same token, what does the 3-valued
truth table for A ↔ B look like?
(d) Suppose the statement "Flight 237 is on time" is true, the statement "Runway conditions are icy" is false, and the truth value of the statement "Flight 51 is on time" is unknown. Find the truth value of the following statements:
i. Runway conditions are not icy and flight 51 is on time.
ii. Flight 51 is on time and flight 237 is not.
iii. Flight 51 is not on time or runway conditions are not icy.
iv. If the runway conditions are icy then flight 51 is not on time.
v. Flight 237 is on time if and only if the runway conditions are not icy and flight 51 is not on time.
10. Using the truth table method, show that the following inferences are valid.
(a) A ∧ B, A → (B → C) |= A ∧ C
(b) A → (A → B) |= A → B
(c) ¬A ∨ B, B → C |= A → C
(d) |= (A ∨ A → A) ∧ (A → A ∨ A)
(e) A → B, ¬(B ∨ A) |= B → C
11. In Python, Java and other programming languages, Boolean expressions are used
in the condition of if statements and to control loops. They are kind of like
propositional formulas in the sense that they evaluate to true or false, but in
reality they have a richer set of possible behaviors: they can raise exceptions and
their evaluation may not terminate. In this exercise, we will look at this latter
property, called divergence (but we ignore exceptions).
We will study a form of propositional logic where atomic formulas can take the
values T (true), F (false) and % (loop forever) — you can think of this as another
type of 3-valued logic. Composite formulas are built using standard connectives
(typically ∧, ∨ and ¬), and their evaluation proceeds left-to-right. This gives rise
to the following truth table:
    A  B     ¬A    A ∧ B    A ∨ B
    T  T     F       T        T
    T  F     F       F        T
    T  %     F       %        T
    F  T     T       F        T
    F  F     T       F        F
    F  %     T       F        %
    %  T     %       %        %
    %  F     %       %        %
    %  %     %       %        %
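The rows involving % mirror the left-to-right, short-circuit evaluation of these languages. A quick Python illustration, where diverge is our stand-in for a condition that loops forever:

    def diverge():
        # Stands for a Boolean condition whose evaluation loops forever.
        while True:
            pass

    print(False and diverge())   # prints False: the "F %" row of A ∧ B
    print(True or diverge())     # prints True:  the "T %" row of A ∨ B
    # diverge() and False        # would loop forever: the "% F" row of A ∧ B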
(a) Using your favorite programming language, write a Boolean condition con-
taining a variable x such that this condition evaluates to true when x>0, to
false when x=0, and loops forever for x<0. Feel free to define any auxil-
iary code you may need.
(b) Some of the standard propositional equivalences do not hold in this logic.
Give two examples, one with ∧ and the other with ∨ and explain why they
are not valid.
(c) Extend the above truth table with columns for the Boolean expressions
A → B and A ↔ B. Justify your answer on the basis of well-known
equivalences.
(d) Some programming languages include two additional Boolean connectives: parallel-and (written A ⊓ B) and parallel-or (written A ⊔ B). These operators behave just like ∧ and ∨ but they evaluate A and B at the same time (in parallel). If one of them terminates and that's enough to determine the overall value of the expression, it can kill the execution of the other. Write the truth tables for A ⊓ B and A ⊔ B. Have you seen these truth tables before?
12. In natural language, the statement “if it rains, then the road is wet” is closely
related to the inference “it rains so the road is wet”. This is not a coincidence
(and it helps explain why implication is such a weird connective). Validity and implication are indeed related by the following property: Γ |= ϕ → ψ iff Γ, ϕ |= ψ, for any set of formulas Γ = ϕ1 , . . . , ϕn . In this exercise, we will take inspiration from this result to look at the way validity works.
Chapter 4

Derivations

Truth tables give us a complete method for checking that a propositional inference is valid. Why look any further, then? Because truth tables have some serious drawbacks:
• Their size is exponential in the number of atomic propositions, so they get big very quickly. If we want to make inferences about addition in a 64-bit processor and represent each bit with a propositional letter, the 128 bits of input suddenly yield a table that has 2^128 rows. This is enormous! Even if you are very fast, it would take you more than the age of the universe to write it down! (See the quick computation after this list.)
• We cannot quite use truth tables for more complex logics, like predicate logic
(see Chapter 5).
• Basing inferences on truth and falsehood is simplistic. There are other options
that cannot be expressed using truth tables.
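Just how enormous? A few lines of Python make the first point concrete (the rate of a billion rows per second is a generously invented assumption):

    rows = 2**128                          # rows in the truth table
    seconds = rows / 10**9                 # at a billion rows per second
    years = seconds / (365 * 24 * 3600)
    print(f"{years:.2e} years")            # about 1.08e+22 years
    print(f"{years / 1.38e10:.2e} times the age of the universe")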
Enter . . . derivations.
4.1 Elementary Derivations

4.1.1 Conjunction
Let’s take conjunction. If we know that A ∧ B is true, what inferences can we make?
Well, if A ∧ B is true, for sure A is true. This is the type of elementary inference we
are talking about. Do we know anything else? Of course! If A ∧ B is true, then B
must be true also. Anything else? Not really: every other inference with A ∧ B as a
premise has a conclusion that is at least as complicated as A ∧ B itself. Good, let’s list
these inferences:
• ∧E1 : A ∧ B, so A.
• ∧E2 : A ∧ B, so B.
(∧E1 and ∧E2 are the names we will use to refer to them). Observe the way these infer-
ences work: their premise contains the connective we are analyzing (here conjunction)
and their conclusion tells us something simpler that is true if the premises are true.
Inferences of this form are called elimination rules (that’s why their name has “E” in
it).
Now, there is another type of elementary inference about conjunction we can be
interested in: when is it that we can conclude that A ∧ B is true? Well, for A ∧ B to
be true, it must be the case that both A is true and B is true. This gives rise to another
inference:
• ∧I : A and B, so A ∧ B.
Elementary inferences of this type, where the connective we are examining appears in
the conclusion, are called introduction rules (that explains the “I” in the name).
Logicians like to write rules such as the above slightly differently: rather than using
the word “so”, they write a horizontal line with the premises above the line and the
conclusion below it. Here is how a logician would write the above three rules:
    A ∧ B              A ∧ B              A     B
    ------ ∧E1         ------ ∧E2         -------- ∧I
      A                  B                 A ∧ B
Each of these is an inference rule, or simply rule, which is what logicians call elemen-
tary inferences.
How do we use these rules? Let us show that a more complex inference is valid:
this will be “A ∧ B, so B ∧ A” — conjunction is commutative (that may be obvious,
but there is nothing in the rules that hints at this!). We will start with the premise, and
then use the rules to infer a series of intermediate formulas and eventually produce the
conclusion. We need to be careful to always say which rule we are using on which
formula.
1. A ∧ B assumption
2. A by ∧E1 on (1)
3. B by ∧E2 on (1)
4. B ∧ A by ∧I on (3) and (2)
Success! This series of formulas together with the justification for every step is called
a derivation. We have derived that A ∧ B so B ∧ A is a valid inference, or, said
equivalently, that B ∧ A is derivable from A ∧ B. Logicians have a special notation
for this too: they write the symbol “`” with the premises on the left and the conclusion
on the right, so we have obtained that
    A ∧ B ⊢ B ∧ A

The same derivation can also be displayed as a tree, stacking rule applications on top of one another:

    A ∧ B         A ∧ B
    ------ ∧E2    ------ ∧E1
      B             A
      --------------- ∧I
          B ∧ A
4.1.2 Truth
While we learned a lot of things, conjunction alone gets boring after a while. Let’s
look for other rules.
The next logical entity we will look at is the truth constant, ⊤. When is it true?
Well, always, by definition! This immediately yields the following introduction rule:
    ---- ⊤I
     ⊤

It says that ⊤ is always true. What would the elimination rule(s) be? What can we deduce knowing that something is always true? Nothing interesting, really: knowing that ⊤ is true, we get at most that ⊤ is true, which is nothing new. Indeed, there is no elimination rule for ⊤.
Let us remind ourselves how rules work (phrased a bit differently than before): the
rules for a connective describe all the ways we can unpack the information held in this
connective (the elimination rules) and all the ways we can put together a formula using
that connective (the introduction rules).
4.1.3 Implication
The elimination rule for implication is easy — it is Sidra’s rule: “A and A → B so B”
— and it takes the following form:
    A     A → B
    ----------- →E
         B
(while this rule is very dear to Sidra, it is known to logicians and philosophers worldwide as modus ponens).
What about inferences that introduce implication? Let’s think about the way we
prove things of the form “if . . . then ...” in our math classes. Take the following (silly)
example:

    If x is odd, then x + x is even.
    Proof: assume that x is odd, that is, x = 2y + 1 for some y. Then
    x + x = (2y + 1) + (2y + 1), which is equal to 4y + 2 = 2(2y + 1),
    an even number.
Here, we assumed that the antecedent (“x is odd”) was true, as if it were a temporary
assumption, and we used this assumption to show that the consequent (“x + x is even”)
is true. This suggests the following (strange) rule for introducing an implication:
      A
      .
      .
      .
      B
    ------- →I
    A → B
This rule adds A as an assumption, but we can only use it while deriving B; once
we use →I , we strike it out: it is not an assumption anymore. Let’s see how this works
on an actual example. Let's show that A → B, B → C ⊢ A → C (remember?
this stands for the inference “A → B and B → C so A → C”: i.e., implication is
transitive).
    A    A → B
    ---------- →E
        B          B → C
        ----------------- →E
               C
           ---------- →I
             A → C
Notice that the assumptions of this derivation (the formulas without a horizontal bar
above them) are just A → B and B → C. Indeed A was a temporary assumption until
we used rule →I . Let's try to write the same derivation in the sequential steps style:

    1. A → B     assumption
    2. B → C     assumption
    3. A         temporary assumption of (6)
    4. B         by →E on (3) and (1)
    5. C         by →E on (4) and (2)
    6. A → C     by →I on (5) assuming (3)
4.1.4 Disjunction
The introduction rules for disjunction are easy: if we know that A is true, then A ∨ B
is true whatever B is. This yields the following pair of symmetric rules:
      A                  B
    ------ ∨I1         ------ ∨I2
    A ∨ B              A ∨ B
The elimination rule will again be strange. What can we do if we know A ∨ B?
Again, we will take inspiration from the way we do simple proofs. Consider another
silly example:
Whether x is even or odd, x + x is even.
To prove this fact, you would consider two cases, one where x is even and the other
where x is odd, and for each of them prove that the conclusion is true. Let’s do it.
If x is even, then x = 2y. In this case, x + x = 2y + 2y, which is equal to
4y = 2(2y) and so x + x is even.
If instead x is odd, then x = 2y+1. In this case, x+x = (2y+1)+(2y+1),
but this is equal to 4y + 2 = 2(2y + 1) and so x + x is even.
In this proof, we reasoned by cases, assuming each disjunct in turn and showing that
under that (temporary) assumption we could prove the conclusion. We capture this way
of reasoning in the following inference:
              A      B
              .      .
              .      .
    A ∨ B     C      C
    ------------------ ∨E
            C
Let’s read it out loud: knowing that A ∨ B is true (“x is either even or odd”), and if
we are able to show C (that “x + x is even”) both in the case A were true (“x is even”)
and in the case B were true (“x is odd”), then C must be true.
For practice, let's give a derivation for A ∨ B ⊢ B ∨ A:
               A             B
             ------ ∨I2    ------ ∨I1
    A ∨ B    B ∨ A         B ∨ A
    ----------------------------- ∨E
               B ∨ A
or linearly:
1. A∨B assumption
2. A temporary assumption of (6)
3. B∨A by ∨I2 on (2)
4. B temporary assumption of (6)
5. B∨A by ∨I1 on (4)
6. B∨A by ∨E on (1), (3) assuming (2), and (5) assuming (4)
4.1.5 Falsehood
While we are on the subject of weird rules, let’s look at falsehood: ⊥ is never true. An
immediate consequence of this is that there cannot be any introduction rule for ⊥: it
simply cannot be true!
There is an elimination rule instead, and a strange one: if we were able somehow
to derive ⊥, then we could derive anything we want:
     ⊥
    --- ⊥E
     A
To make sense of this, think of a situation where you have been able to derive ⊥ as
some kind of absurd state, where everything is possible. Have you seen the movie
Alice? Once Alice gets to Wonderland (an absurd place), then everything becomes
possible (cats fly, mice talk, playing cards defend castles, etc) [2].
Interestingly, this absurd state has an expression in many languages. For example,
• French has “la semaine des quatre Jeudis” (the week with 4 Thursdays) and
“quand les poules auront des dents” (when chickens will grow teeth)
• Friulian (a Romance language) has "il trentedoi di Mai" (May 32nd)
When you are a kid (and sometimes an adult), these are the dates where all your wishes
will be realized . . .
If that’s still strange to you (and it should be), check the way truth tables work when
the premises are always false: whatever the conclusion is, the inference is valid.
4.1.6 Negation
Let’s end with a bang! Negation is the weirdest of all connectives and the one that has
caused most head scratching among logicians and philosophers. I will make it simple
for you and take ¬A to stand as an abbreviation for A → ⊥, that is ¬A is true if A
implies the absurd state, i.e., if A cannot be.
Under this definition, it is easy to obtain the rules for ¬ by simply using the rules
for → and ⊥ (check it as an exercise!):
      A
      .
      .
      .
      ⊥                 A    ¬A
    ------ ¬I          -------- ¬E
     ¬A                    B
The introduction rule on the left corresponds to the proof technique based on contra-
diction: to show that ¬A is true, assume temporarily that A is true instead and if you
get a contradiction (⊥), then it must be the case that ¬A is true. The elimination rule
on the right says that if we are ever able to derive something and its negation, then we
are in an absurd state where anything can be true.
Let's see the rules for negation in action and derive that A ∨ B, ¬B ⊢ A:

                      B    ¬B
                     -------- ¬E
    A ∨ B      A         A
    ---------------------- ∨E
               A
Having just shown that A ∨ B, ¬B ⊢ A, we know that "A ∨ B, ¬B so A" is a valid
inference, which means that we should be able to use it in other inferences: after all, it
is just an abbreviation of the above derivation. We can pretend that it is a more complex
type of rule (these are called derived rules) and we can write it as
    A ∨ B    ¬B
    ----------- AE
         A
This is not the end of the story with negation: I told you it was weird. We need one
more rule, a rule that kind of breaks the pattern we have seen so far, where there is a
single mention of a logical connective or constant in each rule:
    ¬¬A
    ---- ¬2
     A
This rule is called double negation elimination. It says that if we know that it is not
true that A is not true, then A must be true, as in "it is not true that 1 + 1 ≠ 2", which
means that 1 + 1 = 2. We will see what role this rule plays in a minute.
4.2 Soundness and Completeness

How do we know that both definitions get us the same notion of valid inference, i.e., that
they are equivalent?
Showing equivalences like these is big business in logic: if you prove it, you get
not one, but two theorems with fancy names.
For succinctness, let’s abbreviate ϕ1 , . . . , ϕn as Γ (the Greek letter Gamma). Show-
ing that if “Γ so ϕ” is valid according to the derivation method, then it is also valid with
the truth table method is called soundness, and we have the following soundness theorem:

    Theorem (Soundness). If Γ ⊢ ϕ, then Γ |= ϕ.

Conversely, showing that validity with the truth table method implies validity with the derivation method is called completeness:

    Theorem (Completeness). If Γ |= ϕ, then Γ ⊢ ϕ.
Together, these two theorems tell us that the two methods for characterizing valid
inferences are equivalent. Good to know!!!
Here is a cool diagram that illustrates the soundness and completeness theorems:
                      "Γ so ϕ" is valid
                     /                 \
            Derivations             Interpretations

                         Soundness
               Γ ⊢ ϕ  ------------->  Γ |= ϕ
                      <-------------
                        Completeness
One thing logicians like doing is trying to get rid of rules and see what happens.
Here, if we remove any rule, then the completeness theorem stops holding. Yet, some
logicians have been really queasy about the double negation rule: it causes a bunch
of strange inferences to be valid. These logicians, who called themselves intuitionists,
decided to get rid of it nonetheless, and that led to one of the most powerful forms of
logic used in computer science: intuitionistic logic (also called constructive logic).
Without double negation elimination (rule ¬2 ), the formula ¬¬A → A is not derivable, that is ⊬ ¬¬A → A, while it is very easy to write a derivation for it if rule ¬2 is available. What other examples are there of valid formulas that are not valid intuitionistically? An important one is the law of excluded middle, ⊢ A ∨ ¬A, which states that
either a formula or its negation must always be true (to which the intuitionists retort
that this is useless unless we have a way to know which one). It is interesting to write a
derivation for A ∨ ¬A using our rule set. As you will see, derivations that involve ¬2
often take a bizarre shape, which is sometimes seen as a kind of contorted reasoning.
       (1)
        A
     -------- ∨I1        (2)
      A ∨ ¬A        ¬(A ∨ ¬A)
     ------------------------ ¬E
                 ⊥
              ------- ¬I on (1)
                ¬A
             -------- ∨I2        (2)
              A ∨ ¬A        ¬(A ∨ ¬A)
             ------------------------- ¬E
                        ⊥
                 --------------- ¬I on (2)
                   ¬¬(A ∨ ¬A)
                 --------------- ¬2
                     A ∨ ¬A
Confused? That may give you a sense of how come the intuitionists did not think
highly of double negation elimination and similar rules.1
4.3 Exercises
1. Using the derivation method, show that the following inferences are valid.
(a) A ∧ B, A → (B → C) ⊢ A ∧ C
(b) A → (A → B) ⊢ A → B
(c) ¬A ∨ B, B → C ⊢ A → C
(d) ⊢ (A ∨ A → A) ∧ (A → A ∨ A)
(e) A → B, ¬(B ∨ A) ⊢ B → C
2. A defense attorney argues as follows in court:

    If my client is guilty, then the knife was in the drawer. Either the knife
was not in the drawer or Jason Pritchard saw the knife. But if Jason
Pritchard saw the knife, then it follows that the cabinet was unlocked
on October 10. Furthermore, if the cabinet was unlocked on October
10, then the knife was in the drawer, the chimney was blocked and
also the hammer was in the barn. But we all know that the hammer
was not in the barn. Therefore, ladies and gentlemen of the jury, my
client is innocent.
Typical lawyer talk! But if you are in the jury, should you be convinced by
this argument? Let’s bring propositional logic to the rescue. Identify the atomic
propositions in this text, express the individual parts as formulas and the overall
argument as an inference. Finally, write a derivation that shows that this infer-
ence is valid. Also, say why it is much more convenient to use derivations than
the truth table method in this case.
1 The reasons are actually deeper than this and go beyond the scope of this book.
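By the way, the truth table method here is itself a small program, one that must grind through 2^6 = 64 interpretations of the six atomic propositions, while the derivation is a short chain of rule applications. Here is a minimal Python sketch of such a validity checker; we test it on Sidra's rule rather than on the lawyer's argument, which is left as the exercise:

    from itertools import product

    def valid(premises, conclusion, letters):
        # Γ |= ϕ: every interpretation that makes all premises true
        # must also make the conclusion true.
        for values in product([True, False], repeat=len(letters)):
            env = dict(zip(letters, values))
            if all(p(env) for p in premises) and not conclusion(env):
                return False        # found a falsifying interpretation
        return True

    # Sanity check on Sidra's rule: A → B, A |= B
    premises = [lambda e: (not e["A"]) or e["B"], lambda e: e["A"]]
    print(valid(premises, lambda e: e["B"], ["A", "B"]))   # True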
• At this point, you have shown that most of the derivation rules are sound.
Do you want to do the rest?
• How would you finish off the proof of the soundness theorem? That is,
knowing that Γ ⊢ ϕ, show that it must be the case that Γ |= ϕ?
7. One property that is particularly useful when studying derivations is the substi-
tution lemma. It says that, for any formulas ϕ and ψ and any set of formulas Γ,
if Γ ⊢ ψ and Γ, ψ ⊢ ϕ, then Γ ⊢ ϕ. In words, if we can derive ϕ assuming ψ, but we have a way to derive ψ on its own, then we can build a derivation of
ϕ directly, without ψ as an assumption. What does this direct derivation of ϕ
look like? [Hint: Drawing pictures is a great way to answer this question.] Do
you want to prove it? One of the previous exercises gives you the tools to do
precisely that!
8. Checking that an inference is valid is easy with the truth table method. With the
derivation method, it does not seem as obvious. We will now examine some of
the vexations involved and mention solutions for them.
• One of the most vexing things about derivations is deciding what rule to ap-
ply when. Here is a simple heuristic that always works: apply elimination
rules to assumptions (and formulas obtained from them by applying elim-
ination rules); apply introduction rules to conclusions (and formulas ob-
tained from them by applying introduction rules); apply introduction rules
also to the leftmost premise of rules →E and ¬E and to the two rightmost
premises of rule ∨E . By doing so, elimination and introduction rules will
always meet at propositional letters. Derivations that follow this heuristic
are called canonical.
• Say we have a hunch that to build a derivation for an inference Γ ⊢ ϕ we need to apply rule →E . To use this rule, however, we need to come up with a formula ψ to put in its premises. A priori, ψ could be any
formula. That’s infinitely many formulas to try! How do we narrow down
the hunt? One really useful result in this case is the subformula property.
The subformula property tells us that the only candidates for ψ we ever need to consider are subformulas of the formulas in Γ and ϕ. The same
applies to all other rules that apparently ask us to pull a formula out of our
sleeve.
• One last annoyance about derivations is that it is never clear when to stop
applying rules and declare that an inference is not derivable. One simple
method is to give up on an attempt if we encounter a formula twice on the
same branch of a derivation as long as no new temporary assumptions were
made in between.
Your turn now. Using these tricks, carry out the following tasks:
• Verify that the example derivations in this chapter are canonical and abide by the subformula property.
Part II

Predicate Logic
Chapter 5
Predicate Logic
Looking back, propositional logic is pretty straightforward once we get a grip on how
implication works, and showing that inferences are valid is a breeze using truth tables
(derivations may seem a bit mysterious at first, but all it takes is some practice). Judging
by the exercises we gave, it is also quite useful. Yes, but it doesn’t take long before we
feel the need to capture more linguistic patterns within logic in order to express more
of the world around us.
Predicate logic builds on propositional logic: it retains all the ideas and the methods
that we saw in the last three chapters, and extends them. It brings out the structure of
atomic formulas and introduces quantifiers to leverage this structure. This yields an
enormous jump in what can be expressed in logic and in the type of reasoning we can
do. However, it also significantly complicates showing that an inference is valid. In
this and the next two chapters, we will extend the material in each of the previous three
chapters with what it takes to accommodate these concepts.
All 15-199 students love logic and Sidra is a 15-199 student (5.1)
so Sidra loves logic.
Is this a valid inference? Sure! However, we cannot use propositional logic to show
that it is. Indeed, we have little choice but to use a dictionary like the following:

    A = "All 15-199 students love logic"
    B = "Sidra is a 15-199 student"
    C = "Sidra loves logic"

which turns our inference into the propositional inference

    A and B so C

which is clearly not valid since A and B can both be true while C is false. Yet, the
above informal inference is valid: everybody with basic mental capabilities anywhere
in the world will agree that it is valid.
To understand what is going on, let’s make some experiments and vary this infer-
ence by making various statements in it false. Consider the following:
All 15-199 students love logic and Johnny Depp is a 15-199 student (5.2)
so Johnny Depp loves logic.
We all would love to see Johnny Depp — maybe in his pirate garb — sitting in class
with us, but he is not, so the second premise is false. Yet, the overall inference is valid,
just as a similar propositional experiment was in Chapter 1. Let’s make yet another
little change:
All 15-199 students love logic and Johnny Depp is a 15-199 student (5.3)
so Sidra loves logic.
This time, the inference cannot be valid because the conclusion has nothing to do with
the premises: back to Aristotle’s definition, it is not necessary knowledge. The fact that
the second premise is false does not influence this.
The good news is that our definition of valid inference is safe. We just saw that
inferences such as the above behave exactly like propositional inferences: intuitively,
they are valid precisely when there is no way the conclusion can be false if the premises
are true.
The bad news is that propositional logic is not expressive enough to capture what
makes inferences (5.1) and (5.2) valid, and (5.3) invalid. We need to come up with an
understanding of these new linguistic patterns and with a logic to reason about them
correctly. That will be predicate logic. Specifically, we will look at a fragment known
as first-order logic.
5.2 The Structure of Atomic Propositions

To validate a propositional inference such as "If Sidra is in Doha then she is in Qatar and Sidra is in Doha so she is in Qatar", all that was needed
was to find the basic propositions, here “Sidra is in Doha” and “She is in Qatar”, and
to interpret the words around them as propositional connectives, obtaining “A → B
and A so B”. Validity came from the way common pieces, here A and B, were used
in different parts of the inference.
When looking at inferences such as (5.1), we need to look inside atomic propo-
sitions, at their structure. The reason this inference is valid has to do with the fact
that the word “Sidra” occurs in both “Sidra is a 15-199 student” and “Sidra loves
logic”, that the phrase “loves logic” appears in both “All 15-199 students love logic”
and “Sidra loves logic”, and so on. To explore this idea, we need to build a little bit
of infrastructure which will set the stage for developing a logic that accounts for this
deeper structure of atomic propositions — predicate logic.
Look at the sentence "Sidra is a 15-199 student". To express it in logic, we need to:
• be able to name the people, objects, entities, we want to talk about (here “Sidra”,
earlier “Johnny Depp”);
• be able to describe them (here “is a 15-199 student” and earlier “loves logic”).
Let’s look into these two ingredients in turn, and then look at how they fit together.
5.2.1 Names
Names are labels for the entities (people, things, places, etc) that we refer to in sen-
tences. Here are a few examples:
• “Sidra” is the name of the person Sidra. Note that “Sidra”, a string that we write
on a page, or even the sound that we make when calling her, is not the same as
Sidra the person. They are rather ways to refer to her.
• “Qatar” is a the name of a country. Again, it is not the country but just an agreed
upon way to refer to it. Indeed in ancient history, it was known by other names.
• “Logicomix” is the name of a book. Again, it is not the book, nor any specific
copy we can hold in our hands.
• “Five” is the name of the number 5. Notice that “5” too is a name for this
number in the European numeral system. Once more, neither "five" nor "5" is the number 5 (which is an abstract idea); both are representations of it.
From now on, we will write names in italics, reserving quotes for English phrases, or
pieces of phrases, that we are analyzing.
5.2.2 Predicates
A predicate is a way to describe something. It is a proposition with one or more pieces
missing, pieces that we will denote as “. . . ” for the time being. Here are some examples
on the left-hand side of this table, with a symbolic form on the right:

    • ". . . is a 15-199 student"            ;  student(. . .)
    • ". . . loves logic"                    ;  logicLover (. . .)
    • ". . . is in . . . "                   ;  isIn(. . . , . . .)
    • ". . . is a multiple of . . . "        ;  multiple(. . . , . . .)
    • ". . . is between . . . and . . . "    ;  between(. . . , . . . , . . .)
    • "it is raining"                        ;  raining

The first two have just one missing piece: they are unary predicates. The next two have two missing pieces, which makes them binary predicates. The fifth is an example of a ternary predicate. In general, if a predicate has n missing pieces, it is called an n-ary predicate. Some predicates have no missing pieces at all, for instance the last example above, "it is raining": it is a nullary predicate.
Just like we abbreviated atomic statements with propositional letters in Chapter 2,
it is customary to abbreviate predicates by means of a short string such as student or
logicLover , followed by a list of the missing pieces in parentheses. Here, student and
logicLover are called predicate symbols. And just as with which propositional letter
to associate to a sentence, it is entirely our choice which predicate symbol we use for
each English predicate. The right-hand side of the above list gives a predicate form for
each of our examples. Notice we associated the nullary predicate “it is raining” with
the symbol raining, which has no list of missing parts behind it.
For n-ary predicates with n ≥ 2, we need to agree on which missing piece stands
for what in the English version of the predicate. This is typically specified based on po-
sition. These conventions are expressed by giving a reading of the symbolic predicate.
We will return to this later when we have a better way to distinguish missing pieces.
For example, we can write

    student(Sidra) ∧ logicLover (Sidra)

to express the composite statement "Sidra is a 15-199 student and she loves logic".
With names and predicates, we have brought out the structure of elementary state-
ments, those minimal true/false sentences that we associated to propositional letters in
Chapter 2. We can then combine them by means of the usual propositional connectives.
So far, we have not done anything we couldn’t do in propositional logic. The additional
structure however opens the door to writing formulas that take advantage of the fact that
the same names appear in different elementary statements within a formula.
5.3 Quantifiers
We often have structured sentences that are true or false without referring to any spe-
cific name. Consider the first premise of inference (5.1):

    All 15-199 students love logic.

The part "15-199 students" is just the predicate ". . . is a 15-199 student" and similarly
“loves logic” is the predicate “. . . love logic”. What binds them together is the word
“all” which kind of acts like a name, but a common name between the two predicates.
There are a lot of other words and phrases like “all” that play the same role. Here are
a few:

    "all", "every", "each", "some", "no", "exactly two", "at most seven", "most", "a few", . . .

(We are omitting one ingredient of predicate logic here, to keep things really simple. We will add it back in Chapter 8.)
In a sense, these words and phrases describe the quantity of people having the various
properties. They are called quantifiers.
Although there are lots of quantifiers out there, predicate logic deals with just two
of them, two very important ones:
• the universal quantifier, which appears in common language as the phrases “for
all”, “all”, “each”, “every”, “any” and many more. Logicians like to write it
as an upside-down “A”, specifically ∀.
• the existential quantifier, which you encounter as the phrases “there exists”,
“some”, “there are”, and several others. Logicians write it ∃, an inside-out “E”.
That takes care of two of our examples. The quantifier “no” is easily expressible using
the universal and/or the existential quantifier since the sentence “No 15-199 student
loves logic” has the same meaning as “All 15-199 students don’t love logic” and of
“There doesn’t exist a 15-199 student who loves logic”. Quantifiers like “exactly two”
and “at most seven” are a little bit more tricky because they force us to know when two
things (or people) are equal — we will get back to this in Chapter 9. Other quantifiers,
such as “most” and “a few”, are outside of predicate logic: they are linguistic patterns
that predicate logic is unable to reason about.
To get started using quantifiers in formulas, we need to be able to talk about generic
entities rather than specific names. For this, we introduce variables such as x, y and z.
With variables, we can now write predicates that are parametric, like
student(x)
This parametric predicate says that “x is a 15-199 student”, for some unspecified x —
this is the reading of that predicate. By contrast, the predicate student(. . .) focused on
the property of “being a 15-199 student” — a subtle difference in emphasis.
Now, depending on the name we plug in for x, we get elementary propositions that
can be either true or false. For example,
student(Sidra)
is true, while
student(JohnnyDepp)
is false. Notice, however, that student(x) by itself is neither true nor false: it is suspended, in a sense.
Having variables inside predicates, we can use the same variable in different parts
of a formula. For example, we can write

    student(x) → logicLover (x)

which reads "if x is a 15-199 student, then x loves logic". Notice that, again, it is
neither true nor false: it depends on who x is. Said this, we can prefix this parametric
formula with quantifiers to obtain some of the statements we considered at the begin-
ning of this section. For example, using the universal quantifier ∀, we can close this
sentence over x as
∀x. student(x) → logicLover (x)
which reads “for all x, if x is a 15-199 student, then x loves logic”, which is a verbose
way to say “All 15-199 students love logic”, the sentence we started with at the begin-
ning of this section. Notice that, differently from student(x) → logicLover (x), this
time we have a statement that is either true or false.
A simple variant of this statement gives us the reading “No 15-199 student loves
logic” — simply negate the consequent of the implication:
or more literally, “for all x, if x is a 15-199 student, then x does not love logic”.
By using the existential quantifier, ∃, in a similar way, we can write

    ∃x. student(x) ∧ logicLover (x)

that is, "there is x such that x is a 15-199 student and x loves logic", or more simply
“Some 15-199 students love logic”, which is another true/false statement.
These are typical ways we use the logical quantifiers ∀ and ∃ to express English
sentences that have to do with quantities. As you can see, it is not as direct as with
propositional connectives (or better, English and other natural languages have very
succinct ways of expressing these concepts). It will take a little bit of practice to be-
come comfortable with them and the way they interact with other connectives. For
instance, a formula involving ∀ very often also includes →, as in the above examples,
while ∃ is a good friend of ∧.
5.4 The Language of Predicate Logic

Let's now pin down the language we have been using. For the time being, a term is either a name or a variable; we will denote names by a, b and c, and variables by x, y and z, with subscripts and primes if necessary. Each predicate symbol takes a certain num-
ber of arguments — this number is called its arity. An atomic formula, or elementary
predicate, is then just an n-ary predicate symbol p applied to n terms t1 , . . . , tn , so that
it looks like p(t1 , . . . , tn ).
Before we define formulas, let’s agree on how to denote them: we use the Greek
letters ϕ and ψ decorated with the usual primes and subscripts when needed, and some-
times write ϕ(x) to indicate a formula ϕ that may contain the variable x (more on this
below). Then a formula is either atomic, a Boolean combination of formulas or a quan-
tified formula. Specifically,

• an atomic formula p(t1 , . . . , tn ) is a formula;
• if ϕ and ψ are formulas, then so are ¬ϕ, ϕ ∧ ψ, ϕ ∨ ψ, ϕ → ψ and ϕ ↔ ψ, as well as the constants ⊤ and ⊥;
• if ϕ(x) is a formula, then ∀x. ϕ(x) and ∃x. ϕ(x) are also formulas.
These entities constitute a fragment of the language of first-order logic (we will be
introduced to one last ingredient in Chapter 8). First-order logic is a type of predicate
logic — in fact there are other ways to use terms, predicates and quantifiers.
A lot is going on in the above definition. The following schema describes how the
various parts fit together.
                       Connectives      Quantifiers
                             \              /
                              v            v
    Predicate symbols  ---->   Formulas   =  phrases that are
    (applied to terms)                       true or false
                                             (statements)
First-order logic turns two types of phrases from everyday speech into symbols: phrases
that indicate entities we want to talk about are expressed as terms, and phrases that are
true or false (what we called statements) become formulas. Atomic predicates are
where terms enter formulas, as arguments of predicate symbols. They enable us to
express elementary statements about these entities. As in propositional logic, we can
combine formulas using connectives. Additionally, quantifiers tighten up a formula by
fixing how we understand their variables across atomic formulas. Notice that, in first-
order logic, terms occur inside formulas, but there is no way to embed a formula inside
a term.
Although the above definition prescribes that an atomic predicate have the form
p(t1 , . . . , tn ), as in student(Sidra) earlier, logicians are often a lot more relaxed, es-
pecially when writing formulas about mathematics. For example, the predicate for the
phrase “x is less than y” is typically written x < y rather than lessThan(x, y) — the
familiar x < y is actually just <(x, y) where the predicate symbol < is written between
its arguments (or infix as this is called). Similarly, we can write x + 1 ≥ 0 for “the
successor of x is greater than or equal to zero”.2
Consider the formula ∀x. student(y) → logicLover (x). The scope of the quantifier ∀x is the formula student(y) → logicLover (x). The
occurrence of x in logicLover (x) is bound by this quantifier. The occurrence of the
variable y in student(y) is instead free. A sure way to tell that a variable is free in a
formula is if there is no quantifier in the formula that binds this same variable.
A formula that has no free variables is said to be closed, and open otherwise.
The formula in our last example is open because the variable y in it is free, and
so is the formula student(x) → logicLover (x) in the last section. The formula
∀x. student(x) → logicLover (x) is instead closed. Notice that only closed formu-
las correspond to English sentences: we cannot say if an open formula is true or false
because it depends on what the free variables stand for. From now on, we will be
exclusively interested in closed formulas.
That’s our original inference from Chapter 1. How do we express it in first-order logic?
Well, we first need to decide on what the entities we are interested in talking about are
and what we want to say about them. This means fixing names and predicates. Already
in this simple sentence, we have some choices.
2 We will discover a better way to express this statement in Chapter 8.
• One option is to name everybody and everything involved: Sidra for Sidra, and Doha and Qatar for the places, together with a binary predicate isIn(x, y) read "x is in y". With these choices, the sentence becomes:

    isIn(Sidra, Doha) → isIn(Sidra, Qatar )
    and isIn(Sidra, Doha)
    so isIn(Sidra, Qatar )

• Say however that we are interested in the travels of Sidra and nobody else. Then, we do not even need to bother giving her a name since everything is about her. We still need to name the places she goes, here Doha and Qatar, and we need to have a predicate that describes where she is — let's pick sidraIn(y) to mean "Sidra is in place y". The above sentence is then translated into first-order logic as follows:

    sidraIn(Doha) → sidraIn(Qatar )
    and sidraIn(Doha)
    so sidraIn(Qatar )
• On the other hand, what we are interested in may be the coming and going of
people in Doha and in Qatar, specifically. We need to give names to people, here
Sidra for Sidra, but we can use specific relations about the places. We then need
two predicates: inDoha(x) to mean that “person x is in Doha”, and inQatar (x)
to mean that “x is in Qatar”. In this scenario, our sentence if formalized as:
• Finally, we may be interested in just Sidra and only about whether she is in Doha
or in Qatar. We do not need any names in this case, just two predicates with no
arguments: sidraInDoha meaning that “Sidra is in Doha” and sidraInQatar to
mean that “Sidra is in Qatar”. Our sentence becomes:
sidraInDoha → sidraInQatar
    and sidraInDoha
    so sidraInQatar
Which logical representation should we choose, then? It very much depends on the
situation we are trying to describe. If we need to refer to many people and places, then
the last description is too poor. On the other hand, if we care only about Sidra, the first
one, which would include her name in every atomic proposition, is overkill (although
not wrong).
One thing that can force our hand is quantification: say we are asked to interpret
the premise “Doha is in Qatar” as “Everybody in Doha is in Qatar”. This forces us
to quantify over people, thereby excluding our second and fourth options above. In
fact, our choices for turning this premise into a formula would then be limited to either
∀x. isIn(x, Doha) → isIn(x, Qatar ) or ∀x. inDoha(x) → inQatar (x).
As we were expressing the sentence “Doha is in Qatar and Sidra is in Doha, so she is
in Qatar” above, we freely chose names and predicate symbols that reminded us of what
we wanted them to mean. We could have used any symbols however — remember?
we are in the business of symbolic logic. For example, in our first translation, we could
have picked
a = “Sidra”
b = “Doha”
c = “Qatar”
p(x, y) = “x is in y”
which would have given us the following first-order logic inference:
p(a, b) → p(a, c)
and p(a, b)
so p(a, c)
Similarly in our last rendering, we could have chosen D and Q instead of sidraInDoha
and sidraInQatar , right? We would have gotten:
D→Q
and D
so Q
But wait! This is exactly the propositional inference we wrote in Chapter 2!! A nullary
predicate, one with no missing pieces, is just a propositional letter: it can be either true
or false.
5.6 Valid Inferences

Let's go back to our motivating inference:

    All 15-199 students love logic and Sidra is a 15-199 student     (5.4)
so Sidra loves logic.
How do we show it is valid? Just like in the propositional case, we will have two ways:
1. We can show that in all situations where the premises are true, the conclusion
must also be true. This will be akin to the truth table method, but as we will see
not quite as easy.
2. We can give elementary derivations for the universal and existential quantifiers,
just like we did with the propositional connectives.
5.7 Exercises
1. For each of the following expressions built out of the predicates tender (x),
wolf (x), sheep(x), cabbage(x) and eats(x, y), determine whether it is a well-formed first-order formula. If it is not, explain what is wrong with it:
• ∀x. ¬tender (x) → sheep(x) ∧ wolf (x)
• eats(sheep(x), cabbage(x))
• ∃x. ∀x. eats(wolf (x)) ↔ eats(x, sheep(x))
• ∀x. ¬eats(cabbage(x), wolf (x))
• ∀x. ∀y. ¬eats(¬tender (x), y)
2. Express each of the following English sentences as formulas in first-order logic:
• No math class is exciting, except logic.
• Every weekday, the traffic is terrible at all hours of the mornings but only
some hours in the evening.
3. Given the following readings for the predicates C(x), P (x), F (x) and S(x, y),
C(x) = “x is a Corvette”
P (x) = “x is a Porsche”
F (x) = “x is a Ferrari”
S(x, y) = “x is slower than y”
translate the following formulas into good fluent English (we do not normally say things like "for all x such that x is a Corvette, . . . "):
(a) ∀x. ¬(C(x) ∧ F (x))
(b) ∃x. P (x) ∧ ∀y. F (y) → S(x, y)
(c) ∀x. ∀y. P (x) ∧ S(x, y) → C(y)
q = “Qatar”
o = “Oman”
G(x) = “x is a GCC country”
M (x) = “x has mountains”
N (x, y) = “x is north of y”
translate the following formulas into good fluent English that a 6-year-old can understand:
(a) ∀x. G(x) → M (x)
(b) ∀x. ∀y. ∀z. N (x, y) ∧ N (y, z) → N (x, z)
(c) ∃x. ∃y. G(x) ∧ ¬G(y) ∧ N (y, x) ∧ N (x, q)
(d) ∀x. ∃y. G(x) ∧ N (x, y) → M (y)
(e) ¬∃x. N (o, x) ∧ G(x)
(f) ∃x. G(x) ∧ ∀y. M (y) → N (x, y)
6. Given the following readings for the constants s, h and c and for the predicates
L(x), P (x) and T (x, y),
s = “Prof. Sieg”
h = “Hilbert”
c = “Cantor”
L(x) = “x is a logician”
P (x) = “x is a philosopher”
T (x, y) = “x talks about y”
translate the following formulas into simple English that a 6-year-old can understand:
(a) ∀x. T (s, x) → L(x)
(b) ∀x. T (h, x) → ¬T (x, x)
(c) ∀x. ∀y. T (x, y) ∧ T (y, x) ∧ L(x) → P (y)
(d) ∃x. ∀y. T (y, x) → ∀z. T (z, h)
1 = “the number 1”
x<y = “x is smaller than y”
Prime(x) = “x is a prime number”
Div (x, y) = “x is divisible by y”
translate the following formulas into good fluent English that a 10-year-old can understand:
12. Some English sentences are ambiguous: they can be translated into logic in
several ways that mean different things. For example, "Chemists only admire themselves" in exercise 2 can be interpreted as "Chemists only admire other chemists", or as "Each chemist admires just him/herself". Each of the following
sentences has multiple meanings. Express each of these meanings in predicate
logic (the number of meanings is in brackets).
13. The Greek philosopher Aristotle (384-322 BC) studied under Plato and tutored
Alexander the Great. His early investigations of logic influenced philosophers
for over 2,000 years, that is, until formal logic came about in the 1800s. He cata-
loged four “perfect” syllogisms that medieval scholars named Barbara, Celarent,
Darii and Ferio. Translate each of them below into logical inferences of the form
‘premises so conclusion’.
14. When writing code, an assertion is a formula that is true at the point in the
program where it occurs. Good programmers often write assertions as comments
in their code to justify why they wrote a condition or a statement in a certain way,
and to communicate this reasoning to whoever will maintain the code. Consider as an example the following code for a function that sorts an array using bubble sort (not that it is complicated, but it is a good example). The comments have been left empty for you to fill in with appropriate assertions:
void bubblesort(int[] A) { /* */
int n = A.length; /* */
boolean swapped = true; /* */
while (swapped) { /* */
swapped = false; /* */
n--; /* */
for (int i = 0; i < n; i++){ /* */
if (A[i] > A[i+1]) { /* */
int tmp = A[i]; /* */
A[i] = A[i+1]; /* */
A[i+1] = tmp; /* */
swapped = true; /* */
} /* */
} /* */
} /* */
} /* */
Because this exercise is about sorting, you will find it convenient to use the predicate "sorted (A[j..k])", which expresses the fact that the portion of array A starting at index j and ending at index k (both included) is sorted.
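If you have never written assertions before, here is the flavor in executable form: a small Python sketch, where sorted_range is our rendering of the predicate sorted (A[j..k]):

    def sorted_range(A, j, k):
        # True exactly when A[j..k], both ends included, is sorted.
        return all(A[i] <= A[i + 1] for i in range(j, k))

    A = [1, 3, 3, 7, 2]
    assert sorted_range(A, 0, 3)         # holds: [1, 3, 3, 7] is sorted
    assert not sorted_range(A, 0, 4)     # the full array is not sorted yet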
Chapter 6

First-Order Interpretations

Is the formula

    student(Sidra)

true or false?
I don’t know. These are just a bunch of symbols on the page, in the same way as a
propositional letter A is just a symbol — by itself it can be either true or false depending
on what we choose it to stand for. In Chapter 3, we assigned A a truth value by defining
an interpretation for it (remember? an interpretation was just a row in the truth table,
an assignment of truth values to each propositional letter in a formula). We will need
to proceed in the same way here and assign an interpretation to student(Sidra), but
“interpretation” will mean something different.
One idea is to treat elementary formulas such as student(Sidra) just like proposi-
tional letters. Then, we can assign truth values to them and use the truth tables to calcu-
late the truth value of larger formulas that combine them using the Boolean connectives.
But how to compute the truth value of formulas that start with a quantifier? How do
we tell if ∀x. student(x) → logicLover (x) is true or if it is false? Having exposed the
structure of atomic statements, we will need to assign a meaning to all of their parts: the
name Sidra refers to the same entity in both student(Sidra) and logicLover (Sidra),
doesn’t it? Similarly, the predicate symbol student in student(Sidra) and in the above
universal formula should have a consistent meaning. A first-order interpretation makes
this intuition precise.
Let’s start with the intended interpretation of the formula student(Sidra): in our
mind, the name Sidra stands for the person Sidra, and the predicate symbol student
for the unary predicate “. . . is a 15-199 student”. According to this interpretation,
student(Sidra) is certainly true since she is a registered student of 15-199 and shows
up regularly to class.
However this is not the only possible interpretation of these symbols: the name
Sidra could stand for the nearby hospital of the same name and student for the predi-
cate “. . . is a music student”. Then, student(Sidra) would make little sense and there-
fore be false. Less intuitive interpretations are possible too: Sidra could be a code
name for the number 5 and student be a hidden way to refer to the predicate “. . . is
a prime number”, in which case student(Sidra) is true. In Chapter 3, we likened the
rows of a truth table to far away planets, each with its own view of the truth or false-
hood of the propositional letters. Same thing here: in our world, Sidra is the name
of a particular student and student stands for the predicate “. . . is a 15-199 student”.
But these same symbols can indicate something different on other planets (or even in
different parts of our own world). We will need to account for all the meanings that the
inhabitants of the countless places in the universe can give to these symbols.
All this makes intuitive sense. Let’s come up with a general definition of what
constitutes an interpretation for a formula in predicate logic. We need three ingredients:
• a domain of discourse D: the set of entities we are talking about;
• an interpretation of each name as a specific element of D;
• an interpretation of each n-ary predicate symbol, telling us for which n-tuples of elements of D we take the predicate to be true.

For example, we can choose as our domain of discourse a set of people:

    D = { Sidra, Hanan, . . . , Iliano, Johnny Depp, Tom Cruise, . . . }

and interpret each name as one of these people:

    Name            element of D
    ------------    -------------------------
    Sidra           the person Sidra
    Hanan           the person Hanan
    . . .           . . .
    Iliano          the person Iliano
    JohnnyDepp      the actor Johnny Depp
    TomCruise       the actor Tom Cruise

Note that there may be elements of the domain of interpretation that we do not have a name for.
The predicate symbol student is in turn interpreted as the set of elements of D that we consider to be 15-199 students, a set that contains the interpretation of the name Sidra. So student(Sidra) is true in this interpretation, which we call M. Logicians write this as

    M |= student(Sidra)

Doing the same thing for student(JohnnyDepp) tells us that student(x) is false when x is the interpretation of the name JohnnyDepp. Therefore,

    M ⊭ student(JohnnyDepp)
Given an interpretation M, we can always determine the truth value of any ele-
mentary predicate in this way, as long as it does not contain variables.
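When the domain of discourse is finite, an interpretation is just a piece of data, and this evaluation procedure is a few lines of code. A minimal Python sketch with an invented interpretation (all names and sets below are ours):

    # The domain of discourse: a finite set of people.
    D = {"sidra", "hanan", "iliano", "johnny depp"}

    # Each name denotes an element of D.
    names = {"Sidra": "sidra", "JohnnyDepp": "johnny depp"}

    # Each unary predicate symbol denotes the subset of D where it holds.
    student    = {"sidra", "hanan"}
    logicLover = {"sidra", "hanan"}

    print(names["Sidra"] in student)         # M |= student(Sidra): True
    print(names["JohnnyDepp"] in student)    # M |= student(JohnnyDepp): False

    # Peeking ahead to Section 6.3: quantifiers range over all of D.
    # M |= ∀x. student(x) → logicLover(x)?
    print(all((x not in student) or (x in logicLover) for x in D))   # True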
(In particular, this means that each nullary predicate is either always true or always false, just like propositional letters were in Chapter 3.)

(The domain of discourse need not be homogeneous. Consider a binary predicate student ′(x, y) whose intended reading is "x is a student in class y". Then, the domain of discourse of x (or more precisely of the first argument of student ′) is some set of people, while y will range over possible class numbers. So, the domain of discourse of an interpretation can be the union of various sets. In this case we have what is called a multi-sorted interpretation.)
6.4 The Trouble with Infinity

Consider the following formula:

    ∀x. x > 0 → x − 1 ≥ 0
It is true of any of the standard sets of numbers we learn about in school, for example
the natural numbers. So, let’s fix the domain of discourse to be N and use the above
method to show that it is true with respect to that domain.
We would tabulate the truth value of x > 0 → x − 1 ≥ 0 for larger and larger elements of N:

    x        x > 0 → x − 1 ≥ 0
    0                T
    1                T
    2                T
    . . .          . . .

There is a problem: this table is infinite! So, how can we make sure in a finite time that
the column on the right-hand side only contains T ? We cannot use the above method
to determine whether the formula ∀x. x > 0 → x − 1 ≥ 0 is true or false. In general
we cannot show that a universal formula such as the above is true when the domain of
interpretation is infinite.
The same problem emerges when trying to show that an existential formula is
false with respect to an infinite domain of discourse. Take for example the formula
∃x. x > 0 ∧ prime(x^x ), where the reading of prime(x) is "x is a prime number". In N, the table is infinite and the value of x > 0 ∧ prime(x^x ) for progressively larger values of x is F , but we cannot conclude that there is no value of x that makes it true.
(For this particular formula, we could instead reason symbolically, for example by distinguishing the cases x = 0 and x ≥ 1.)
6.5 Counterexamples and Witnesses

Dually, a universal formula can be shown false, and an existential formula true, by exhibiting a single value. Consider ∀x. x > 0 → even(x) over N. The table for the subformula x > 0 → even(x) is infinite, of course, but when we start writing it down, we quickly find a number that makes it false (actually lots of them): x = 1 already does. Such a value is called a counterexample; a value that makes the body of an existential formula true is called a witness, and it similarly settles the existential. The following table summarizes what a single value can establish over an infinite domain (✓: one value suffices; ✗: out of reach by tabulation):

         ∀    ∃
    T    ✗    ✓
    F    ✓    ✗
(This works smoothly because N is enumerable: we can list its elements one after the next, which is what enables us to put them in a table, albeit an infinite one. By contrast, R is not enumerable: there is no "next" element in R after π, so that we cannot make a table of all real numbers. Consequently, finding a counterexample in R or other non-enumerable domains is a matter of trial and error.)
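When the domain is N, "writing the table down" is literally a loop. A hedged Python sketch that hunts for a counterexample to ∀x. x > 0 → even(x) over an initial segment of N (the bound is our own pragmatic cutoff; if nothing turns up, the search is simply inconclusive):

    def counterexample(bound=1000):
        # Scan x = 0, 1, 2, ... looking for a value falsifying x > 0 → even(x).
        for x in range(bound):
            if x > 0 and x % 2 != 0:     # antecedent true, consequent false
                return x
        return None                      # inconclusive below the bound

    print(counterexample())   # prints 1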
6.6 Validating Inferences

As we just saw, this does not work too well in practice when M is infinite. Ignoring this "little detail", we can use
the definition of validity to check whether a given inference is valid in M: it is never
the case that the premises are all true but the conclusion is false.
This seems easy. Consider an interpretation whose domain contains just Sidra, a
well known student and logic lover. Then, the following inference is valid with respect
to this interpretation:

    ∀x. student(x) so ∀x. logicLover (x)
This seems strange: it says that whenever everybody is a student, then everybody loves
logic. What about those students who hate logic? There aren’t any in this interpretation,
but we can certainly think about other interpretations where there are students who do
not like logic.
Our tentative definition above referred to a single interpretation. This is not enough:
it would be like defining the validity of a propositional formula on the basis of a single
row of its truth table. Instead, we know that we need to consider every row, i.e., every
interpretation. The same happens in predicate logic: we need to consider every inter-
pretation to tell whether an inference is valid or not. All interpretations? Even those
nasty infinite ones? Yes!
An inference in predicate logic is valid if every interpretation that makes all its
premises true also makes its conclusion true. Using symbols, an inference ϕ1 and
. . . and ϕn so ψ is valid if for every interpretation M such that M |= ϕ1 and . . . and
M |= ϕn , we have that M |= ψ. As in the propositional case, we then write
ϕ1 , . . . , ϕn |= ψ
As usual, we often use the Greek letter Γ as an abbreviation for the premises ϕ1 , . . . , ϕn ,
obtaining Γ |= ψ. The special case where there are no premises, written |= ψ, means that
the formula ψ is true in every interpretation. This is the predicate logic equivalent of
the propositional notion of tautology.
Take for example the inference

    ∀x. p(x) → q(x) and p(a) so q(a)

where a is some name and p and q are predicate symbols. This is good to know because that's just a generic representation of Sidra's inference, so showing it valid would at last settle Sidra's reasoning. But checking every interpretation is exactly what we cannot do in general.
The above definition is however very useful to show that an inference is not valid:
simply find some interpretation that makes all premises true but where the conclusion
is false. For instance, let's show that

    ⊭ ∀x. (∃y. p(x, y) → ∀z. p(x, z))
Since there are no premises, all it takes is to show that the conclusion is false in some
interpretation. Take the domain of discourse D to be the natural numbers, N, and
consider the predicate p(x, y) to be x < y. It is then easier to make sense of it if we
rewrite it accordingly as ∀x. (∃y. x < y → ∀z. x < z). Now, this is clearly invalid: for
x = 3, we get ∃y. 3 < y → ∀z. 3 < z and, for any value of y that makes the antecedent
true, for example y = 4, we can find a value of z that makes the consequent false, e.g.,
z = 2. The formula is therefore false in this interpretation, which shows that the above inference is not valid.
6.8 Exercises
1. Construct a first-order interpretation whose domain of discourse contains the
members of your immediate family and defines two predicates for them, male(x)
and female(x). Using this interpretation, check that the formula ∀x. male(x) ∨
female(x) is true (in most families it is). Then, extend the domain of discourse
so that this same formula is false in this updated interpretation.
8. Find counterexamples to the following formulas, where the variables range over
N:
(a) ∀x. prime(x) → odd (x)
(b) ∀x. x > 1 → 2^x > 2x
(c) ∀x. prime(x) → prime(2^x − 1)
(d) ∀x. ¬prime(x) ∧ x > 1 → prime(2^x − 3)
(e) ∀x. ∀y. odd (x + y) → ∀z. prime(z) → odd (xz) ∨ odd (yz)
9. Do the same exercise, but this time with the variables ranging over the given
domain.
Chapter 7

First-Order Derivations

As we saw in the last chapter, the definition of valid inference is useful in specific
domains (that’s where model checking rules) or for showing non-validity. However, it
is not a practical approach in general because of the necessity to consider all possible
interpretations, especially those that are infinite.
The other way we can show that an inference is valid is by forgetting about inter-
pretations and instead building a derivation for it. Following the recipe we used in the
propositional case, all that appears to be needed is to give the elementary inference
rules that govern the universal and existential quantifiers, and then combine them with
the rules for the propositional connectives from Chapter 4. Et voilà! We will do pre-
cisely this, and it will be very easy. However, it will not completely solve the problem:
although the infrastructure will be much lighter than the first-order interpretations of
the last chapter, finding a derivation for a valid inference sometimes requires substantial
ingenuity, to the point that no automatic tool can do this for us, in general.
7.1 Universal Quantifier
Recall that this is a generic version of Sidra’s inference: we have indeed just shown
that
∀x. student(x) → logicLover (x)
and student(Sidra)
so logicLover (Sidra)
is valid. This had eluded us so far.
When can we conclude ∀x. ϕ(x)? To make sense of this, let’s look at how we write
proofs of the form "For all x, . . . ". Take as an example the simple property that, for every number x, if x is odd then x + x is even. We would prove it as follows:
Let n be any number and assume that n is odd. Then there exists m such
that n = 2m + 1. Under this assumption, n + n = (2m + 1) + (2m + 1),
which is equal to 4m + 2 = 2(2m + 1) which is an even number.
Here, we have replaced the variable x with a new symbol that we called n. We used n
to denote an arbitrary number (it is not a nickname for some number we already know
about). We used it as a temporary name in the proof of the property. This suggests one
of those strange rules where we assume stuff specifically for the purpose of deriving a
formula, but cannot use it for other purposes:
      (a)
       .
       .
       .
      ϕ(a)
    ---------- ∀I
    ∀x. ϕ(x)
This rule looks a little bit like, say, the implication introduction rule from Chapter 4, but
there is a big difference: what we are assuming is not a formula (a temporary premise)
but a symbol (a temporary name). Again, we are not allowed to use a outside of the
subderivation of ϕ(a).
There is a lot going on here, so let’s pick the various pieces apart. We first assumed
the antecedent of the implication (essentially applying rule →I ). Then, we created a
new name, i, for the value x that we assumed existed. We then used i to define another
value j and used it as a witness for proving the consequent of the implication (using
rule ∃I in fact). Rephrasing this explanation, assuming that the existential formula in
the antecedent was true, we defined a temporary name i for the witness and used it to
prove the consequent. The rule that emerges has the following form:
                   (a)
                  ϕ(a)
                    .
                    .
                    .
    ∃x. ϕ(x)        ψ
    ------------------ ∃E
            ψ
In words, knowing that ∃x. ϕ(x) is true, we give a name a to this x that is supposed
to exist and assume that ϕ(a) holds. If from this we can prove some other formula ψ,
then ψ can be proved just on the basis of ∃x. ϕ(x). The name a is temporary and can
be used only for the purpose of showing that ψ holds. It cannot appear in ψ itself. The
assumption ϕ(a) is also temporary.
It is interesting to look at the shape of the derivation that proves the simple property
we used as motivation. Folding the arithmetic into some vertical dots, this is what it
looks like:
                        (i)
                      i² = −1
                         ⋮   (j = 2i)
                     j² + 4 = 0
                  ───────────────── ∃I
∃x. x² = −1       ∃y. y² + 4 = 0
──────────────────────────────── ∃E
         ∃y. y² + 4 = 0
──────────────────────────────── →I
 ∃x. x² = −1 → ∃y. y² + 4 = 0
[Diagram: a valid inference "Γ so ϕ" can be established in two ways: derivations yield Γ ⊢ ϕ and interpretations yield Γ ⊨ ϕ; soundness takes Γ ⊢ ϕ to Γ ⊨ ϕ, and completeness takes Γ ⊨ ϕ back to Γ ⊢ ϕ.]
In words, the soundness theorem says that every inference that is shown valid using
derivations is always valid when using the truth-based method.
The completeness theorem says that it also works the other way around: truth-based
validity implies derivability.
The completeness theorem was first proved by the logician Kurt Gödel (1906–1978).
This was quite an achievement at the time.
• To show that this inference is valid using the derivation method, i.e., that Γ ⊢ ϕ,
all we need to do is to come up with a derivation of ϕ from Γ. A single derivation
is enough — that’s promising!
However, the derivation method is of little use to show that an inference is not
valid. In fact, based on what we know, it is not at all obvious how to convince
oneself that no derivation exists for a given inference.
• We know from Chapter 6 that we can use the truth-based method to show that
Γ ⊭ ϕ, i.e., that our inference is not valid. We do so by producing an interpretation
where each of the premises ϕ1, . . . , ϕn in Γ is true, but the conclusion ϕ is
false. A single interpretation will do — that’s promising too!
On the other hand, Chapter 6 gives us little hope that we can use this method in
practice to show that an inference is valid: we would have to consider all possible
interpretations, and there are infinitely many of them. Definitely not practical.
But maybe we can use them together to establish validity. . . Here’s an idea:
• Try to find a derivation showing that Γ ⊢ ϕ.
• At the same time, try finding an interpretation M that shows that Γ ⊭ ϕ.
The first that succeeds determines the outcome: if we find a derivation for Γ ⊢ ϕ then
the inference is valid. If we find an interpretation M such that M ⊨ ϕi for all ϕi in Γ
but M ⊭ ϕ, we know it is not valid. This all hinges on soundness and completeness
ensuring that Γ ⊢ ϕ iff Γ ⊨ ϕ.
Should we declare victory?
Unfortunately, life is not this simple. The problem is with the second part of our
idea. How do we find the interpretation M, among all the infinitely many interpreta-
tions that are out there? We were able to do so for a small example using our ingenuity,
but how to do this in general? There is no mechanical method to do so.
In fact, while it is possible in the propositional case to write a program that always
tells us whether an inference is valid or not, no such program can be written for infer-
ences in predicate logic. It is not a matter of finding a sufficiently smart programmer:
no such program can possibly exist!
The failure of both the truth- and derivation-based methods to always tell whether
a generic inference is valid actually touches on a deep property of predicate logic. No
method will ever be able to determine whether an arbitrary inference is valid or not.
Properties of this type are called undecidability results.
7.6 Exercises
1. Show that the following inferences are valid by giving a derivation:
(a) ⊢ ∀x. (∀y. p(x, y)) → p(x, x)
(b) ∀x. p(x, x), ∀y. ∀z. p(z, y) → q(y, z) ⊢ ∀w. q(w, w)
(c) ∃x. p(x) ∧ q(x) ⊢ ∃x. q(x)
(d) ∃x. ∀y. p(y) ⊢ ∀y. p(y)
(e) ∀x. ∃y. p(x) → q(y), ∀y. ∃z. q(y) → r(z) ⊢ ∀x. ∃z. p(x) → r(z)
2. Using derivations, show that the following inferences are valid.
(a) ∃x. ϕ(x), ∀y. ϕ(y) → ψ(y) ⊢ ∃z. ψ(z)
(b) ∃x. ϕ(x) → ψ(x) ⊢ ∀x. ϕ(x) → ∃x. ψ(x)
(c) ¬∀x. ϕ(x) ⊢ ∃x. ¬ϕ(x)
(d) ⊢ ∀x. ∀y. ϕ(x, y) → ∀y. ∀x. ϕ(x, y)
(e) ⊢ ∃x. ∃y. ϕ(x, y) → ∃y. ∃x. ϕ(x, y)
where ϕ and ψ are generic formulas that may use the variables shown in parentheses.
3. Unlike with the truth-based method, we can build a derivation for an inference
even if it mentions (unbound) variables. Check it out for yourself by giving
a derivation for p(x), p(x) → q(x) ⊢ q(x). Knowing this, show that whenever
there is a derivation for a generic inference
there is a derivation for a generic inference
where each formula may contain variable x (and possibly others), then there is a
derivation of the inference
4. Using any of the methods seen in this and the last chapter, determine whether the
following inference is valid: p(c), ∃x. p(x) → q(x) ⊢ q(c).
5. Here’s a sure-fire idea for showing that an inference Γ ⊢ ϕ is valid using just the
derivation method:
Try to build a derivation both for Γ ⊢ ϕ and for Γ ⊢ ¬ϕ. If the former
succeeds, the inference is valid. If the latter succeeds, it is not valid
because then Γ ⊬ ϕ.
What do you think of this idea? Explain why it works, or exhibit an inference for
which it fails.
Chapter 8
Function Symbols
Predicate logic, as we defined it in the last three chapters, is very expressive. Can we
extend it to capture even more linguistic patterns from everyday speech? You bet! And
logicians all over the world are busy doing precisely that. While there are plenty of
extensions worth looking into, we will focus on just one, one that happens to be part of
the standard definition of first-order logic. This will be enough of a workout. After all,
last time we extended logic (going from propositional to first-order) we made things
pretty complicated, didn’t we?
We will augment the terms of first-order logic with function symbols. Up to now,
the only way we had to refer to an object was by giving it a name. This is fine, but there
are plenty of things we don’t give names to: we indicate them relative to other things.
Consider for example the phrase “the front cover of the book”: if I give you a book,
you know exactly what I mean by its front cover, even though that part of this specific
book does not have an official name. Function symbols allow us to express indirect
references such as “the front cover of”.
8.1 Indirect References

Consider the sentence "everybody whose sister loves logic also loves logic".
Can we use a function symbol sister (x) to denote a sister of x? If we could, then the
above sentence would be translated into the formula
∀x. LogicLover (sister (x)) → LogicLover (x).
The problem is that what we mean is not clear: do we mean that if any sister of x loves
logic then x loves logic? Or maybe all sisters of x must love logic to guarantee that x
loves it too?
First-order logic steers clear of these complications. It allows us to use function
symbols in a term only to express relationships that are really functions, when the
indirect reference identifies a unique entity. So, if we use the function symbol sister
to represent sisterhood, then sister (x) shall stand for the sister of x (implying that x
has exactly one sister), not for a sister of x (the more general case where x may have
any number of sisters). Note that the article we use, whether “the” or “a”, is a good
indication as to whether we can use a function symbol to express an indirect reference:
“the” implies that the relationship is functional, “a” that it is not.
Now, how do we deal with sisters loving logic? One way to do so is to use a predicate
sister(x, y) to stand for “x is a sister of y”. As a predicate, it is true or false, not an
indirect way to refer to somebody. Then we can write the two meanings of the above
sentence as
∀x. ∃y. sister (x, y) ∧ LogicLover (x) → LogicLover (y)
∀x. ∀y. sister (x, y) ∧ LogicLover (x) → LogicLover (y)
respectively. By using predicates, we are forced to be explicit about the quantification.
Notice however that this forces us to have a name for every sister, something function
symbols make unnecessary when they can be used.
Not every functional relationship is defined for everybody. Consider "the father-in-law
of x", written fatherInLaw(x): only married people have a father-in-law, so this is a
partial function. We can still use a function symbol in this case. We have to be careful
not to write nonsense, though. One way to do so is to guard our formulas so that we
will only refer to the father-in-law of married people. For example, if we have a
predicate married(x) that we understand as "x is married", then the phrase "everybody
whose father-in-law loves logic also loves logic" is expressed as

∀x. married(x) ∧ LogicLover(fatherInLaw(x)) → LogicLover(x)

The other way is to make sure that a formula that mentions fatherInLaw will never be
true for unmarried people. For example,

∀x. LogicLover(fatherInLaw(x)) → LogicLover(x)

would work if all the people we are dealing with are married.
While we are talking about family relations, we can use a function symbol with two
arguments to denote the first child of two people: firstBorn(x, y) denotes "the first-born
child of x and y”. Now, x and y may not have children — in fact they may have never
met — so this is another partial function.
• A variable is a term.
• A name is a term.
• If f is a function symbol that takes n arguments and t1, . . . , tn are terms, then
f(t1, . . . , tn) is a term.
Nothing else is a term. We continue writing x, y and z for variables and a, b, c, etc,
for generic names (which we also call constants). We will use the letters f , g and h to
denote generic function symbols. The number n of arguments that a function symbol
f takes is called its arity — just like for predicate symbols.
The last line is what we added with respect to Chapter 5. It says that we can build
a term by applying a function symbol f that takes n arguments to n terms t1 , . . . , tn ,
obtaining f (t1 , . . . , tn ). Although the ti ’s can themselves contain function symbols,
such subterms will eventually have to be just a variable, x say, or just a constant, a for
example.
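Incidentally, this is exactly how terms are built in the Prolog language of Chapter 11. As a small illustration (the specific names here are made up), we can construct a nested term and ask for its top function symbol and arity with the built-in functor/3:

    ?- T = firstBorn(ali, father(sidra)), functor(T, F, N).
    T = firstBorn(ali, father(sidra)),
    F = firstBorn,
    N = 2.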
The rest of the infrastructure of first-order logic remains exactly the same as what
we saw in Chapter 5: an atomic formula is still a predicate symbol applied to terms (in
number equal to its arity), and composite formulas are still constructed using the same
connectives and quantifiers.
Altogether, the way these ingredients are combined to form the full language of
first-order logic is described in the following diagram.
[Diagram: the language of first-order logic. Variables, names and function symbols combine into terms (phrases that refer to people or things). Applying predicate symbols to terms yields formulas (phrases that are true or false, i.e., statements), which are then combined further using the connectives and quantifiers.]
The main change with respect to Chapter 5 is that we have a richer language of terms.
Terms still enter formulas as the arguments of predicate symbols, and formulas are still
built on top of atomic propositions using the Boolean connectives and the quantifiers.
Function symbols open the doors to all kinds of new inferences. Consider the
following:

LogicLover(father(father(Sidra)))
and ∀x. LogicLover(father(x)) → LogicLover(x)
so LogicLover(Sidra)

It says that, assuming Sidra’s paternal grandfather loves logic and that the love of logic
is inherited from one’s father, it must be the case that Sidra loves logic too. It is a bit
more complicated than previous inferences, but it makes sense: intuitively, it is valid.
To make things shorter and to emphasize that father and LogicLover are just sym-
bols, we will replace them with f and p, respectively (we will keep the name Sidra
however). We get
p(f (f (Sidra)))
and ∀x. p(f (x)) → p(x)
so p(Sidra)
Notice that this is the exact same inference: we have just changed some symbols.
Intuition does not help much any more, however. We will be using this inference as our
workhorse through the rest of the chapter.
How do we show that it is actually valid? The usual suspects will come to the
rescue: interpretations and derivations.
8.4 Interpretations
Recall that an interpretation M has (so far) three components: a domain of discourse D
that lists all the individuals we are talking about, an interpretation of names that tells us
which individual in D each name refers to, and an interpretation of predicate symbols
that tells us whether a predicate is true or false on each individual (or combination of
individuals) in D.
Now, all we need to add is an interpretation of function symbols, which tells us
which individual we are referring to when we apply a function symbol to terms that
represent individuals in D. This new component associates to each function symbol,
such as father(x), an actual function over the domain of discourse.
Let’s make things concrete and define a small interpretation for our example. Here’s
our domain of discourse:
D = { [Sidra], [Sidra's father], [Hanan], ? }

(the first three elements are pictures of people in the original; we label them by who
they will turn out to be). We will get back to ? in just a minute.
Let’s say that the only two names we care about are Sidra and Hanan. Here’s an
interpretation for these names:

Name     D
Sidra    [Sidra]
Hanan    [Hanan]

Next, we interpret the function symbol f as the following function over D:

x ∈ D               f(x)
[Sidra]             [Sidra's father]
[Sidra's father]    ?
[Hanan]             ?
?                   ?
This interpretation of f says that [Sidra's father] is the father of [Sidra] (which is
associated with Sidra according to our interpretation of names). We use ? to stand for
the father of anybody whose father we don't know. Here, we don't know who Hanan's father
is, nor who the father of Sidra’s father is. We also don’t know who the father of the
unknown person is. Note that the interpretation of f is a total function over the domain
of discourse: ? allows us to deal with incomplete knowledge, or in general with
genuinely partial functions.
To complete M, all we need is an interpretation for the predicate symbol p, which
we do through the following table, for example:
x ∈ D               p(x)
[Sidra]             T
[Hanan]             F
[Sidra's father]    T
?                   F
At this point, we can extend this interpretation to all the formulas that appear in our
inference. We get the following table:

[table lost in this rendering: it evaluates p(f(x)) and p(x), and from them p(f(x)) → p(x), for each x ∈ D]
Because the last column contains an F , this table tells us that, in our example interpre-
tation, the formula
∀x. p(f (x)) → p(x)
is false. The first premise, p(f (f (Sidra))) is also false in this interpretation, while
the conclusion, p(Sidra) is true. Therefore, our inference is valid with respect to this
interpretation.
However, this does not tell us that the inference is valid in general. For this we
would have to check all possible interpretations, including those with infinite domains
of discourse. Just like in Chapter 6, before we had function symbols in logic, it would
take forever to check that a universal formula is true or that an existential formula is
false. On the other hand, a single interpretation is sufficient to find a counterexample to
a universal formula (if it is false) or a witness to an existential formula (if it is true).
In summary, given premises Γ = ϕ1 , . . . , ϕn and conclusion ϕ, using the definition
of valid inference (Γ |= ϕ if and only if ϕ is true in all the — infinitely many —
interpretations where each formula in Γ is also true) remains highly impractical.
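By contrast, checking a formula within one fixed finite interpretation is entirely mechanical. The following Prolog sketch (Prolog is the language of Chapter 11) evaluates ∀x. p(f(x)) → p(x) over a small finite interpretation; the domain elements and the tables below are invented for illustration, not the pictured ones:

    % Domain of discourse and interpretation tables (made up).
    dom(a).  dom(b).  dom(c).
    f(a, b).  f(b, c).  f(c, b).               % interpretation of f
    p(a, true).  p(b, true).  p(c, false).     % interpretation of p

    % X is a counterexample when p(f(X)) is true but p(X) is false.
    counterexample(X) :- dom(X), f(X, Y), p(Y, true), p(X, false).

    % The universal formula is true exactly when no counterexample exists.
    universal_true :- \+ counterexample(_).

Here the query counterexample(X) answers X = c, so the universal formula is false in this particular interpretation. As just observed, this still tells us nothing about all the other interpretations.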
8.5 Derivations
So, what kind of complicated extension will we need to inflict on derivations to incor-
porate function symbols?
The surprising answer is ... none!
The two elementary rules that dealt with terms in Chapter 7 are ∀E and ∃I:

∀x. ϕ(x)                ϕ(t)
───────── ∀E        ────────── ∃I
  ϕ(t)               ∃x. ϕ(x)

Nothing in them prevents the term t from mentioning function symbols. Here is a
derivation of our workhorse inference:

                 ∀x. p(f(x)) → p(x)
                 ──────────────────────────── ∀E
p(f(f(Sidra)))   p(f(f(Sidra))) → p(f(Sidra))            ∀x. p(f(x)) → p(x)
────────────────────────────────────────────── →E        ────────────────────── ∀E
              p(f(Sidra))                                 p(f(Sidra)) → p(Sidra)
              ─────────────────────────────────────────────────────────────────── →E
                                        p(Sidra)

In the leftmost use of ∀E, we instantiated the variable x with the term f(Sidra).
Instead, in the rightmost use of ∀E, we instantiated this variable with Sidra. Therefore,
this inference is valid!
Not bad, eh!
This is the actual completeness theorem that Kurt Gödel (1906–1978) proved. Logi-
cians were using function symbols even back then.
Function symbols do not make determining whether an inference is valid any sim-
pler. In fact, they complicate things further by introducing new avenues by which
undecidability can sneak in.
8.7 Exercises
1. Consider the following phrases. Identify which ones can and which ones cannot
be expressed using function symbols in logic. Explain why.
For those that cannot be expressed using a function symbol, propose a similar
phrase that can.
2. Consider the following names, function symbols and predicate symbols with the
given readings:
3. Define an interpretation for the inference in Section 8.4 that makes all the premises
true. Is there a “smallest” such interpretation? What does it look like?
5. Function symbols are often used to represent mathematical operations like addition
and functions like squaring. We will explore how to do so for infinite sets
like the natural numbers in Chapter 10. For now, we will limit ourselves to operations
modulo 3 in the finite set Z3 = {0, 1, 2}. Using the standard interpretation
of the operations x + y (addition), x − y (subtraction), x ∗ y (multiplication), of
the function x² (squaring) and of the predicate x = y (equality) — all modulo
3 — determine whether the following inferences are valid and, if not, give a
counterexample.
(a) ⊨ ∀x. x ∗ x = x²
(b) ⊨ ∀x. ∃y. x = y²
(c) ⊨ ∀x. ∃y. x ∗ y = 1
8. Are function symbols really necessary? In Section 8.2 we saw that we can de-
scribe relations like “x is a sister of y” without the help of function symbols.
But, as we learned in our algebra classes, functions are special cases of relations,
aren’t they? Could we use predicates to express functional references too? Let’s
try it out:
Chapter 9
Equality
One form of inference we engage in very frequently is about things being equal. We
do so when talking about people, for example “the first man to walk on the moon” and
“Neil Armstrong” refer to the same person. We do so also when checking that we got
the right change from a cashier: for example if we pay for an item that costs $7.83 with
a $10 bill, we expect the change we get back to be equal to $2.17. Even finding your
car in the parking lot has something to do with equality. So, equality seems to be one
of those recurrent linguistic patterns worth being able to reason about within logic.
[drawing of a sheep]   and   [drawing of a sheep]
Are they the same? Chances are that you will answer “no”. But they depict two sheep
in the same way as “Sidra loves logic” and “Sidra loves logic” are two individual sen-
tences. So, yes they identify the same thing, if we mean the abstract idea of “sheep”,
but no they do not depict the same specific animal.
Next, consider the last time you looked at yourself in the mirror (maybe this morn-
ing) and the time before that (yesterday?). Did you see the same person? You would
hope so! But, if you think about it, the time was different, the light was probably not the
same, a few of your cells had been replaced. So, is it really the same person? This is
getting philosophical.
But maybe people and sheep are too complicated to deal with. Then, consider the
following two C functions:

[two short C functions, lost in this rendering, that are written differently but compute the same result]

Are they the same? They look different on the page, but arguably they will return the
same output for every input. So they are the same . . . but earlier we argued that the two
copies of “Sidra loves logic” were the same because they were the exact same sequence
of characters while these two programs are not the same sequence of characters . . . so
are they different?
9.2 First-Order Equality

First-order logic provides one way to represent individual entities, and that is by
using terms (possibly with function symbols if we want to refer to them indirectly).
Therefore, equality between entities amounts to determining when two terms are equal.
All we need to do this is to provide a single binary predicate symbol to represent equal-
ity. Following what we are used to in mathematics, we will denote it =, written infix.
Therefore, t1 = t2 is an atomic proposition that we understand as asking whether (or
specifying that) terms t1 and t2 are equal.
1 Equality among generic formulas is a bit more complicated because quantified variables can be renamed without changing the meaning of a formula.
By itself, this is just a notation. It does not say anything about when two terms are
equal: we have expressed equality, but not defined it. Before we do so, let’s look at an
inference that uses equality:
Sidra’s father is called Ali, Ali loves logic, and everybody whose father
loves logic also loves logic, so Sidra loves logic.
This is an inference, and it makes use of the fact that Ali and Sidra’s father are the same
person. Let’s write it in first-order logic:
f (Sidra) = Ali
and p(Ali )
and ∀x. p(f (x)) → p(x)
so p(Sidra)
Here, the first premise, f (Sidra) = Ali , tells us that we consider the term f (Sidra)
and the term Ali to be equal. Recall that we are writing f (x) for “the father of x”.
Up to now, x = y is just a binary predicate, no different from q(x, y): it does
not have any meaning from the point of view of logic. Therefore, if we just use the
definitions seen up to last chapter, this inference cannot be shown to be valid. Yet, it is
intuitively valid if we think about “=” as equality.
The way we will be able to show that this inference, and others like it, are valid
is by refining the notions of interpretation and derivation, so that “=” has a special
meaning. A predicate symbol whose meaning is given by the infrastructure used to
define validity is called interpreted. Therefore, “=” will be an interpreted predicate
symbol. All predicates we have seen so far were uninterpreted because interpretations
and derivations did not treat them specially.
So, what will t1 = t2 mean for us? It will hold, or be true, exactly when t1 and t2
can be shown to be equal . . . that sounds like a circular definition! We will specify what
things we want to consider equal by means of premises that contain “=” — that’s how
we deal with sheep and mirrors, and the other nuances we saw in the previous section.
Using this information, we will be able to infer new equalities (ah! Aristotle again:
logic is new and necessary reasoning) and also to draw conclusions that leverage term
equality.
Now, the usual question: how do we check that an inference that uses equality is
valid? This begets the usual answer: either by means of interpretations or of
derivations.
9.3 Interpretations

For interpretations, it is natural to require that “=” be true exactly when its two arguments are the same element
of the domain of discourse.
Let’s see how this works with respect to the example interpretation we saw in the
last chapter. Our domain of discourse was
D = { [Sidra], [Sidra's father], [Hanan], ? }

and the names are now interpreted as follows:

Name    D
Sidra   [Sidra]
Hanan   [Hanan]
Ali     [Sidra's father]

We just added an entry for the name Ali, which we didn’t have in the last chapter. Back
then we had a single predicate symbol to interpret, p, and we did so as follows:
x ∈ D               p(x)
[Sidra]             T
[Hanan]             F
[Sidra's father]    T
?                   F
We shall now extend the interpretation of predicates with the following table for x = y,
the interpretation of equality in D:
=                    [Sidra]   [Sidra's father]   [Hanan]   ?
[Sidra]                 T             F              F      F
[Sidra's father]        F             T              F      F
[Hanan]                 F             F              T      F
?                       F             F              F      T
The one difference between the interpretation of the predicate p and the interpreta-
tion of “=” is that the latter is fixed while the former could be arbitrary. Specifically,
given any domain of discourse D, the interpretation of t1 = t2 in D will always be true
exactly when t1 and t2 correspond to the same elements of D, and false when they are
different: because “=” is binary, its interpretation will be a two-dimensional matrix;
whatever the domain of discourse, its diagonal will always contain T while all other
entries will have F in them. It is in this respect that “=” is an interpreted symbol: it
has a fixed meaning.
This is all we need to add to our definition of first-order interpretation — very little
compared to what we had to do for function symbols, for example.
Now, with this, we can check the validity of our example inference in the above
interpretation M. Here is how the reasoning goes:
• The premise f(Sidra) = Ali is true because the name Sidra is interpreted to [Sidra]
and the interpretation of the function symbol f maps it to [Sidra's father]. Now, the name
Ali also corresponds to [Sidra's father]. The two sides of “=” are therefore the same, so the
table for “=” says that this premise is true.
• The premise p(Ali) is true according to the interpretation of predicate symbol p
on the individual [Sidra's father], which is the interpretation of the name Ali.
• The premise ∀x. p(f (x)) → p(x) is false, for the reasons seen in the last chapter.
• The conclusion p(Sidra) is true since the interpretation of p on [Sidra] is true.
Altogether, the inference is valid in M because it is indeed the case that if all premises
are true (they are not) then the conclusion is also true.
Again, it holds in M, but to ensure that it is valid, we would have to examine all of
the infinitely many possible interpretations, which is not directly feasible. As we will
see shortly, this particular inference is valid. Therefore,
f(Sidra) = Ali
p(Ali)                      ⊨ p(Sidra)
∀x. p(f(x)) → p(x)
9.4 Derivations
To use derivations, we need to give elimination rule(s) to describe what we can infer
knowing that two terms are equal, and introduction rule(s) to tell us when we can
establish that two terms are equal.
The elimination rule is actually pretty easy to define. How can we use the knowl-
edge that t1 = t2 ? Well, if I have a formula that mentions t2 , then the formula obtained
by replacing some of the occurrences of t2 with t1 should also be a derivable formula.
This suggests the following elimination rule:
t1 = t2    ϕ(t2)
──────────────── =E
     ϕ(t1)
This rule is sufficient to show that our example inference in this chapter is derivable.
Here’s a derivation for it:
f(Sidra) = Ali    p(Ali)           ∀x. p(f(x)) → p(x)
───────────────────────── =E       ────────────────────── ∀E
       p(f(Sidra))                 p(f(Sidra)) → p(Sidra)
────────────────────────────────────────────────────────── →E
                        p(Sidra)
Note that in rule =E we do not need to replace all occurrences of t2 with t1 : only
the ones we choose to. Only in this way can we build a derivation for c = d, p(c, c) ⊢
p(c, d), which may represent the inference “Ali is Sidra’s father and Ali sees himself in
the mirror, so Ali sees Sidra’s father in the mirror”.
Now, how do we establish that two terms are equal? Well, it should always be the
case that Sidra = Sidra. In fact, this should hold for every term. This suggests the
rule
───── refl
t = t
This is one of the properties of equality we learn in school. It is called reflexivity.
Equality is a reflexive relation.
Are there other circumstances where two terms are equal? Definitely. Say that we
know that Ali = f (Sidra). We should be able to infer that f (Sidra) = Ali , shouldn’t
we? Can we do so on the basis of the rules we have so far? Not quite. This suggests
adding another of the properties we learn in school: symmetry. Symmetry says that we
can swap the order of two terms in an equality. It is defined by the following rule:
t2 = t1
─────── sym
t1 = t2
It says that one way to learn that t1 = t2 is if we already know that t2 = t1 . Note that
without this rule, there is no way to establish the validity of the following variant of the
inference in our ongoing example:
Ali = f(Sidra)
p(Ali)                      ⊢ p(Sidra)
∀x. p(f(x)) → p(x)
This is our original inference with the sides of the first premise reversed.
Now, when learning about the properties of equality in school, transitivity comes
next right after reflexivity and symmetry. This is how it would be expressed as an
inference rule:
t1 = t2    t2 = t3
────────────────── trans
      t1 = t3
It says that one way to learn that two terms are equal is to know that they are both equal
to a third term. With transitivity, we can learn that Sidra’s father is the president of the
logic lover society knowing that Sidra’s father is Ali and that Ali is the president of the
logic lover society:
f(Sidra) = Ali
                      ⊢ f(Sidra) = g(LLS)
Ali = g(LLS)
Here the function symbol g(x) stands for “the president of x” and the name LLS
indicates the logic lover society.
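As an aside, the search for such equality derivations can be sketched in Prolog (Chapter 11). The depth bound in the last argument is our own device to keep sym and trans from looping forever; cong is omitted for brevity:

    eq(_,     T = T,   _).                                % refl
    eq(Gamma, E,       _) :- member(E, Gamma).            % an assumption in Gamma
    eq(Gamma, T1 = T2, s(D)) :- eq(Gamma, T2 = T1, D).    % sym
    eq(Gamma, T1 = T3, s(D)) :-                           % trans
        eq(Gamma, T1 = T2, D),
        eq(Gamma, T2 = T3, D).

For example, the query eq([f(sidra) = ali, ali = g(lls)], f(sidra) = g(lls), s(0)) succeeds, mirroring the inference above.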
When learning about equality in school, there is one last property that is often
mentioned: congruence. Congruence states that we can replace equals for equals. This
is how we solve equations for example. Should we have a rule for congruence? We do
already! That’s our elimination rule =E . Let’s rename it cong to make school teachers
around the world happy:
t1 = t2    ϕ(t2)
──────────────── cong
     ϕ(t1)
These rules fit into the same picture relating the two methods:

[Diagram: derivations establish Γ ⊢ ϕ and interpretations establish Γ ⊨ ϕ, both capturing the validity of "Γ so ϕ"; soundness takes Γ ⊢ ϕ to Γ ⊨ ϕ and completeness takes Γ ⊨ ϕ back to Γ ⊢ ϕ.]
9.5 Definitions
One common use of equality, for example in math textbooks (but also in everyday life),
is to give a name as an abbreviation of something that takes long to say or write. This
is called a definition.
Suppose for example that we need to talk a lot about Sidra’s paternal grandfather,
father(father(Sidra)). Rather than saying (or writing) this mouthful every time, what we
would typically do is to give him a nickname, say Gido. To do so, we would write, for
example,
Gido = father (father (Sidra))
and use it as a premise. By doing so, we have defined the name Gido to be an abbrevi-
ation for father (father (Sidra)). Using the derivation method, we can then show that
the inference
Gido = father(father(Sidra))
p(Gido)                          ⊢ p(Sidra)
∀x. p(father(x)) → p(x)
is valid. This inference looks very similar to the main example in this chapter, which
had the premise f (Sidra) = Ali : beside being the name of Sidra’s father, Ali can also
be seen as an abbreviation of f (Sidra).
Now, the paternal grandfather of anybody is that person’s father’s father. We can
then capture the idea of “the grandfather of x” by defining him as “the father of the
father of x”, for any x. We have just defined “grandfather” as an abbreviation that is
parametric in x. The following formula then defines the symbol grandfather in terms
of father :
∀x. grandfather (x) = father (father (x)) (9.1)
This is a definition of the function symbol grandfather . This means that whenever we
see the term grandfather (t), we can replace it with the term father (father (t)) and
vice versa. We abbreviated the complex term father (father (x)) to grandfather (x),
where grandfather is a function symbol of convenience.
Now, let’s do something a bit more challenging. Siblings have the same father.
Given the names x and y of two siblings, we want to define the function symbol
commonFather (x, y) to refer to the name of their common father. Here is one of
the various possible ways to write this definition:3

∀x. ∀y. father(x) = father(y) → commonFather(x, y) = father(x)

It defines commonFather(x, y) to be father(x), but only if father(x) = father(y).
This last part imposes a condition, or constraint, on the individuals commonFather (x, y)
refers to. Therefore, this is a conditional definition. Such conditions require making
use of connectives in the definition.
Using connectives opens lots of new opportunities. Just like we used the function
symbol father to refer to somebody’s father, let’s use mother to refer to someone’s
mother. Then the following definition should capture the general concept of grandfa-
ther (as opposed to just paternal grandfather):

[formula (9.2), lost in this rendering: it defines grandfather(x) as the father of x's father or of x's mother]

Or does it? If we use this formula as the definition of grandfather, we can draw some
surprising inferences. Say that we know that father (father (Sidra)) = Gido and
3 Can you find a simpler way to write it?
father (mother (Sidra)) = Abraham. Then the following inference has a derivation:
(formula 9.2)
father(father(Sidra)) = Gido          ⊢ Gido = Abraham
father (mother (Sidra)) = Abraham
It concludes that Gido and Abraham are the same person! What has gone wrong? The
problem here is that formula (9.2) allows us to conclude both grandfather(Sidra) = Gido
and grandfather(Sidra) = Abraham. Then, symmetry and transitivity combine them
into the above unexpected conclusion, Gido = Abraham.
The root cause of the problem is that grandfather-hood is not a function: everybody
has two grandfathers, not one. Therefore we cannot use a function symbol to define
somebody’s grandfather! We must use a predicate symbol grandfather(x, y) instead,
defined for example as

∀x. ∀y. grandfather(x, y) ↔ y = father(father(x)) ∨ y = father(mother(x))
This definition says that the grandfather of x is y if either y is x’s father’s father or if it
is x’s mother’s father.
Here, we have defined the predicate grandfather . It is again an abbreviation. Not
an abbreviation of a (parametric or conditional) term this time, but an abbreviation of
a formula. Specifically the formula on the right-hand side of ↔. Note that x and y act
as parameters in this definition.
The same device allows us to define not just grandfather-hood, but the general
notion of ancestor. Here it is, limited to the paternal line for simplicity:

∀x. ∀y. ancestor(x, y) ↔ y = father(x) ∨ ancestor(father(x), y)
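Recursive definitions of this kind transcribe almost directly into Prolog (Chapter 11). In the following sketch, father is used as a function symbol inside terms and unification plays the role of “=”:

    ancestor(X, father(X)).
    ancestor(X, A) :- ancestor(father(X), A).

The query ancestor(sidra, A) then enumerates A = father(sidra), A = father(father(sidra)), and so on up the paternal line.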
9.6 Exercises
1. Give derivations for the following inferences that involve both function symbols
and equality.
(a) a = f(b), b = f(c) ⊢ a = f(f(c))
(b) ⊢ (∀x. f(x) = g(x)) ∧ (∀y. f(y) = h(y)) → ∀z. h(z) = g(z)
    ∀x. ∀y. f(x, y) = f(y, x)
(c)                              ⊢ ∀y. f(c, y) = y
    ∀x. f(x, c) = x

    ∃y. ∀x. f(x) = y
(d)                              ⊢ ∀x. ∃z. g(f(x)) = z
    ∀y. ∃z. g(y) = z

    ∀x. ∀y. f(x, y) = 1 ↔ x = 1 ∨ y = 1,
(e) ∀z. z = 0 ∨ z = 1,           ⊢ ∀x. f(0, x) = 0
    ¬(0 = 1)
2. If you think about it, it looks like we could do away with the derivation rule for
reflexivity, refl, if we replace it with the premise ∀x. x = x. Can we use this
approach to replace all derivation rules for equality? Give similar premises for
those rules for which this idea works (if any), and for those for which it doesn’t
work (if any) explain in detail what is going wrong.
3. Continuing on the last exercise, if you have a premise of the form ∀x. x = x,
is it still necessary to give a special interpretation to the predicate “=” in the
truth-based approach? Explain why or why not.
4. Do we actually need transitivity as an inference rule? Show that we can always
replace rule trans with a combination of the other rules that deal with equality.
This means we don’t really need to have transitivity as a primitive rule. It is a
derived rule.
5. In the previous exercise, we saw that we don’t really need a rule for transitivity.
Can we get rid of other rules as well? Not quite, but we can swap them for
different rules. Show that the rule
t1 = t2    t3 = t2
────────────────── ac
      t1 = t3

can replace symmetry (and transitivity). To do so, give a derivation of each
of the following inferences using rule ac, but without using either sym or trans:
• t1 = t2 ⊢ t2 = t1
• t1 = t2, t2 = t3 ⊢ t1 = t3
where t1 , t2 and t3 are generic terms.
6. Using the definition of paternal grandfather (9.1) from Section 9.5, show that the
following conclusion is valid using the derivation method
∀x. father (grandfather (x)) = grandfather (father (x))
7. The first common ancestor of two people x and y, written fca(x, y), is x if they
are the same person. It is also x if x is the father of y. Similarly, it is y if y is
the father of x. Otherwise it is the first common ancestor of their fathers. Give
a definition of the function symbol fca(x, y) in first-order logic. Be careful not
to let your definition collapse distinct names (as we did in Section 9.5). [Hint:
make sure that each disjunct is exclusive.]
8. Here are a few of the standard family relations. Identify the ones that can be
expressed in logic by a function definition, and the ones that require a predicate
definition. Then give these definitions.
(a) Daughter
(b) Oldest son
(c) Grandparent (one’s father or mother’s father or mother)
(d) First cousin (somebody sharing a grandparent)
(e) Aunt (the daughter of a grandparent)
(f) Eldest sister
(g) Related (two people sharing a common ancestor)
(h) Youngest uncle
As you do so, you may assume that the function symbols father (x) and mother (x)
and the predicate symbol older (x, y) — x is older than y — have been prede-
fined for you.
9. It is common practice to “define” exclusive-or (written ⊕) as
ϕ ⊕ ψ ↔ (ϕ ∨ ψ) ∧ ¬(ϕ ∧ ψ)
Why is this not an acceptable definition according to what we saw in this chapter?
Can we define exclusive-or in first-order logic?
10. A string is a sequence of characters. Once we fix the set of characters we can
draw from (for example the uppercase letters of the English alphabet), strings
can be described on the basis of the operation of concatenation, written “·” infix,
so that s1 · s2 is the result of placing the string s2 right after s1, and the empty
string ε. They obey the following laws:
• Concatenation is associative, i.e., s1 · (s2 · s3) is the same string as (s1 · s2) · s3
for any strings s1, s2 and s3.
• The empty string is the left and right identity of concatenation, that is ε · s
is just s and so is s · ε, for any string s.
These laws apply anywhere inside a string, not just on the outside. Mathematical
structures that work this way are called semi-groups.
Can you use “=”, as defined in this chapter, to express when two strings are equal
according to these laws? If so, do it. Otherwise, give another representation
of the notion of equality over strings.
11. We could define “paternal grandfather” using a predicate symbol (rather than
a function symbol as in Section 9.5) exactly in the same way as for (generic)
“grandfather”. Try it out for yourself.
So, it seems that any function definition could be replaced by a predicate definition.
Is this the case? If so, describe this transformation and discuss whether there are
drawbacks to doing it. If not, explain what goes wrong.
Chapter 10
Numbers
One of the things we reason most often about are numbers. I am not just talking about
those interminable math classes, but everyday activities like getting change, deciding
if it’s hot enough outside to leave a sweater at home, etc.
In the last chapter, equality allowed us to check that the $2.17 we got back in
change when making a $7.83 purchase with a $10 bill is the same as the $2.17 change
we needed. But how do we determine that $2.17 is the expected change for $7.83 from
$10? Sure, we learn how to do subtraction in elementary school, but how do we get
logic to do it?
Numbers can be represented in many ways. You have probably studied the binary representation, which works pretty much in
the same way, but with just two digits, 1 and 0. Here, thirteen is represented as “1101”,
that is 1×2³ + 1×2² + 0×2¹ + 1×2⁰. A different example is the roman numerals, where
thirteen is represented as “XIII”. Roman numerals are however a pain to work with.
We will follow this overall tradition, but be as lazy as we can — you will appreciate
that in no time! We will use the unary system, which represents each number by as
many marks — think about prisoners in the movies making scratches in their cell walls
to count the number of days they’ve been in. To do so in logic, we need just two
symbols:
• One constant to represent the number zero. We will write it 0.
• One function symbol to represent the successor of a number. We will use the
function symbol s and write s(n) for the successor of n.
The representation of a number in this way is called a numeral. Therefore, zero is
simply 0, one is represented as the numeral s(0), five as the numeral s(s(s(s(s(0)))))
and thirteen as the very long numeral s(s(s(s(s(s(s(s(s(s(s(s(s(0))))))))))))) — count
the s’s. This is not pretty, but it will be convenient.
It will also be useful to have a predicate that is true of all and only the natural
numbers. This predicate, read “x is a natural number”, will be written nat(x).
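In Prolog (Chapter 11), the numerals and the nat predicate come out as follows; the numeral converter at the end is our own convenience for writing long numerals, not part of the logic:

    nat(0).
    nat(s(N)) :- nat(N).

    % numeral(K, X): X is the unary numeral for the ordinary integer K.
    numeral(0, 0).
    numeral(K, s(M)) :- K > 0, K1 is K - 1, numeral(K1, M).

For example, numeral(13, X) binds X to the long numeral for thirteen shown above, and nat(X) then confirms that it is a natural number.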
A logic that, like the one we are looking at, deals with natural numbers is called
arithmetic. We are therefore trying to define and reason about arithmetic.
We define addition with a function symbol plus governed by two formulas:

∀y. plus(0, y) = y
∀x. ∀y. plus(s(x), y) = s(plus(x, y))

Simply put, zero plus any number is that number — 0 + y = y — and adding the
successor of a number to another is the same as taking the successor of their sum —
(x + 1) + y = (x + y) + 1.
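Rendered as Prolog relations, one clause per defining equation and with plus(X, Y, Z) read as "x plus y is z", these definitions look as follows (the clauses for double are a plausible reconstruction in the same style):

    plus(0, Y, Y).                         % 0 + y = y
    plus(s(X), Y, s(Z)) :- plus(X, Y, Z).  % (x + 1) + y = (x + y) + 1

    double(0, 0).
    double(s(X), s(s(D))) :- double(X, D).

The query plus(s(s(0)), s(0), Z) answers Z = s(s(s(0))), i.e., 2 + 1 = 3. Note that nothing here prevents Y from standing for something that is not a number, an issue we will run into shortly.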
With these definitions in place, we will be interested in showing that properties like
∀x. plus(x, x) = double(x)
are valid. Note that when dealing with numbers, we tend to talk about “properties”
rather than “inferences”. Clearly we are interested in showing that the above formula
is the conclusion of a valid inference with no premises.
As usual, we have two ways to show that an inference is valid. One uses interpre-
tations, the other derivations. Let’s look at both.
10.3 Interpretations
Note that for elements of D that are not numbers, s can be interpreted as anything
we want.
Clearly, we could have other function symbols in our logic, which would have
their own interpretation.
Interpretation of predicates: In an interpretation of arithmetic, the predicate symbol
nat must be true of all the natural numbers, and false of every other element of
D. So, its interpretation is as follows in our example:
x ∈ D           nat(x)
0               T
1               T
2               T
⋮               ⋮
[non-number]    F
[non-number]    F
We may have other predicates to interpret. In the discussion so far, we used “=”
which is interpreted as in the previous chapter. There may be others.
Note that nat is the only way to keep us on the straight and narrow: when we
intend to talk about numbers (and numbers only), we should guard our statements and
definitions with it, or we may not be expressing what we have in mind.
Take the definition of plus above. You would like to be able to prove that

∀x. ∀y. ∀z. plus(x, y) = z → nat(z)

But it doesn’t hold! Take x to be 0 and y and z to be [an element of D that is not a
number]. The first part of the definition of plus tells us that, for these values,
plus(x, y) = z is true, but this element is not a natural number, so that the above
formula is altogether false.
But we can use nat to make sure the variable y in this formula is a natural number.
We get the following corrected definition:
∀y. nat(y) → plus(0, y) = y
∀x. ∀y. plus(s(x), y) = s(plus(x, y))
which makes sure that we won’t try to say anything about adding things that are not
numbers. But shouldn’t we guard x and y similarly in the second formula? We could,
but in this case this is not necessary.1
We have rediscovered something we knew already: interpretations are good for
showing that an inference does not hold. As usual, showing validity is a lot more
complicated. Note in particular that here the domain D is necessarily infinite.
1 A different approach to making sure that variables can take only certain values is to categorize terms
using sorts, which are similar to types in programming languages. Then by specifying the sort of every
variable we use, we would prevent it from assuming values we are not interested in — just like when writing
a program in Java, say.
10.4 Derivations
So, let’s turn to derivations, which we have seen are a tractable method for checking
validity (although not for showing that an inference does not hold).
We need to come up with elementary inference rules that describe how to introduce
and eliminate our one interpreted predicate, nat(x). The introduction rules are easy:
                  nat(t)
────── natI1   ─────────── natI2
nat(0)          nat(s(t))
They simply say that 0 is a natural number, and that s(t) is a natural number whenever
we can show that t is one too. Using them, we can easily show that nat(s(s(s(s(s(0))))))
is derivable — i.e., that five is indeed a natural number. That’s reassuring!
Next, how can we use the information that n is a natural number? Here is the
elimination rule for nat(x), and you tell me what it says.
                    (x)
              nat(x)   ϕ(x)
                    ⋮
nat(n)    ϕ(0)    ϕ(s(x))
────────────────────────── natE
           ϕ(n)
Let’s see. The premises require that n be a natural number, that ϕ(0) be provable,
and that ϕ(s(x)) hold for an arbitrary natural number x such that ϕ(x) holds. The
conclusion tells us that, in this case, ϕ(n) holds.
Have you seen this before?
The last two premises are the cases of a proof by mathematical induction for ϕ.
You would use them to derive ∀x. ϕ(x), from which ϕ(n) holds by instantiating the
variable x to n — that’s using rule ∀E .
So, yes, that rule corresponds to the principle of mathematical induction. You have
probably seen it written in a different way:
ϕ(0) ∧ (∀x′. ϕ(x′) → ϕ(s(x′))) → ∀x. ϕ(x)
where ϕ(x) is any formula (usually called “property”) about the natural number x.
Given what we experienced earlier, we’d better be very precise about ϕ’s argument
really ranging over the natural numbers. Let’s use nat to guard it:
ϕ(0) ∧ (∀x′. nat(x′) ∧ ϕ(x′) → ϕ(s(x′))) → ∀x. (nat(x) → ϕ(x))
This makes it very close to the above rule. Note that this formula incorporates all
kinds of quantifiers and connectives, while rule natE is a pure elimination rule: it only
mentions the formula construction being eliminated, nat.
You may have seen the principle of mathematical induction written as follows (let’s
drop the nat guards for clarity):
∀P. (P(0) ∧ ∀x′. P(x′) → P(s(x′))) → ∀x. P(x)
Here, we have replaced the formula ϕ with a universally quantified formula that we
called P . Doing so takes us outside first-order logic, where we can only quantify on
variables representing terms. In fact, the ability of quantifying over formulas would
bring us into second-order logic. Second-order logic is a lot more powerful than first-
order logic, but it is also a lot more complicated. We will largely stay away from it.
So, let’s see the above rules in action by showing that the formula
∀x. nat(x) → x = 0 ∨ ∃y. nat(y) ∧ x = s(y)
is indeed derivable (from no premises). Here is a derivation for it:
The base case:

─────── refl
 0 = 0
────────────────────────────── ∨I1
0 = 0 ∨ ∃y. nat(y) ∧ 0 = s(y)

The step case, under the temporary assumptions (2) nat(n′) and (3) n′ = 0 ∨ ∃y. nat(y) ∧ n′ = s(y):

  (2)                  ───────────── refl
nat(n′)                s(n′) = s(n′)
───────────────────────────────────── ∧I
       nat(n′) ∧ s(n′) = s(n′)
───────────────────────────────────── ∃I (n′)
      ∃y. nat(y) ∧ s(n′) = s(y)
────────────────────────────────────────── ∨I2
  s(n′) = 0 ∨ ∃y. nat(y) ∧ s(n′) = s(y)

And, putting the two together:

 (1)
nat(n)      [base case]      [step case]
───────────────────────────────────────── natE (2,3)
      n = 0 ∨ ∃y. nat(y) ∧ n = s(y)
───────────────────────────────────────── →I (1)
 nat(n) → n = 0 ∨ ∃y. nat(y) ∧ n = s(y)
───────────────────────────────────────── ∀I (n)
∀x. nat(x) → x = 0 ∨ ∃y. nat(y) ∧ x = s(y)
Wow! And this was a simple property! The moment we start doing interesting things
with natural numbers, the derivations get pretty big. This calls for automation, but we
know that no program can reliably determine validity for all first-order logic inferences.
What to do then? It turns out that, for many common inferences, automatic theorem
provers can give an answer quite quickly. For others, if it is important that we show
validity (e.g., an important theorem or a property of a safety-critical system), then
people will simply try hard, with the help of automated proof assistants.
The opposite is however not true: completeness does not hold in general. Do you
want a counterexample? What about the inference with no premises and ¬(0 = s(0))
as its conclusion?
In any interpretation with the characteristics we have given in Section 10.3, it is
very easy to show that

⊨ ¬(0 = s(0))

However, there is no derivation of this formula:

⊬ ¬(0 = s(0))
This may not be evident, but trust me on this! In fact, nothing in our derivation rules
prohibits that various numerals refer to the same number. Everything we have done
would work fine for arithmetic modulo 5 for example.
Our pictorial illustration of the relationship between derivation- and interpretation-based
validity now looks as follows:

[Diagram: as before, derivations establish Γ ⊢ ϕ and interpretations establish Γ ⊨ ϕ; soundness still takes Γ ⊢ ϕ to Γ ⊨ ϕ, but the completeness arrow from Γ ⊨ ϕ back to Γ ⊢ ϕ is now broken.]
But, which of these two notions should we pick as the official definition of validity?
If what we have in mind are the natural numbers as we learn them in school, with
0 ≠ 1, then we want validity to coincide with the interpretation given in Section 10.3.
Then, to recover completeness, we can strengthen our notion of derivation to match
interpretations. To do so, we need to add just two rules:
                    s(t1) = s(t2)
───────────         ─────────────
¬(s(t) = 0)            t1 = t2
Note however that these rules are not in the standard form of elementary inference
rules: they are neither introduction nor elimination rules. Altogether, the resulting
rules constitute what is known as Peano arithmetic after the logician Giuseppe Peano
(1858–1932).
But we may be interested in the more liberal notion of number that the derivation
rules in Section 10.4 gave us. We then get completeness back by making our requirements
on interpretations for numbers less stringent: we simply relax the status of 0
and s to uninterpreted symbols. We would keep our interpretation of nat the same,
however.
Presburger arithmetic describes addition over the natural numbers. Its premises consist
of the following four axioms, together with an axiom schema for induction:

(1) ∀x. ¬(0 = s(x))
(2) ∀x. ∀y. s(x) = s(y) → x = y
(3) ∀x. x + 0 = x
(4) ∀x. ∀y. x + s(y) = s(x + y)

The first two correspond to the rules we added at the end of the last section to restore
completeness: zero is not the successor of any number, and if two successor numbers
are equal then they are the successors of equal numbers. The last two are exactly our
definition of addition from Section 10.2. Note that, with such a restricted symbol set,
we do not need to guard these axioms with a predicate such as nat.
The axiom schema of mathematical induction has the following form:
ϕ(0) ∧ (∀x. ϕ(x) → ϕ(s(x))) → ∀y. ϕ(y)
where ϕ(x) is any first-order formula containing the free variable x. This is an axiom
schema rather than a simple axiom: we get an actual axiom for every choice of the
formula ϕ. It therefore describes an infinite set of axioms, albeit a very predictable
one. In particular the premise set of Presburger arithmetic is infinite.
Presburger arithmetic is rather weak: unless enriched with further axioms or addi-
tional function or predicate symbols, it allows us only to express simple properties of
2 Until the end of the 19th century, arithmetic and other parts of mathematics (for example geometry)
were thought to obey fundamental laws that the mathematicians of the time called axioms. Axioms were
therefore laws of nature, fundamental truths that simply had to be and on which all of mathematics was
founded. Nineteenth century mathematicians started questioning this view, and explored what happens if
these laws were replaced with others. This led to an enormous jump in our understanding of mathematics,
and the birth of modern logic. Nowadays, the word “axiom” is just a synonym for assumption or premise.
addition. In particular, there is no way to define or reason about multiplication in it, let
alone more advanced concepts such as prime numbers. Yet, it is sound and complete
with respect to the interpretation with domain N (without extra elements), 0 mapped to
0, and s and + corresponding to the successor and addition operations over the natural
numbers.
The study of which rules or axioms are needed to obtain a certain interpretation
we have in mind is an area of logic called model theory. Presburger arithmetic is an
excellent example of this.
(1) nat(0)
(2) ∀x. nat(x) → nat(s(x))
(3) ∀x. nat(x) → ¬(0 = s(x))
(4) ∀x. ∀y. nat(x) ∧ nat(y) ∧ s(x) = s(y) → x = y
(5) ∀x. nat(x) → x + 0 = x
(6) ∀x. ∀y. x + s(y) = s(x + y)
(7) ∀x. nat(x) → x × 0 = 0
(8) ∀x. ∀y. x × s(y) = x + (x × y)

together with the axiom schema of induction:

ϕ(0) ∧ (∀x. nat(x) ∧ ϕ(x) → ϕ(s(x))) → ∀y. (nat(y) → ϕ(y))
10.7 Exercises
1. Show that the inference ⊨ ∀x. nat(s(x)) → nat(x) is not valid.
2. Let Γdouble be the set of formulas that defines double. Show that there is a
derivation of
Γdouble ⊢ ∀x. nat(x) → nat(double(x))
4. Taking inspiration from the definitions of double and plus, give definitions for the
following functions on natural numbers:
As you do so, make sure that your definitions apply only to natural numbers.
5. Using the same method, give a definition of subtraction, x − y. Note that this
operation is not defined on all pairs of natural numbers.
6. In the last two exercises, we have been defining functions. Let’s try our hand at
defining predicates instead. Give a definition of the following predicates:
7. We need three symbols to express the natural numbers in binary notation within
first-order logic: the constant 0 for zero, and the function symbols e and o so that
e(x) and o(x) represent 2x and 2x + 1 respectively.
You will have observed by now that the more digits, the shorter the numerals get
and the bigger the definitions become.
8. One appealing way to express the integers (i.e., the elements of the set Z, which
contains the negative numbers) is to use the function symbol p so that p(x) cor-
responds to the predecessor of x. This is in addition to 0 and s. Doing so and
nothing else has one drawback. What is this drawback? How would you fix it?
Your fix, if it is what I expect, does not work in first-order logic without equal-
ity. Can you think of a representation of the integers that does not depend on
equality?
10. Peano arithmetic can be extended in many ways. One such way is to add a
predicate symbol that captures the “less than” relation between numbers. As in
common mathematics, it is written x < y. Give a definition for this predicate.
How would you extend the axioms of Peano arithmetic to incorporate it?
11. The discussion in this chapter centered around the natural numbers but all con-
cepts we saw apply to many constructions used every day in mathematics and
computer science. Lists, or sequences, are one of them — for simplicity, we will
consider only lists of natural numbers. To represent them, we need one predicate
symbol, list so that list(x) is true exactly when x is a list of natural numbers,
one constant, nil to represent the empty list, and one binary function symbol,
cons so that cons(x, l) is the list with head x and tail l. Your job will be to redo
pretty much everything we did in this chapter, but with lists instead of numbers.
Specifically,
• Define the operation of appending a list to another list, denoted append(l1, l2),
which is equal to the list that contains all the elements of l1 followed by all
the elements of l2 , in the order they occur in these lists.
• Define the operation that returns the length of a list.
• Describe what an interpretation must be like to account for lists.
• Give introduction and elimination rules for lists by taking inspiration from
what we did in the case of the natural numbers. The introduction rules
are easy. The elimination rule (yes, there is just one) is another induction
principle, this time on lists.
• Does completeness hold? What would you need to add to achieve it?
Work in Progress
parent(abraham, herb). parent(mona, herb).
parent(abraham, homer). parent(mona, homer).
parent(selma, ling).
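The definitions this chapter builds on do not all survive in this rendering; as a plausible sketch of their style, here is how grandparenthood and siblinghood could be defined from parent:

    grandparent(X, Z) :- parent(X, Y), parent(Y, Z).
    sibling(X, Y)     :- parent(P, X), parent(P, Y).

Note that, under this definition of sibling, everybody is their own sibling; this is the same phenomenon the first exercise below points out about cousins.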
11.1 Exercises
1. Define the following family relations on the basis of the predicates used in this
chapter:
• uncle(X,Y) which holds if X is the uncle of Y.
• cousin(X,Y) which holds if X is a cousin of Y.
You will notice that everybody and his brother (literally) is his own cousin.
You can say that two people should be different by using the built-in predicate
X \= Y.
2. X \= Y is one of those predicates where Prolog is at odds with logic. Consider
the following variant of the definition of member earlier in this chapter.
member2(X, [X|_]).
member2(X, [Y|L]) :- X \= Y, member2(X,L).
Some queries will give you unexpected answers. Find an example and try to
explain why it doesn’t work.
Chapter 12
Meta-Logic
So, we have seen that we can get logic to help us check the validity of inferences about
people being in Doha, about things being equal, and about numbers. We can adapt
these techniques to similarly reason about a lot of other entities of interest to computer
scientists, like strings, lists, trees, programs, and much much more.
What about logic itself? Let’s give it a try and see whether we learn something
interesting along the way.
[statement lost in this rendering: it asserts that a particular inference is valid]

This is a statement about a logical inference, and specifically about its validity. Our
goal, then, is to use logic to make inferences about such statements. As in this example,
we will be extremely interested in using logic to make inferences about validity. To do
so, we need to define a formula Valid (x, y) such that Valid (Γ, ϕ) is true if and only if
the inference Γ so ϕ is valid.
But we cannot do that! We are allowed to replace variables only with terms, but Γ
is a set of formulas (the premises of the inference) and ϕ is a formula (its conclusion).
One thing we can do, however, is to encode formulas and sets of formulas as terms,
and then we will have a shot at defining Valid . Let’s do that, and see where it gets us.
12.2 Logic in Logic

Encoding validity within logic involves two steps. The first is to encode formulas and
sets of formulas as terms, so that they can appear as arguments of predicates
such as Valid. The second is to define Valid itself so that Valid(⌜Γ⌝, ⌜ϕ⌝) is true,
or derivable, exactly when the inference Γ so ϕ is valid, where ⌜Γ⌝ and ⌜ϕ⌝ are the
encodings of Γ and ϕ we came up with.
Propositional Connectives
Let’s start with something really simple, the propositional constant ⊤, which is always
true. If we want to represent ⊤ as a term, we can just pick a name for it and use this
name whenever we mean ⊤. Let’s choose the name true for this purpose, since it is
easy to remember. In general, we will write ⌜ϕ⌝ for the encoding of a formula ϕ. Then
⌜⊤⌝ = true.
That was easy. We can do something similar for more complex formulas. Take
conjunction for example. If we want to represent ϕ1 ∧ ϕ2, then we can first encode ϕ1
and ϕ2, obtaining ⌜ϕ1⌝ and ⌜ϕ2⌝, and then fix a binary function symbol, say and, and
apply it to them. Therefore

⌜ϕ1 ∧ ϕ2⌝ = and(⌜ϕ1⌝, ⌜ϕ2⌝)
We can proceed in the same way for all the other connectives, using binary function
symbols or , imp and iff to encode ∨, → and ↔ respectively, the unary function sym-
bol not for ¬, and the constant false for ⊥.
Up to now, we can encode propositional formulas without atomic propositions, like
⊤ ∧ (⊥ → ⊤ ∨ ⊥), which gets represented as the term

and(true, imp(false, or(true, false)))
Atomic Formulas
What about the atomic formulas p(t1, . . . , tn)? A few things are necessary to encode
them:
• a way to represent the predicate symbol p,
• a way to represent the tuple t1, . . . , tn of terms it is applied to, and
• a way to combine these two into the representation of the formula itself.
We will see in a bit how to achieve the first two. Once we have them, then all we need
is to pick a binary function symbol, say atomic, and define

⌜p(t1, . . . , tn)⌝ = atomic(⌜p⌝, ⌜t1, . . . , tn⌝)
Before we proceed any further, let’s see how we can encode tuples. We will turn
them into the ordered lists of their constituents. To do this, we need one constant, say
nil , to denote the empty list, and one binary function symbol, say cons, to encode the
extension of a list with an element. Therefore, if x is an element and l is a list, then
cons(x, l) is the list with head x and tail l. Here is the full encoding:

⌜t1, t2, . . . , tn⌝ = cons(⌜t1⌝, ⌜t2, . . . , tn⌝)
⌜·⌝ = nil

Did you notice it is recursive?
Now, how do we encode terms? For terms starting with a function symbol, say
f (t1 , . . . , tn ), we can use the same idea as for atomic predicates: pick a binary function
symbol fun that takes a representation of f and a representation of the tuple t1, . . . , tn.
Therefore

⌜f(t1, . . . , tn)⌝ = fun(⌜f⌝, ⌜t1, . . . , tn⌝).
We can actually use this same representation for names by considering them as function
symbols that take no arguments. For example,

⌜Sidra⌝ = fun(f_Sidra, nil)

where f_Sidra is a constant chosen to represent the name Sidra. Again, the representation
of each name or function symbol can be any constant we want as long as it
doesn’t clash with other constants.
The last kind of terms we need to give an encoding for are variables. We can do
so in several ways. For example, we can assign a distinct number to every variable
and use a unary function symbol var applied to this number as the encoding of this
variable. Therefore, if x is given number 42, then ⌜x⌝ = var(42). We could also use
the string “x”. What matters is that every occurrence of x is encoded in the same way
and that we can tell different variables apart.
Good! We know how to encode terms and tuples of terms.
At this point, we are able to represent any first-order formula that does not make use
of quantifiers. Take for example LogicLover (father (Sidra)) → LogicLover (Sidra)
— remember? we used it to express that if Sidra’s father loves logic, then Sidra must
love logic too. It is encoded as follows:

imp(atomic(p_LogicLover,
           cons(fun(f_father,
                    cons(fun(f_Sidra, nil),
                         nil)),
                nil)),
    atomic(p_LogicLover,
           cons(fun(f_Sidra, nil),
                nil)))

where we have used the constant p_LogicLover to represent the predicate symbol
LogicLover, and similarly used the constants f_father and f_Sidra to represent the
function symbol father and the name Sidra, respectively. Reading off the components:
the outer imp is the infix connective → (encoded as a prefix function symbol), each
atomic(...) is one of the two atomic formulas, and the cons/fun/nil structure spells out
their argument tuples.
That’s getting pretty hairy, isn’t it? Writing

pLogicLover (father (Sidra)) → LogicLover (Sidra)q
for this term (without bothering about the sausage-making) makes it easier to see what’s
going on.
Quantifiers
To finish off the encoding of formulas, all we need is a representation of the quantifiers.
We can use the technique we employed earlier, but we have to be careful with the
variables. Let’s pick the binary function symbol all to encode the universal quantifier.
Its first argument will be the quantification variable and the second will be an encoding
of the quantified formula. Then,

p∀x. ϕq = all (pxq, pϕq)
The same encoding used for x in the first argument should be used for each occurrence
of x in ϕ, at least the ones bound by ∀x.
We can do the same for the existential quantifier, with a binary function symbol of its
own, say exists:

p∃x. ϕq = exists(pxq, pϕq)
As an example, the formula ∀x. LogicLover (father (x)) → LogicLover (x) is en-
coded as:
all (var (29),                               ∀x.
    imp(atomic(p_LogicLover ,                  LogicLover (
          cons(fun(f_father ,                    father (
                 cons(var (29),                    x
                      nil )),                    )
               nil )),                         )
        atomic(p_LogicLover ,                → LogicLover (
          cons(var (29),                         x
               nil ))))                        )
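Continuing the hypothetical Haskell sketch one last step, formulas and quantifiers can
be rendered as below; evaluating encodeFormula example reproduces the value of the
term displayed above (the datatype Formula and the symbol exists are our choices):

    -- Full formulas: atoms, connectives (two shown), and quantifiers.
    data Formula = Atom String [FoTerm]
                 | FAnd Formula Formula
                 | FImp Formula Formula  -- other connectives are analogous
                 | All Int Formula       -- quantified variable, by number
                 | Ex  Int Formula

    encodeFormula :: Formula -> Term
    encodeFormula (Atom p ts) =
      App "atomic" [App ("p_" ++ p) [], encodeTuple (map encodeTerm ts)]
    encodeFormula (FAnd f g) = App "and" [encodeFormula f, encodeFormula g]
    encodeFormula (FImp f g) = App "imp" [encodeFormula f, encodeFormula g]
    encodeFormula (All x f)  = App "all"    [encodeTerm (Var x), encodeFormula f]
    encodeFormula (Ex  x f)  = App "exists" [encodeTerm (Var x), encodeFormula f]

    -- ∀x. LogicLover(father(x)) → LogicLover(x), with x numbered 29:
    example :: Formula
    example = All 29 (FImp (Atom "LogicLover" [Fun "father" [Var 29]])
                           (Atom "LogicLover" [Var 29]))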
Sets of Formulas
Since we now know about lists, we can use them to encode the premises of an in-
ference. Therefore, if Γ = ϕ1 , . . . , ϕn , we define pΓq as the list of the encodings
pϕ1 q, . . . , pϕn q.
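In the running sketch this last step is a one-liner (encodeContext is our name):

    -- A set of premises Γ = ϕ1, ..., ϕn becomes the list of its encodings.
    encodeContext :: [Formula] -> Term
    encodeContext = encodeTuple . map encodeFormula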
This completes the representation of the language of first-order logic as terms. That
was not too bad, was it? Observe that none of our encodings involved introducing new
predicates. We were able to represent terms, formulas and sequences of each as terms
by introducing new function symbols and using them in a judicious way.
12.2.2 Encoding Derivability

To capture derivability, we introduce a binary predicate Derivable, with the intent
that Derivable(pΓq, pϕq) holds exactly when the inference Γ ` ϕ is derivable. We
define it by writing one first-order formula for each derivation rule. Consider first
rule >I ,

 ────── >I
   >

It says that > is always derivable, whatever assumptions are around. Then, the formula

∀y. Derivable(y, true)
defines the derivability of p>q (that’s the constant true) from any set of assumptions
(that’s y).
Next, consider the rule ∧I ,
 ϕ1    ϕ2
─────────── ∧I
 ϕ1 ∧ ϕ2
It tells us that, if we can derive ϕ1 from the current assumptions, say Γ, and ϕ2 from
Γ, then we can derive ϕ1 ∧ ϕ2 from Γ. Then, the following first-order formula
captures exactly what we just described:
Derivable(pΓq, pϕ1 q)
∧ Derivable(pΓq, pϕ2 q)
→ Derivable(pΓq, and (pϕ1 q, pϕ2 q))
which is exactly the above process of showing that ϕ1 ∧ ϕ2 is derivable, but at the
level of our representation of first-order logic.
We can do this for every derivation rule of propositional logic. The rules for the
quantifiers make use of some additional operations however. The one that will be most
interesting for us in this chapter is substitution, the operation of replacing every occur-
rence of a variable x with a term t in a formula ϕ(x) — we denoted the result as ϕ(t)
in rules ∀E and ∃I in Chapter 7. We express substitution as the predicate subst(x, y, z),
defined in such a way that subst(pϕ(x)q, t, T ) is true exactly when T = pϕ(t)q, i.e.,
when the term T is the representation of the formula obtained by substituting every
occurrence of x in formula ϕ(x) with term t.1
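While subst is a predicate defined inside first-order logic, its computational content is
easy to sketch in our running Haskell rendering. The function below (substRep, our
name) is only a naive approximation: it pays no attention to bound variables, whereas
the real subst must.

    -- Replace every occurrence of var(n) in a representation by the
    -- (already encoded) term t.  A naive sketch: it ignores binding,
    -- so it would also rewrite occurrences bound by a quantifier.
    substRep :: Int -> Term -> Term -> Term
    substRep n t (App "var" [App m []])
      | m == show n = t                    -- an occurrence of x: replace it
    substRep n t (App f args) = App f (map (substRep n t) args)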
Encoding derivability for the quantifiers makes use of a second auxiliary predicate,
fresh, which produces a new name, as required by rules ∀I and ∃E . One last auxiliary
predicate is needed to draw the conclusion of an inference from its assumptions: the
predicate lookup is such that lookup(pϕq, pΓq) is true exactly when ϕ ∈ Γ.
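In the sketch, lookup is just membership in an encoded list (lookupRep is our name;
it relies on the derived equality of Term):

    -- Is the encoded formula among the elements of the encoded list?
    lookupRep :: Term -> Term -> Bool
    lookupRep phi (App "cons" [x, rest]) = phi == x || lookupRep phi rest
    lookupRep _   _                      = False   -- nil, or anything else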
Spelling out all the details of the definition of Derivable is not hard but a bit tedious.
We will skip it, and simply write ΓFOL for the set of all the formulas defining Derivable
and its auxiliary predicates. What really matters is what this definition is able to do. It
captures the statement “Γ ` ϕ is derivable” very precisely. Specifically, Derivable has
the following two properties:

If Γ ` ϕ, then ΓFOL ` Derivable(pΓq, pϕq) (12.1)
If Γ 6` ϕ, then ΓFOL 6` Derivable(pΓq, pϕq) (12.2)
Note that the second line says that whenever Γ ` ϕ is not derivable, neither is ΓFOL `
Derivable(pΓq, pϕq). It does not say that ΓFOL ` ¬Derivable(pΓq, pϕq) is deriv-
able, which is a much stronger property, as we will see. This is because of the unde-
cidability result we mentioned in Chapter 7.
12.3 Meta-Mathematics
We did it! We have defined when an inference is valid using first-order logic. We are
all set to reason about logic in logic. This is big business in logic, to the point that
it has a fancy name: it is called meta-mathematics when reasoning about aspects of
mathematics in logic, or more narrowly meta-logic when focusing on just logic.
What kind of reasoning tasks shall we do, then? Let us start with some simple
properties, and then get to something a bit more fancy.
1 We treat x as a distinguished variable — typically it is the only free variable in ϕ. It is easy to generalize
subst to take a fourth argument, the representation of the variable to be substituted with t.
2 Strictly speaking, such internalized statements should mention universally quanti-
fied variables rather than terms such as pΓq and pϕq, and additional components would force these variables
to stand for the representation of the various entities.
Consistency
One important property of first-order logic is that there is no formula such that both it
and its negation are derivable from no premises. Logicians call this property consis-
tency. Here is its mathematical statement:

There is no formula ϕ such that both · ` ϕ and · ` ¬ϕ.
If it didn’t hold, rule ¬E would be able to derive every formula we can think of. This
would be absurd: we could show that both “Sidra loves logic” and “Sidra does not
love logic” are true statements out of thin air. An inconsistent logic, where everything
is true, is not very useful. Good thing first-order logic is consistent!
We can internalize consistency in first-order logic just like we did for the weakening
and substitution lemmas. It is expressed as follows:
ΓFOL ` ¬∃x. Derivable(p·q, x) ∧ Derivable(p·q, not(x)) (12.3)
Consistency is retained even if we add ΓFOL as premises. This means that ΓFOL is
itself consistent, which is expressed by the following lemma.
Lemma 12.3 (Consistency of ΓFOL ) There is no formula ϕ such that ΓFOL ` ϕ and
ΓFOL ` ¬ϕ.
Diagonalization
Recall that we introduced the predicate symbol subst to capture the way substitu-
tion works. In particular, ΓFOL ` subst(pϕ(x)q, t, T ) is derivable exactly when T
is pϕ(t)q, the representation of the result of substituting t for x in ϕ(x). One conse-
quence of this is that the inference
ΓFOL ` ∃z. subst(pϕ(x)q, t, z)
is always derivable. This is by using rule ∃I with witness pϕ(t)q for the variable z.
That’s pretty simple stuff, isn’t it? True, so let’s add a twist. Consider the following
formula ψ(x) where ϕ(z) is any formula you want:
ψ(x) = ∃z. ϕ(z) ∧ subst(x, x, z)
Stare at the second part, subst(x, x, z), and you will realize it’s wicked! The first
argument of subst stands for the representation of a formula. Instantiating x to the
representation of some formula χ(x), we get the formula ψ(pχ(x)q) which contains
subst(pχ(x)q, pχ(x)q, z) — this formula wants to substitute x in χ(x) with χ(x)’s
own representation! By the above discussion, this has a derivation if and only if z =
pχ(pχ(x)q)q. This is all a bit twisted, but it kind of makes sense, in a weird way.
Let’s then add another twist: we will pick ψ(x) itself as the formula χ(x). That is,
we are looking at the formula δ given by
δ = ψ(pψ(x)q)
This is ψ(x) applied to its own representation. Now that’s twisted!! Let’s unfold it like
we did earlier:
δ = ψ(pψ(x)q)
= ∃z. ϕ(z) ∧ subst(pψ(x)q, pψ(x)q, z)
Now, δ is derivable from ΓFOL exactly when ∃z. ϕ(z) ∧ subst(pψ(x)q, pψ(x)q, z)
is, since they are the same formula. For this formula to be derivable, it must be the
case that there is a term T such that ϕ(T ) ∧ subst(pψ(x)q, pψ(x)q, T ) is derivable, which in turn
requires that both
ϕ(T ) and subst(pψ(x)q, pψ(x)q, T )
be derivable in ΓFOL .
By the previous observation, the formula on the right is derivable in ΓFOL exactly
for T = pψ(pψ(x)q)q. But notice that this is pδq! This means that δ will be derivable
in ΓFOL exactly when ϕ(pδq) is.
Starting from an arbitrary formula ϕ(z), we were able to construct a formula δ
that is derivable exactly when ϕ applied to δ’s very own representation, i.e., ϕ(pδq),
is derivable.
Logicians found this result so remarkable that they gave it a name, the diagonalization
lemma. It is usually expressed in the following way, which is equivalent to what we
have just obtained.
Lemma 12.4 (Diagonalization) For every formula ϕ(z) there exists a formula δ such
that
ΓFOL ` δ ↔ ϕ(pδq)
The formula δ is a self-referential sentence that can be viewed as saying of itself that it
has the property ϕ.
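The diagonal construction is itself computable. In our running Haskell sketch, one
re-encoding and one substitution take us from pψ(x)q to pδq (quoteTerm and diag are
our names; this illustrates only the bookkeeping behind the lemma, not its proof):

    -- Encode a term as a term: each function symbol f in the term becomes
    -- the constant f_f, exactly as in the encoding of Section 12.2.1.
    quoteTerm :: Term -> Term
    quoteTerm (App f args) =
      App "fun" [App ("f_" ++ f) [], encodeTuple (map quoteTerm args)]

    -- Given the number of x and the representation of ψ(x), compute the
    -- representation of δ = ψ(pψ(x)q): substitute the quoted representation
    -- of ψ(x) for x inside that representation itself.
    diag :: Int -> Term -> Term
    diag x rep = substRep x (quoteTerm rep) rep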
Incompleteness
The diagonalization lemma holds for every formula ϕ(z). So, it holds also if we choose
the formula ¬Derivable(pΓFOL q, z) as ϕ(z). It then tells us that there exists a formula
δG such that
ΓFOL ` δG ↔ ¬Derivable(pΓFOL q, pδG q) (12.4)
This is weird: we are applying Derivable on the representation of the set of premises
ΓFOL that we used to define Derivable. Is this another of those twisted sentences?
Let’s see what it means. We know that Derivable(pΓFOL q, pδG q) is derivable when
δG is derivable from ΓFOL . Then, this inference says that δG is true if and only if it is
not derivable in ΓFOL , and all this has a derivation in ΓFOL !!!
Wicked! But is δG true? Is it false? We’ve got to check. Since δG can use all
kinds of things defined in ΓFOL , what we are really asking is whether ΓFOL ` δG or
ΓFOL ` ¬δG .
Assume ΓFOL ` δG :
Since by property (12.1) Derivable represents derivability, for sure we have that
ΓFOL ` Derivable(pΓFOL q, pδG q). But then, by (12.4), our assumption also yields
ΓFOL ` ¬Derivable(pΓFOL q, pδG q). We would be deriving a formula and its negation
from ΓFOL , which contradicts Lemma 12.3. So ΓFOL 6` δG .
Assume ΓFOL ` ¬δG :
By (12.4), this assumption yields ΓFOL ` Derivable(pΓFOL q, pδG q). On the other
hand, we just saw that ΓFOL 6` δG , and so property (12.2) tells us that ΓFOL 6`
Derivable(pΓFOL q, pδG q). Again a contradiction, so ΓFOL 6` ¬δG .
Altogether, we have found that neither ΓFOL ` δG nor ΓFOL ` ¬δG are derivable
in first-order logic. This is odd, because the diagonalization lemma constructed δG
with exactly the kind of things that ΓFOL talks about. This oddity becomes even more
evident if we apply the property (12.2) of ΓFOL on these statements: we get that ΓFOL 6`
Derivable(pΓFOL q, pδG q) and ΓFOL 6` ¬Derivable(pΓFOL q, pδG q).
So, although ΓFOL is a representation of derivability of first-order logic, there is a
statement of first-order logic for which it is unable to give us an answer about whether
it is derivable or not. This inability is called incompleteness. Here is a slightly more
general formulation as a theorem.
Theorem 12.5 (Incompleteness of ΓFOL ) There exists a set of formulas Γ and a for-
mula ϕ such that

ΓFOL 6` Derivable(pΓq, pϕq) and ΓFOL 6` ¬Derivable(pΓq, pϕq)

In other words, even though ΓFOL represents derivability, there are statements,
like δG , for which this encoding cannot tell us whether it is derivable or not derivable.
First-order logic is itself incomplete.
This is not even just a shortcoming of first-order logic, but of any logic that is
consistent and expressive enough to represent itself, in particular its own notion of
validity. This property was first proved by the logician Kurt Gödel in 1930 and it caused
quite a stir in the logic circles of the time. It is called Gödel’s first incompleteness
theorem.3
Theorem 12.6 (Gödel’s First Incompleteness Theorem) Any consistent logical sys-
tem expressive enough to capture its own notion of validity has statements about itself
that it can neither prove nor disprove.
Gödel’s first incompleteness theorem is one of the most important results in modern
logic. It establishes fundamental limitations about what logic can do, and therefore
about what we can expect from it. It essentially says that we cannot use a logic to
determine the validity of all of its own statements (however some “stronger” logics can
prove all statements of “weaker” logics).
Gödel’s approach to proving this result was similar to what we saw in Section 12.2,
but with some interesting differences. His starting point was any logic that contained
Peano arithmetic from Chapter 10. Rather than encoding this logic as a set of premises
ΓFOL , Gödel represented terms, formulas, everything as numbers — that’s why he
needed arithmetic in the first place. This process is now known as Gödel numbering
or Gödelization (a word now used for any encoding of logic as terms). Numbers are
enough to encode all of logic! The rest of Gödel’s original proof is very similar to what
we saw, with the exception that, for technical reasons, he encoded derivations and the
statement “D is a derivation of Γ ` ϕ” as a springboard to defining derivability.
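For a taste of how numbers can stand for terms, here is a toy Gödel numbering of our
Term representation (entirely our own construction, and much cruder than Gödel’s
actual scheme): Cantor’s pairing function turns pairs of numbers into single numbers
injectively, and the rest follows by recursion.

    -- Cantor pairing: an injective map from pairs of numbers to numbers.
    pairN :: Integer -> Integer -> Integer
    pairN a b = (a + b) * (a + b + 1) `div` 2 + b

    -- Lists of numbers: 0 for the empty list, 1 + pairing to prepend.
    listN :: [Integer] -> Integer
    listN = foldr (\n rest -> 1 + pairN n rest) 0

    -- Symbols: a string is numbered as the list of its character codes.
    symbolN :: String -> Integer
    symbolN = listN . map (fromIntegral . fromEnum)

    -- A term is numbered by pairing its head symbol with its arguments.
    godelN :: Term -> Integer
    godelN (App f args) = pairN (symbolN f) (listN (map godelN args))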
First incompleteness theorem? Does that mean there are more? Indeed! Gödel
showed with his second incompleteness theorem that no consistent logic expressive
enough to capture its own notion of validity can prove that it is in fact consistent. This
means in particular that we cannot use first-order logic to prove that first-order logic is
consistent! So, inference (12.3) does not have a derivation even if it expresses a true
statement about first-order logic!! We need a more expressive logic for this . . . and an
even more expressive logic to prove the consistency of that logic . . . and so on . . . for
ever.
3 Are you confused by these two names? You have every right to be! Completeness as in Theorem 8.2 and incompleteness as in Theorem 12.5 have nothing to do with
each other. The first property shows that interpretations and derivations capture the same notion of validity.
The second exposes the inability of many logical systems to tell whether their own inferences are valid.
This unfortunate similarity in names has confused generations of students, and even some professors. Not
only that, but both were proved by the same person!
In particular, no mechanical procedure can be devised to decide all first-order inferences, and definitely not consistency. The same applies
to more powerful logics. These limitations of mechanical reasoning also anticipate
limitations of what computers can do, a notion that was established a few years after
Gödel’s results by the logician Alan Turing (1912–1954).
12.5 Exercises
1. Give an encoding of the following formulas:
• ∀x. ∃y. sister (x, y) ∧ logicLover (x) → logicLover (y)
• ∀x. plus(s(0), x) = s(x)
2. Complete the encoding of the language of first-order logic in Section 12.2.1 by
giving a precise mathematical definition of ptq for a term t, of pϕq for a first-
order formula ϕ and of pΓq for a set of formulas Γ = ϕ1 , . . . , ϕn .
3. Give a definition of the substitution predicate subst(x, y, z) discussed in Sec-
tion 12.2.2 in first-order logic. Your definition will need to make use of auxiliary
predicates for terms and tuples — define them as well. As you do so, it will be
clear that you need to augment subst with a fourth argument.
4. Define the predicates
• lookup such that lookup(pΓq, pϕq) is true if and only if ϕ occurs in Γ.
• fresh such that fresh(pΓq, pϕq, a) is true if and only if a is the represen-
tation of a constant that does not occur in Γ and ϕ. To do so, you need to
commit to a specific encoding of names.
5. Now that you have all the ingredients, complete the definition of Derivable.
6. In this exercise, we will explore a variant of Derivable that expresses the state-
ment “D is a derivation of Γ ` ϕ” rather than just “Γ ` ϕ is derivable”. This
is quite close to how Gödel proved his first incompleteness theorem. We will
encode this statement by means of a ternary predicate Derives such that
Derives(pDq, pΓq, pϕq) is true exactly when D is a derivation of Γ ` ϕ. For
example, rule >I applied to no premises
is a derivation of >. Thus, if we pick the constant TrueI to represent this rule,
then
∀y. Derives(TrueI , y, true)
defines the derivability of p>q from any set of assumptions using TrueI as the
representation of the inference consisting of just rule >I .
Similarly, rule ∧I can be captured by picking a binary function symbol, say AndI ,
to represent an application of ∧I to two subderivations:

∀x1 . ∀x2 . ∀y. ∀d1 . ∀d2 . Derives(d1 , y, x1 ) ∧ Derives(d2 , y, x2 )
→ Derives(AndI (d1 , d2 ), y, and (x1 , x2 ))

If we instantiate the variables to pϕ1 q, pϕ2 q, pΓq, pD1 q and pD2 q respectively,
we get

Derives(pD1 q, pΓq, pϕ1 q) ∧ Derives(pD2 q, pΓq, pϕ2 q)
→ Derives(AndI (pD1 q, pD2 q), pΓq, and (pϕ1 q, pϕ2 q))

Complete the definition of Derives by giving a similar formula for each remaining
derivation rule.
Collected Rules
 ────── >I
   >

  ϕ1    ϕ2           ϕ1 ∧ ϕ2           ϕ1 ∧ ϕ2
 ────────── ∧I      ───────── ∧E1     ───────── ∧E2
  ϕ1 ∧ ϕ2              ϕ1                ϕ2

             ϕ1    ϕ2
             ..    ..
              .     .
  ϕ1 ∨ ϕ2     ψ     ψ
 ────────────────────── ∨E
            ψ

    ϕ1                   ϕ2
 ────────── ∨I1       ────────── ∨I2
  ϕ1 ∨ ϕ2              ϕ1 ∨ ϕ2

   ⊥
 ───── ⊥E            (no ⊥I )
   ψ

                        ϕ
                        ..
                        .
  ϕ    ϕ → ψ            ψ
 ───────────── →E    ──────── →I
      ψ               ϕ → ψ
                ϕ
                ..
                .
  ϕ    ¬ϕ       ⊥            ¬¬ϕ
 ────────── ¬E  ────── ¬I    ────── ¬2
     ψ           ¬ϕ            ϕ
                 (a)
                  ..
                  .
  ∀x. ϕ(x)       ϕ(a)
 ─────────── ∀E  ─────────── ∀I
    ϕ(t)         ∀x. ϕ(x)

                  (a)
                 ϕ(a)
                  ..
                  .
  ∃x. ϕ(x)        ψ              ϕ(t)
 ──────────────────── ∃E      ─────────── ∃I
          ψ                    ∃x. ϕ(x)

(In rules ∀I and ∃E , the name a must be fresh: it may not occur anywhere else in the
derivation.)
A.2 Equality
  t1 = t2    ϕ(t2 )
 ─────────────────── cong
       ϕ(t1 )

              t2 = t1           t1 = t2    t2 = t3
 ────── refl  ──────── sym     ────────────────── trans
  t = t        t1 = t2               t1 = t3
A.3 Arithmetic
 ─────── natI1
  nat(0)

   nat(t)
 ─────────── natI2
  nat(s(t))

                   (x)
            nat(x)   ϕ(x)
                  ..
                  .
  nat(n)   ϕ(0)   ϕ(s(x))
 ───────────────────────── natE
           ϕ(n)