Proof Theory and Philosophy
Greg Restall
Philosophy Department
University of Melbourne
restall@unimelb.edu.au
http://consequently.org/writing/ptp
introduction
This manuscript is a draft of a guided introduction to logic and its applications in philosophy. The focus will be a detailed examination of the different ways to understand proof. Along the way, we will also take a few glances around to the other side of inference, the kinds of counterexamples to be found when an inference fails to be valid.

"I should like to outline an image which is connected with the most profound intuitions which I always experience in the face of logistic. That image will perhaps shed more light on the true background of that discipline, at least in my case, than all discursive description could. Now, whenever I work even on the least significant logistic problem, for instance, when I search for the shortest axiom of the implicational propositional calculus, I always have the impression that I am facing a most powerful, most coherent and most resistant structure. I sense that structure as if it were a concrete, tangible object, made of the hardest metal, a hundred times stronger than steel and concrete. I cannot change anything in it; I do not create anything of my own will, but by strenuous work I discover in it ever new details and arrive at unshakable and eternal truths. Where is and what is that ideal structure? A believer would say that it is in God and is His thought." — Jan Łukasiewicz

The book is designed to serve a number of different purposes, and it can be used in a number of different ways. In writing the book I have at least these four aims in mind.

a gentle introduction to key ideas in the theory of proof: The literature on proof theory contains some very good introductions to the topic. Bostock's Intermediate Logic [9], Tennant's Natural Logic [87], Troelstra and Schwichtenberg's Basic Proof Theory [91], and von Plato and Negri's Structural Proof Theory [56] are all excellent books, with their own virtues. However, they all introduce the core ideas of proof theory in what can only be described as a rather complicated fashion. The core technical results of proof theory (normalisation for natural deduction and cut elimination for sequent systems) are relatively simple ideas at their heart, but the expositions of these ideas in the available literature are quite difficult and detailed. This is through no fault of the existing literature. It is due to a choice. In each book,
a proof system for the whole of classical or intuitionistic logic is in-
troduced, and then, formal properties are demonstrated about such a
system. Each proof system has different rules for each of the connect-
ives, and this makes the proof-theoretical results such as normalisation
and cut elimination case-ridden and lengthy. (The standard techniques
are complicated inductions with different cases for each connective: the
more connectives and rules, the more cases.)
In this book, the exposition will be somewhat different. Instead
of taking a proof system as given and proving results about it, we will
first look at the core ideas (normalisation for natural deduction, and cut
elimination for sequent systems) and work with them in their simplest
and purest manifestation. In Section 2.1.2 we will see a two-page norm-
alisation proof. In Section 2.2.3 we will see a two-page cut-elimination
proof. In each case, the aim is to understand the key concepts behind
the central results.
particular logical system. Instead of attempting to justify this or that
formal system, we will give an overview of the panoply of different
accounts of consequence for which a theory of proof has something
interesting and important to say. As a result, in Chapter 2 we will
examine the behaviour of conditionals from intuitionistic, relevant and
linear logic. The system of natural deduction we will start off with is
well suited to them. In Chapter 2, we also look at a sequent system for
the non-distributive logic of conjunction and disjunction, because this
results in a very simple cut elimination proof. From there, once we have the groundwork in place, we go on to richer and more complicated settings.
a presentation of new results: Recent work in proof nets and other techniques in non-classical logics like linear logic can usefully illuminate the theory of much more traditional logical systems, like classical logic itself. I aim to present these results in an accessible form, and extend them to show how you can give a coherent picture of classical and non-classical propositional logics, quantifiers and modal operators. An accessible example of this work is Robinson's "Proof nets for Classical Logic" [80].
The book is filled with marginal notes which expand on and comment
on the central text. Feel free to read or ignore them as you wish, and
to add your own comments. Each chapter (other than this one) con-
tains definitions, examples, theorems, lemmas, and proofs. Each of
these (other than the proofs) is numbered consecutively, first with the chapter number, and then with the number of the item within the chapter. Proofs end with a little box at the right margin, like this: □
The manuscript is divided into three parts and each part divides into
two chapters. The parts cover different aspects of logical vocabulary.
First, propositional logic; second, quantifiers, identity and existence;
third, modality and truth. In each part, the first chapter covers lo-
gical tools and techniques suited to the topic under examination. The
second chapter both discusses the issues that are raised in the tools
& techniques chapter, and applies these tools and techniques to differ-
ent issues in philosophy of language, metaphysics, epistemology, philo-
sophy of mathematics and elsewhere.
Each ‘Tools & Techniques’ chapter contains many exercises to complete.
Logic is never learned without hard work, so if you want to learn the
Greg Restall
Melbourne
september 18, 2006
p q p or q
0 0 0
0 1 1
1 0 1
1 1 1
someone is using “or” in the way that you do if you are disposed to
make the following deductions to reason to a disjunction
   p                q
--------         --------
 p or q           p or q

and to reason from a disjunction

              [p]    [q]
               ·      ·
               ·      ·
               ·      ·
 p or q        r      r
--------------------------
            r
That is, you are prepared to infer to a disjunction on the basis of either
disjunct; and you are prepared to reason by cases from a disjunction.
Is there any more you need to do to fix the use of “or”? That is, if you
and I both use “or” in a manner consonant with these rules, then is
there any way that our usages can differ with respect to meaning?
Clearly, this is not the end of the story. Any proponent of a proof-
first explanation of the meaning of a word such as “or” will need to
say something about what it is to accept an inference rule, and what
sorts of inference rules suffice to define a concept such as disjunction
(or negation, or universal quantification, and so on). When does a
definition work? What are the sorts of things that can be defined using
inference rules? What are the sorts of rules that may be used to define
these concepts? We will consider these issues in Chapter 3.
These are three examples of the kinds of issues that we will consider in
the light of proof theory. Before we can broach these topics, we need
to learn some proof theory. We will start with proofs for conditional
judgements.
part i
propositional logic
2 | propositional logic: tools & techniques
2.1 | natural deduction for conditionals
We start with modest ambitions. In this section we focus on one way of
understanding inference and proof—natural deduction, in the style of
Gentzen [33]—and we will consider just one kind of judgement: conditionals. (Gerhard Gentzen, German logician: born 1909, student of David Hilbert at Göttingen, died in 1945 in World War II. http://www-groups.dcs.st-and.ac.uk/~history/Mathematicians/Gentzen.html) Conditional judgements are judgements of the form

If . . . then . . .
To make things precise, we will use a formal language in which we can
express conditional judgements. Our language will have an unending
supply of atomic formulas
p, q, r, p0 , p1 , p2 , . . . q0 , q1 , q2 , . . . r0 , r1 , r2 , . . .
15
Our last example of an offending formula—p → p—does not offend
nearly so much. It is not ambiguous. It merely offends against the
letter of the law laid down, and not its spirit. I will feel free to use ex-
pressions such as “p → p” or “(p → q) → (q → r)” which are missing
their outer parentheses, even though they are, strictly speaking, not formulas. If you like, you can think of them as including their outer parentheses very faintly, like this: ((p → q) → (q → r)).

Given a formula containing at least one arrow, such as (p → q) → (q → r), it is important to be able to isolate its main connective (the last
arrow introduced as it was constructed). In this case, it is the middle
arrow. The formula to the left of the arrow (in this case p → q) is said
to be the antecedent of the conditional, and the formula to the right is
the consequent (here, q → r).
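To make the shape of this little language concrete, here is a minimal Haskell sketch of the conditional formulas and the decomposition just described. The datatype and function names (Formula, antecedent, consequent, and so on) are my own choices for illustration, not anything fixed by the text.

-- Formulas of the conditional language: atoms and conditionals.
data Formula = Atom String
             | Formula :-> Formula
  deriving (Eq, Ord, Show)

infixr 5 :->   -- like →, the arrow constructor associates to the right

-- The main connective of a conditional is its outermost arrow; its
-- immediate parts are the antecedent and the consequent.
antecedent, consequent :: Formula -> Maybe Formula
antecedent (a :-> _) = Just a
antecedent (Atom _)  = Nothing
consequent (_ :-> b) = Just b
consequent (Atom _)  = Nothing

-- Example: (p → q) → (q → r) has antecedent p → q and consequent q → r.
example :: Formula
example = (Atom "p" :-> Atom "q") :-> (Atom "q" :-> Atom "r")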
We can think of these formulas in at least two different ways. We can
think of them as the sentences in a toy language. This language is
either something completely separate from our natural languages, or
it is a fragment of a natural language, consisting only of atomic ex-
pressions and the expressions you can construct using a conditional
construction like “if . . . then . . . ” On the other hand, you can think
of formulas as not constituting a language themselves, but as construc-
tions used to display the form of expressions in a language. Nothing
here will stand on which way you understand formulas. In either case,
we use the conditional p → q to represent the conditional proposition
with antecedent p and consequent q.
Sometimes, we will want to talk quite generally about all formulas of
a particular form. We will want to do this very often, when it comes
to logic, because we are interested in the structures or forms of valid
arguments. The structural or formal features of arguments apply gen-
erally, to more than just a particular argument. (If we know that an
argument is valid in virtue of its possessing some particular form, then
other arguments with that form are valid as well.) So, these formal or
structural principles must apply generally. Our formal language goes
some way to help us express this, but it will turn out that we will not
want to talk merely about specific formulas in our language, such as
(p3 → q7 ) → r26 . We will, instead, want to say things like
This can get very complicated very quickly. It is not at all convenient
to say
A → (B → C) [A](1)
B→C B
C
[1]
A→C
So, it seems we can reason like this. At the step marked with [1], we
make the inference to the conditional conclusion, on the basis of the
reasoning up until that point. Since we can infer to C using A as an
assumption, we can conclude A → C. At this stage of the reasoning, A
is no longer active as an assumption: we discharge it. It is still a leaf
of the tree (there is no node of the tree above it), but it is no longer
an active assumption in our reasoning. So, we bracket it, and annotate
the brackets with a label, indicating the point in the demonstration at
which the assumption is discharged. Our proof now has two assump-
tions, A → (B → C) and B, and one conclusion, A → C.
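As a concrete gloss on this tree structure, here is a small Haskell sketch (building on the Formula type above) of proofs built from →E and →I, together with functions computing a proof's conclusion and its undischarged assumptions. The representation and names are illustrative assumptions of mine, not the book's official definitions.

-- A proof is an assumption (tagged with a discharge label), an →E step,
-- or an →I step that discharges the assumptions with a given label.
data Proof = Assume Formula Int      -- leaf: an assumption [A](i)
           | ImpE Proof Proof        -- →E: major premise A → B, minor premise A
           | ImpI Int Formula Proof  -- →I,i: discharge assumptions of this formula labelled i

-- The conclusion of a proof, if the steps fit together properly.
conclusion :: Proof -> Maybe Formula
conclusion (Assume a _)   = Just a
conclusion (ImpE maj mnr) =
  case (conclusion maj, conclusion mnr) of
    (Just (a :-> b), Just a') | a == a' -> Just b
    _                                   -> Nothing
conclusion (ImpI _ a p)   = (a :->) <$> conclusion p

-- The active (undischarged) assumptions, as a list with multiplicity.
assumptions :: Proof -> [(Formula, Int)]
assumptions (Assume a i)   = [(a, i)]
assumptions (ImpE maj mnr) = assumptions maj ++ assumptions mnr
assumptions (ImpI i a p)   = [ (b, j) | (b, j) <- assumptions p, (b, j) /= (a, i) ]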
A → B      A
--------------- →E
      B

  [A](i)
    ·
    ·
    ·
    B
------------ →I,i
  A → B
[Three example proofs, set side by side in the original: a proof from the premise A → B to the conclusion (B → C) → (A → C) (suffixing, as an inference); a proof of the formula A → ((A → B) → B) (assertion); and a proof of the formula (A → B) → ((C → A) → (C → B)) (prefixing). Each uses →E together with →I steps discharging the labelled assumptions.]
A → (A → B) A
→E
A→B A
→E
B
1. A ∴ A is valid.
2. X, A ∴ B is v-valid if and only if X ∴ A → B is v-valid.
3. If X, A ∴ B and Y ∴ A are both v-valid, so is X, Y ∴ B.
4. If X ∴ B is affine or standardly valid, so is X, A ∴ B.
5. If X, A, A ∴ B is relevantly or standardly valid, so is X, A ∴ B.
Proof: It is not difficult to verify these claims. The first is given by the
proof consisting of A as premise and conclusion. For the second, take
a proof π from X, A to B, and in a single step →I, discharge the (single
instance of) A to construct the proof of A → B from X. Conversely,
if you have a proof from X to A → B, add a (single) premise A and
apply →E to derive B. In both cases here, if the original proofs satisfy
a constraint (vacuous or multiple discharge) so do the new proofs.
For the third fact, take a proof from X, A to B, take the instance of the assumption A indicated in the premises, and replace it with the proof from Y to A. The result is a proof, from X, Y to B as
desired. This proof satisfies the constraints satisfied by both of the
original proofs.
For the fourth fact, if we have a proof π from X to B, we can extend
this as follows
X
·
·π
·
B
→I
A→B A
→E
B
to construct a proof of B involving the new premise A, as well as the original premises X. The →I step requires a vacuous discharge.
Finally, if we have a proof π from X, A, A to B (that is, a proof with X and two instances of A as premises to derive the conclusion B) we may discharge both instances of A in a single →I step and then apply →E with one new premise A, like this:
X, [A, A](i)
·
·π
·
B
→I,i
A→B A
→E
B
Now, we might focus our attention on the distinction between those
arguments that are valid and those that are not—to focus on facts about
validity such as those we have just proved. That would be to ignore the
distinctive features of proof theory. We care not only that an argument
is proved, but how it is proved. For each of these facts about validity,
we showed not only the existential fact (for example, if there is a proof
from X, A to B, then there is a proof from X to A → B) but the stronger
and more specific fact (if there is a proof from X, A to B then from this
proof we construct the proof from X to A → B in this uniform way).
» «
It is often a straightforward matter to show that an argument is valid.
Find a proof from the premises to the conclusion, and you are done.
Showing that an argument is not valid seems more difficult. According
to the literal reading of this definition, if an argument is not valid there
is no proof from the premises to the conclusion. So, the direct way to
show that an argument is invalid is to show that it has no proof from
the premises to the conclusion. But there are infinitely many proofs!
You cannot simply go through all of the proofs and check that none
of them are proofs from X to A in order to convince yourself that the
argument is not valid. To accomplish this task, subtlety is called for.
We will end this section by looking at how we might summon up the
required skill.
One subtlety would be to change the terms of discussion entirely,
and introduce a totally new concept. If you could show that all valid
arguments have some special property – and one that is easy to detect
when present and when absent – then you could show that an argu-
ment is invalid by showing it lacks that special property. How this
might manage to work depends on the special property. We shall look
at one of these properties in Section 2.5 when we show that all valid ar-
guments preserve truth in models. Then to show that an argument is
invalid, you could provide a model in which truth is not preserved from
the premises to the conclusion. If all valid arguments are truth-in-a-
model-preserving, then such a model would count as a counterexample
to the validity of your argument.
before:

    [A](i)
      ·
      · π1               ·
      ·                  · π2
      B                  ·
  ----------- →I,i
    A → B              A
  ---------------------------- →E
              B

after:

      ·
      · π2
      ·
      A
      ·
      · π1
      ·
      B
The result is a proof of B from the same premises as our original proof.
The premises are the premises of π1 (other than the instances of A that
were discharged in the other proof) together with the premises of π2 .
This proof does not go through the formula A → B, so it is, in a sense,
simpler.
Well . . . there are some subtleties with counting, as usual with our
proofs. If the discharge of A was vacuous, then we have nowhere to
plug in the new proof π2 , so the premises of π2 don’t appear in the
final proof. On the other hand, if a number of duplicates of A were
discharged, then the new proof will contain that many copies of π2 ,
and hence, that many copies of the premises of π2 . Let’s make this
discussion more explicit, by considering an example where π1 has two instances of A in the premise list. The original proof, containing the indirect pair, is this:
A → (A → B) [A](1)
→E
A→B [A](1) [A](2)
→E →I,2
B (A → A) → A A → A
→I,1 →E
A→B A
→E
B
We can cut out the →I/→E pair (we call such pairs indirect pairs)
using the technique described above: we place a copy of the inference
to A at both places that the A is discharged (with label 1). The result is
this proof, which does not make that detour.
[A](2)
→I,2
(A → A) → A A → A [A](2)
→E →I,2
A → (A → B) A (A → A) → A A → A
→E →E
A→B A
→E
B
which is a proof from the same premises (A → (A → B) and (A →
A) → A) to the same conclusion B, except for multiplicity. In this
proof the premise (A → A) → A is used twice instead of once. (Notice
too that the label ‘2’ is used twice. We could relabel one subproof to
A → A to use a different label, but there is no ambiguity here because
the two proofs to A → A do not overlap. Our convention for labelling
is merely that at the time we get to an →I label, the numerical tag is
unique in the proof above that step.)
We have motivated the concept of normality. Here is the formal defin-
ition:
So, a normal proof is one without any indirect pairs. It has no detour
formulas.
Normality is not only important for proving that an argument is in-
valid by showing that it has no normal proofs. The claim that every
valid argument has a normal proof could well be vital. If we think of
the rules for conditionals as somehow defining the connective, then
proving something by means of a roundabout →I/→E step that you
definition 2.1.10 [subformulas and parse trees] The parse tree for
an atom is that atom itself. The parse tree for a conditional A → B
is the tree containing A → B at the root, connected to the parse tree
for A and the parse tree for B. The subformulas of a formula A are
those formulas found in A’s parse tree. We let sf(A) be the set of all
subformulas of A. sf(p) = {p}, and sf(A → B) = {A → B}∪ sf(A)∪ sf(B).
To generalise, when X is a multiset of formulas, we will write sf(X) for
the set of subformulas of each formula in X.
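Here is a small Haskell rendering of this definition, reusing the Formula type sketched earlier (which derives Ord, so that Data.Set can be used). A minimal sketch, not the book's notation.

import qualified Data.Set as Set

-- sf(p) = {p};  sf(A → B) = {A → B} ∪ sf(A) ∪ sf(B)
sf :: Formula -> Set.Set Formula
sf f@(Atom _)  = Set.singleton f
sf f@(a :-> b) = Set.insert f (sf a `Set.union` sf b)

-- For a multiset (here, simply a list) of formulas X, sf(X) collects the
-- subformulas of each member of X.
sfAll :: [Formula] -> Set.Set Formula
sfAll = Set.unions . map sf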
Notice that this is not the case for non-normal proofs. Consider the
following circuitous proof from A to A.
[A](1)
→I,1
A→A A
→E
A
Proof: Normal proofs from p to q (if there are any) contain only for-
mulas in sf(p, q): that is, they contain only p and q. That means they
contain no →I or →E steps, since they contain no conditionals at all.
It follows that any such proof must consist solely of an assumption.
As a result, the proof cannot have a premise p that differs from the
conclusion q. There is no normal proof from p to q.
Consider the second example: If there is a normal proof of p → (q →
r), from p → r, it must end in an →I step, from a normal (relevant)
proof from p → r and p to q → r. Similarly, this proof must also end
in an →I step, from a normal (relevant) proof from p → r, p and q to r.
Now, what normal relevant proofs can be found from p → r, p and q to
r? There are none! Any such proof would have to use q as a premise
somewhere, but since it is normal, it contains only subformulas of p →
r, p, q and r—namely those formulas themselves. There is no formula
involving q other than q itself on that list, so there is nowhere for q
to go. It cannot be used, so it will not be a premise in the proof. There
is no normal relevant proof from the premises p → r, p and q to the
conclusion r.
[A](i)
· ·
· π1 · π2
· ·
B · A
· π2 ·
→I,i · · π1
A→B A ·
→E B
·
B ·
· ·
· C
·
C
If there is no π′ such that π ⇝ π′, then π is normal. If π0 ⇝ π1 ⇝ · · · ⇝ πn we write "π0 ⇝∗ πn" and we say that π0 reduces to πn in a number of steps. (We allow that π ⇝∗ π: a proof reduces to itself in zero steps.) We aim to show that for any proof π, there is some normal π∗ such that π ⇝∗ π∗.
The only difficult part in proving the normalisation theorem is show-
ing that the process of reduction can terminate in a normal proof. In the
case where we do not allow duplicate discharge, there is no difficulty at
all.
Proof [Theorem 2.1.13: linear and affine cases]: If π is a linear proof,
or is an affine proof, then whenever you pick an indirect pair and nor-
malise it, the result is a shorter proof. At most one copy of the proof
π2 for A is inserted into the proof π1 . (Perhaps no substitution is made
in the case of an affine proof, if a vacuous discharge was made.) Proofs
have some finite size, so this process cannot go on indefinitely. Keep de-
leting indirect pairs until there are no pairs left to delete. The result is a
normal proof to the conclusion A. The premises X remain undisturbed,
except in the affine case, where we may have lost premises along the
way. (An assumption from π2 might disappear if we did not need to
make the substitution.) In this case, the premise multiset X′ from the normal proof is a sub-multiset of X, as desired.
If we allow duplicate discharge, however, we cannot be sure that in
normalising we go from a larger to a smaller proof. The example on
page 26 goes from a proof with 11 formulas to another proof with 11
formulas. The result is no smaller, so size is no guarantee that the
process terminates.
To gain some understanding of the general process of transforming a
non-normal proof into a normal one, we must find some other measure
The crucial features of complexity are that each formula has a finite
complexity, and that the proper subformulas of a formula each have a
lower complexity than the original formula. This means that complex-
ity is a good measure for an induction, like the size of a proof.
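The definition of complexity itself falls on a page not reproduced here; one standard choice, which I assume purely for illustration, counts the arrows in a formula. In Haskell, with the Formula type from before:

-- One possible measure of complexity: the number of arrows in a formula.
-- (An assumption on my part; any measure on which proper subformulas are
-- strictly smaller would serve for the induction.)
complexity :: Formula -> Int
complexity (Atom _)  = 0
complexity (a :-> b) = 1 + complexity a + complexity b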
Now, suppose we have a proof containing just one indirect pair, intro-
ducing and eliminating A → B, and suppose that otherwise, π1 (the
proof of B from A) and π2 (the proof of A) are normal.
before:

    [A](i)
      ·
      · π1               ·
      ·                  · π2
      B                  ·
  ----------- →I,i
    A → B              A
  ---------------------------- →E
              B

after:

      ·
      · π2
      ·
      A
      ·
      · π1
      ·
      B
Unfortunately, the new proof is not necessarily normal. The new proof
is non-normal if π2 ends in the introduction of A, while π1 starts off
with the elimination of A. Notice, however, that the non-normality of
the new proof is, somehow, smaller. There is no non-normality with
respect to A → B, or any other formula that complex. The potential
non-normality is with respect to a subformula A. This result would still hold if the proofs π1 and π2 weren't normal themselves, provided that any →I/→E pairs they contain are for formulas less complex than A → B.
If A → B is the most complex detour formula in the original proof,
then the new proof has a smaller most complex detour formula.
For example, consider the proof with the following two detour formu-
las marked:
A → (A → B) [A](1)
→E
A→B [A](1) [A](2)
→E →I,2
B A→A A
→I,1 →E
A→B A
→E
B
To process them we can take them in any order. Eliminating the A → B,
we have
[A](2)
→I,2
A→A A [A](2)
→E →I,2
A → (A → B) A A→A A
→E →E
A→B A
→E
B
which now has two copies of the A → A to be reduced. However, these
copies do not overlap in scope (they cannot, as they are duplicated in
the place of assumptions discharged in an eliminated →I rule) so they
can be processed together. The result is the proof
A → (A → B) A
→E
A→B A
→E
B
Proof [sketch]: Take the detour formulas in π that are eliminated in the move to π1 or to π2. For those not eliminated in the move to π1, mark their corresponding occurrences in π1. Similarly, mark the occurrences of formulas in π2 that are detour formulas in π that are eliminated in the move to π1. Now eliminate the marked formulas in π1 and those in π2 to produce the proof π′. (To do: this proof sketch should be made more precise.)
[A diagram: a grid of reduction sequences π″1 ⇝ π1,1 ⇝ π1,2 ⇝ · · · ⇝ π1,n, π″2 ⇝ π2,1 ⇝ π2,2 ⇝ · · · ⇝ π2,n, and so on, down to π″m ⇝ πm,1 ⇝ πm,2 ⇝ · · · ⇝ π∗]

to find the desired proof π∗. So, if π′n and π″n are normal they must be identical.
We will prove that every proof is strongly normalising under the rela-
tion of deleting detour formulas. To assist in talking about this, we
need to make a few more definitions. First, the reduction tree.
definition 2.1.22 [reduction tree] The reduction tree (under ) of
a proof π is the tree whose branches are the reduction sequences on
the relation . So, from the root π we reach any proof accessible in
one step from π. From each π 0 where π π 0 , we branch similarly.
Each node has only finitely many successors as there are only finitely
many detour formulas in a proof. For each proof π, ν(π) is the size of
its reduction tree.
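As a quick illustration of this definition, here is how one might compute a reduction tree generically in Haskell, given any one-step reduction relation presented as a function from a proof to the list of its one-step reducts. Data.Tree comes from the containers package; the function names are mine.

import Data.Tree (Tree, unfoldTree, flatten)

-- Given a one-step reduction relation (as a function returning all
-- one-step reducts), the reduction tree of a proof has that proof at the
-- root and a subtree for each proof reachable in one step.
reductionTree :: (proof -> [proof]) -> proof -> Tree proof
reductionTree step = unfoldTree (\p -> (p, step p))

-- ν(π): the size of the reduction tree. Computing it only terminates when
-- the proof is strongly normalising (every reduction path is finite).
nu :: (proof -> [proof]) -> proof -> Int
nu step = length . flatten . reductionTree step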
lemma 2.1.23 [the size of reduction trees] The reduction tree of a
strongly normalising proof is finite. It follows that not only is every
reduction path finite, but there is a longest reduction path.
c3 Suppose that π does not end in →I, and suppose that all of the
proofs reached from π in one step are red. Let σ be a red proof
of A. We wish to show that the proof (π σ) is red. By c1 for the
lemma 2.1.26 [red proofs ending in →I] If for each red proof σ of A,
the proof
·
·σ
·
π(σ) : A
·
·π
·
B
is red, then so is the proof
[A]
·
·π
τ: ·
B
→I
A→B
Proof: We show that (τ σ) is red whenever σ is red. This will
suffice to show that the proof τ is red, by the definition of the predicate
‘red’ for proofs of A → B. We will show that every proof resulting
from (τ σ) in one step is red, and we will reason by induction on the
sum of the sizes of the reduction trees of π and σ. There are three
cases:
It follows also that every proof is strongly normalising, since all red
proofs are strongly normalising.
mean. It is relatively clear that we are treating the “x” as a marker for
the input of the function, and “x + 2” is the output. The function is the
output as it varies for different values of the input. Sometimes leaving
the variables there is not so useful. Consider the subtraction
x−y
You can think of this as the function that takes the input value x and
takes away y. Or you can think of it as the function that takes the input
value y and subtracts it from x. Or you can think of it as the function
that takes two input values x and y, and takes the second away from
the first. Which do we mean? When we apply this function to the
input value 5, what is the result? For this reason, we have a way of
making explicit the different distinctions: it is the λ-notation, due to
Alonzo Church [17]. The function that takes the input value x and
returns x + 2 is denoted
λx.(x + 2)
The function that takes the input value y and subtracts it from x is
λy.(x − y)
The function that takes two inputs and subtracts the second from the
first is
λx.λy.(x − y)
Notice how this function works. If you feed it the input 5, you get the
output λy.(5 − y). We can write application of a function to its input
by way of juxtaposition. The result is that
(λx.λy.(x − y) 5)
evaluates to the result λy.(5 − y). This is the function that subtracts
y from 5. When you feed this function the input 2 (i.e., you evaluate
(λy.(5 − y) 2)) the result is 5 − 2 — in other words, 3. So, functions can
have other functions as outputs.
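The same behaviour is directly expressible in Haskell, where functions are curried in just this way. A small sketch (the names sub, sub5 and three are mine):

-- λx.λy.(x − y), written as a curried Haskell function.
sub :: Integer -> Integer -> Integer
sub = \x -> \y -> x - y

-- Applying sub to 5 yields a function: the analogue of λy.(5 − y).
sub5 :: Integer -> Integer
sub5 = sub 5

-- (λx.λy.(x − y) 5) applied to 2 evaluates to 5 − 2, in other words, 3.
three :: Integer
three = sub 5 2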
Now, suppose you have a function f that takes two inputs y and z, and
we wish to consider what happens when you apply f to a pair where
the first value is repeated as the second value. (If f is λx.λy.(x − y)
and the input value is a number, then the result should be 0.) We can
do this by applying f to the value x twice, to get ((f x) x). But this is
not a function, it is the result of applying f to x and x. If you consider
this as a function of x you get
λx.((f x) x)
This is the function that takes x and feeds it twice into f. But just as
functions can create other functions as outputs, there is no reason not
to make functions take other functions as inputs. The process here
was completely general — we knew nothing specific about f — so the
function
λy.λx.((y x) x)
Then, given the class of types, we can construct terms for each type.
M : A → B      N : A
---------------------- →E
      (M N) : B

  [x : A](i)
      ·
      ·
      ·
    M : B
----------------- →I,i
 λx.M : A → B
There is no such simple typed λ-term. Were there such a term, then x would have to have both the type A → B and the type A. But as things stand
now, a variable can have only one type. Not every λ-term is a typed
λ-term.
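Haskell's type system makes the same point vividly: typable terms correspond to provable implicational formulas, and a term like λx.(x x) is rejected. A sketch, purely for illustration (the names k and w are mine):

-- λx.λy.x has type a -> (b -> a): a proof of p → (q → p) (vacuous discharge).
k :: a -> b -> a
k = \x -> \y -> x

-- λy.λx.((y x) x), the function from the text, has type
-- (a -> a -> b) -> a -> b: its proof discharges an assumption of type a twice
-- (duplicate discharge).
w :: (a -> a -> b) -> a -> b
w = \y -> \x -> (y x) x

-- λx.(x x) is not typable: x would need both type a -> b and type a.
-- Uncommenting the next line produces a type error.
-- bad = \x -> x x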
Now, it is clear that typed λ-terms stand in some interesting relationship to proofs. From any typed λ-term we can reconstruct a unique proof.
definition 2.1.30 [from terms to proofs and back] For every typed
term M (of type A), we find proof(M) (of the formula A) as follows:
» proof(xA ) is the identity proof A.
Consider what this means for proofs. The term (λx.M N) immediately
β-reduces to M[x := N]. Representing this transformation as a proof,
we have
      [x : A]
         ·
         · πl                 ·
         ·                    · πr
       M : B                  ·
  --------------- →I
  λx.M : A → B             N : A
  --------------------------------- →E
           (λx.M N) : B

            =⇒β

         ·
         · πr
         ·
       N : A
         ·
         · πl
         ·
   M[x := N] : B
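For concreteness, here is a minimal Haskell sketch of untyped λ-terms and the root β-step (λx.M N) =⇒β M[x := N]. The substitution is deliberately naive — it assumes the bound variables of M are distinct from the free variables of N, so no capture-avoiding renaming is attempted — and the names are mine.

data Term = Var String
          | Lam String Term        -- λx.M
          | App Term Term          -- (M N)
  deriving (Eq, Show)

-- Naive substitution M[x := N]: assumes no variable capture can occur.
subst :: String -> Term -> Term -> Term
subst x n (Var y)     | x == y    = n
                      | otherwise = Var y
subst x n (App m1 m2)             = App (subst x n m1) (subst x n m2)
subst x n (Lam y m)   | x == y    = Lam y m             -- x is shadowed here
                      | otherwise = Lam y (subst x n m)

-- One β-reduction step at the root of a term, if it is a redex.
betaStep :: Term -> Maybe Term
betaStep (App (Lam x m) n) = Just (subst x n m)
betaStep _                 = Nothing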
2.1.4 | history
Gentzen’s technique for natural deduction is not the only way to rep-
resent this kind of reasoning, with introduction and elimination rules
for connectives. Independently of Gentzen, the Polish logician Stanisław Jaśkowski constructed a closely related but different system for
presenting proofs in a natural deduction style. In Jaśkowski’s system, a
proof is a structured list of formulas. Each formula in the list is either
a supposition, or it follows from earlier formulas in the list by means
of the rule of modus ponens (conditional elimination), or it is proved
by conditionalisation. To prove something by conditionalisation you
first make a supposition of the antecedent: at this point you start a box.
The contents of a box constitute a proof, so if you want to use a for-
mula from outside the box, you may repeat a formula into the inside.
A conditionalisation step allows you to exit the box, discharging the
supposition you made upon entry. Boxes can be nested, as follows:
1. A → (A → B) Supposition
2. A Supposition
3. A → (A → B) 1, Repeat
4. A→B 2, 3, Modus Ponens
5. B 2, 4, Modus Ponens
6. A→B 2–5, Conditionalisation
7. (A → (A → B)) → (A → B) 1–6, Conditionalisation
This nesting of boxes, and repeating or reiteration of formulas to enter
boxes, is the distinctive feature of Jaśkowski’s system. Notice that we
could prove the formula (A → (A → B)) → (A → B) without using
a duplicate discharge. The formula A is used twice as a minor premise
in a Modus Ponens inference (on line 4, and on line 5), and it is then
discharged at line 6. In a Gentzen proof of the same formula, the as-
sumption A would have to be made twice.
Jaśkowski proofs also straightforwardly incorporate the effects of a
vacuous discharge in a Gentzen proof. We can prove A → (B → A)
using the rules as they stand, without making any special plea for a
vacuous discharge:
1. A Supposition
2. B Supposition
3. A 1, Repeat
4. B→A 2–3, Conditionalisation
5. A → (B → A) 1–4, Conditionalisation
The formula B is supposed, and it is not used in the proof that fol-
lows. The formula A on line 3 occurs after the formula B on line 2, in the subproof, but it is harder to see that it is inferred from that B.
Conditionalisation, in Jaśkowski’s system, colludes with reiteration to
allow the effect of vacuous discharge. It appears that the “fine control”
over inferential connections between formulas in proofs in a Gentzen
proof is somewhat obscured in the linearisation of a Jaśkowski proof.
The fact that one formula occurs after another says nothing about how
that formula is inferentially connected to its forbear.
Jaśkowski’s account of proof was modified in presentation by Fre-
deric Fitch (boxes become assumption lines to the left, and hence be-
come somewhat simpler to draw and to typeset). Fitch’s natural deduc-
tion system gained quite some popularity in undergraduate education
in logic in the 1960s and following decades in the United States [31].
Edward Lemmon’s text Beginning Logic [49] served a similar purpose
in British logic education. Lemmon’s account of natural deduction is
similar to this, except that it does without the need to reiterate by
breaking the box.
1     (1)  A → (A → B)                  Assumption
2     (2)  A                            Assumption
1,2   (3)  A → B                        1, 2, Modus Ponens
1,2   (4)  B                            2, 3, Modus Ponens
1     (5)  A → B                        2, 4, Conditionalisation
      (6)  (A → (A → B)) → (A → B)      1, 5, Conditionalisation
2.1.5 | exercises
Working through these exercises will help you understand the material. As with all logic exercises, if you want to deepen your understanding of these techniques, you should attempt the exercises until they are no longer difficult. So, attempt each of the different kinds of basic exercises, until you know you can do them. Then move on to the intermediate exercises, and so on. (The project exercises are not the kind of thing that can be completed in one sitting.)

[Author's note: I am not altogether confident about the division of the exercises into "basic," "intermediate," and "advanced." I'd appreciate your feedback on whether some exercises are too easy or too difficult for their categories.]
basic exercises
q1 Which of the following formulas have proofs with no premises?
1 : p → (p → p)
2 : p → (q → q)
3 : ((p → p) → p) → p
4 : ((p → q) → p) → p
5 : ((q → q) → p) → p
6 : ((p → q) → q) → p
7 : p → (q → (q → p))
8 : (p → q) → (p → (p → q))
9 : ((q → p) → p) → ((p → q) → q)
For each formula that can be proved, find a proof that complies with
the strictest discharge policy possible.
q2 Annotate your proofs from Exercise 1 with λ-terms. Find a most gen-
eral λ-term for each provable formula.
q3 Construct a proof from q → r to (q → (p → p)) → (q → r) using
vacuous discharge. Then construct a proof of B → (A → A) (also using
vacuous discharge). Combine the two proofs, using →E to deduce B →
C. Normalise the proof you find. Then annotate each proof with λ-
terms, and explain the β reductions of the terms corresponding to the
normalisation.
Then construct a proof from (p → r) → ((p → r) → q)) to (p → r) →
q using duplicate discharge. Then construct a proof from p → (q → r)
and p → q to p → r (also using duplicate discharge). Combine the two
proofs, using →E to deduce q. Normalise the proof you find. Then
annotate each proof with λ-terms, and explain the β reductions of the
terms corresponding to the normalisation.
q4 Find types and proofs for each of the following terms.
1 : λx.λy.x
2 : λx.λy.λz.((xz)(yz))
3 : λx.λy.λz.(x(yz))
4 : λx.λy.(yx)
5 : λx.λy.((yx)x)
Which of the proofs are linear, which are relevant and which are affine?
q5 Show that there is no normal relevant proof of these formulas.
1 : p → (q → p)
2 : (p → q) → (p → (r → q))
3 : p → (p → p)
q6 Show that there is no normal affine proof of these formulas.
1 : (p → q) → ((p → (q → r)) → (p → r))
2 : (p → (p → q)) → (p → q)
q7 Show that there is no normal proof of these formulas.
1 : ((p → q) → p) → p
2 : ((p → q) → q) → ((q → p) → p)
q8 Find a formula that has both a relevant proof and an affine proof, but no linear proof.
intermediate exercises
q9 Consider the following “truth tables.”
gd3:
→  t  n  f
t  t  n  f
n  t  t  f
f  t  t  t

ł3:
→  t  n  f
t  t  n  f
n  t  t  n
f  t  t  t

rm3:
→  t  n  f
t  t  f  f
n  t  n  f
f  t  t  t
A gd3 tautology is a formula that receives the value t in every gd3
valuation. An ł3 tautology is a formula that receives the value t in
every ł3 valuation. Show that every formula with a standard proof is
a gd3 tautology. Show that every formula with an affine proof is an ł3
tautology.
q10 Consider proofs that have paired steps of the form →E/→I. That is, a
conditional is eliminated only to be introduced again. The proof has a
sub-proof of the form of this proof fragment:
A→B [A](i)
→E
B
→I,i
A→B
These proofs contain redundancies too, but they may well be normal.
Call a proof with a pair like this circuitous. Show that all circuitous
proofs may be transformed into non-circuitous proofs with the same
premises and conclusion.
q11 In Exercise 5 you showed that there is no normal relevant proof of
p → (p → p). By normalisation, it follows that there is no relevant
proof (normal or not) of p → (p → p). Use this fact to explain why
it is more natural to consider relevant arguments with multisets of
premises and not just sets of premises. (hint: is the argument from
p, p to p relevantly valid?)
q12 You might think that “if . . . then . . . ” is a slender foundation upon
which to build an account of logical consequence. Remarkably, there
is rather a lot that you can do with implication alone, as these next
questions ask you to explore.
First, define A ∨̂ B as follows: A ∨̂ B ::= (A → B) → B. In what way is "∨̂" like disjunction? What usual features of disjunction are not had by ∨̂? (Pay attention to the behaviour of ∨̂ with respect to different discharge policies for implication.)

q13 Provide introduction and elimination rules for ∨̂ that do not involve the conditional connective →.
q14 Now consider negation. Given an atom p, define the p-negation ¬p A
to be A → p. In what way is “¬p ” like negation? What usual features
of negation are not had by ¬p defined in this way? (Pay attention
to the behaviour of ¬ with respect to different discharge policies for
implication.)
A`B
One can read “A ` B” in a number of ways. You can say that B follows
from A, or that A entails B, or that the argument from A to B is valid.
The symbol used here is sometimes called the turnstile.

"Scorning a turnstile wheel at her reverend helm, she sported there a tiller; and that tiller was in one mass, curiously carved from the long narrow lower jaw of her hereditary foe. The helmsman who steered by that tiller in a tempest, felt like the Tartar, when he holds back his fiery steed by clutching its jaw. A noble craft, but somehow a most melancholy! All noble things are touched with that." — Herman Melville, Moby Dick.

Once we have the notion of consequence, we can ask ourselves what properties consequence has. There are many different ways you could answer this question. The focus of this section will be a particular technique, originally due to Gerhard Gentzen. We can think of consequence—relative to a particular language—like this: when we want to know about the relation of consequence, we first consider each different kind of formula in the language. To make the discussion concrete, let's consider a very simple language: the language of propositional logic with only two connectives, conjunction ∧ and disjunction
∨. That is, we will now look at formulas expressed in the following
grammar:
formula ::= atom | (formula ∧ formula) | (formula ∨ formula)
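Rendered as a Haskell datatype (a sketch, with names of my own choosing, separate from the conditional language sketched earlier), the grammar looks like this:

-- formula ::= atom | (formula ∧ formula) | (formula ∨ formula)
data LFormula = LAtom String
              | LFormula :/\: LFormula   -- conjunction
              | LFormula :\/: LFormula   -- disjunction
  deriving (Eq, Show)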
p ` p  [Id]

L ` C      C ` R
------------------ Cut
      L ` R
q`q
∧L1
q`q p`p p`p q∧r`q r`r
∧L2 ∧L1 ∧L1 ∧L2 ∧L2
p∧q`q p∧q`p p ∧ (q ∧ r) ` p p ∧ (q ∧ r) ` q q∧r`r
∧R ∧R ∧L2
p∧q`q∧p p ∧ (q ∧ r) ` p ∧ q p ∧ (q ∧ r) ` r
∧R
p ∧ (q ∧ r) ` (p ∧ q) ∧ r
Here are the cases for disjunction. The first derivation is for the com-
mutativity of disjunction, and the second is for associativity. (It is im-
portant to notice that these are not derivations of the commutativity or
associativity of conjunction or disjunction in general. They only show
the commutativity and associativity of conjunction and disjunction of
atomic formulas. These are not derivations of A ∧ B ` B ∧ A (for ex-
ample) since A ` A is not an axiom if A is a complex formula. We will
see more on this in the next section.)
q`q
∨R1
p`p q`q p`p q`q∨r r`r
∨R1 ∨R2 ∨R1 ∨R2 ∨R2
p`q∨p q`p∨q p ` p ∨ (q ∨ r) q ` p ∨ (q ∨ r) r`q∨r
∨L ∨L ∨R2
p∨q`q∨p p ∨ q ` p ∨ (q ∨ r) r ` p ∨ (q ∨ r)
∨L
(p ∨ q) ∨ r ` p ∨ (q ∨ r)
You can see that the disjunction derivations have the same structure
as those for conjunction. You can convert any derivation into another Exercise 14 on page 67 asks you to
(its dual) by swapping conjunction and disjunction, and swapping the make this duality precise.
left-hand side of the sequent with the right-hand side. Here are some
It’s not a complete derivation yet, as one leaf q∨(r1 ∧r2 ) ` q∨(r1 ∧r2 )
is not an axiom. However, we can add the derivation for it.
r1 ` r1 r2 ` r2
∧L1 ∧L2
r1 ∧ r2 ` r1 r1 ∧ r2 ` r2
∧R
q`q r1 ∧ r2 ` r1 ∧ r2
∨R1 ∨R2
q ` q ∨ (r1 ∧ r2 ) r1 ∧ r2 ` q ∨ (r1 ∧ r2 )
∨L
p`p q ∨ (r1 ∧ r2 ) ` q ∨ (r1 ∧ r2 )
p ∧ (q ∨ (r1 ∧ r2 )) ` p p ∧ (q ∨ (r1 ∧ r2 )) ` q ∨ (r1 ∧ r2 )
p ∧ (q ∨ (r1 ∧ r2 )) ` p ∧ (q ∨ (r1 ∧ r2 ))
the system is simple: In an axiomatic theory, it is always preferable to minimise the number of primitive assumptions. Here, it's clear that [IdA] is derivable, so there is no need for it to be an axiom. A system with fewer axioms is preferable to one with more, for the reason that we have reduced derivations to a smaller set of primitive notions. (These considerations are part of a general story, to be explored throughout this book, of what it is to be a logical constant. They have a long history [38].)
Proof: You can see this merely by looking at the rules. Each rule except for Cut has the subformula property. (Notice how much simpler this proof is than the proof of Theorem 2.1.11.)
On the other hand, there are very many different last inferences in a
derivation featuring Cut. The most trivial example is the derivation:
           q ` q
          ----------- ∨R1
p ` p     p ` p ∨ q
---------------------- Cut
      p ` p ∨ q
In this derivation the cut formula p ∨ (q ∧ A) is doing genuine work. Well, it’s doing work, in that p ∨
It is just repeating either the left formula p or the right formula q. (q ∧ A) is, for many choices for A,
genuinely intermediate between p
and p ∨ q. However, A is doing the
So, using Cut makes the search for derivations rather difficult. There kind of work that could be done
are very many more possible derivations of a sequent, and many more by any formula. Choosing different
actual derivations. The search space is much more constrained if we are values for A makes no difference
to the shape of the derivation. A is
looking for cut-free derivations instead. Constructing derivations, on
doing the kind of work that doesn’t
the other hand, is easier if we are permitted to use Cut. We have very require special qualifications.
many more options for constructing a derivation, since we are able to
pass through formulas “intermediate” between the desired antecedent
and consequent.
Do we need to use Cut? Is there anything derivable with Cut that
cannot be derived without it? Take a derivation involving Cut, such as
this one:
q`q
∧L1
p`p q∧r`q q`q
∧L1 ∧L2 ∧L1
p ∧ (q ∧ r) ` p p ∧ (q ∧ r) ` q p∧q`q
∧R ∨R1
p ∧ (q ∧ r) ` p ∧ q p∧q`q∨r
Cut
p ∧ (q ∧ r) ` q ∨ r
q`q
∧L1 q`q
q∧r`q ∧L1
∧L2 q∧r`q
p ∧ (q ∧ r) ` q q`q ∧L2
Cut p ∧ (q ∧ r) ` q
p ∧ (q ∧ r) ` q ∨R1
∨R1 p ∧ (q ∧ r) ` q ∨ r
p ∧ (q ∧ r) ` q ∨ r
Now we can proceed to present the technique for eliminating cuts from
a derivation. First we show that cuts may be moved upward in a deriv-
ation. Then we show that this process will terminate in a Cut-free
derivation.
· ·
· δl · δr
· ·
A`B B`C
A`C
To find our new derivation, we look at the formula B and its roles in
the final inference in δl and δr .
· 0 · 0 ·
· δl · δl · δr
· · ·
A1 ` B · A1 ` B B`C
· δr
before: ∧L1 · after: Cut
A1 ∧ A2 ` B B`C A1 ` C
Cut ∧L1
A1 ∧ A2 ` C A1 ∧ A2 ` C
The other two ways in which the cut formula could be passive are when
δ2 ends in [∨R] or [∧R]. The technique for these is identical to the
examples we have seen. The cut passes over [∨R] trivially, and it passes
over [∧R] by splitting into two cuts. In every instance, the depth is
reduced.
Here is an example:
are both derivable. Using the invertibility of ∧R, the sequent (b) is derivable only if (b1) q ∧ r ` p ∨ r and (b2) q ∧ r ` p are both derivable.
But (b2 ) is not derivable because q ` p and r ` p are underivable.
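Because cut-free derivations in this system only ever break a sequent down into strictly smaller ones, exhaustive backward proof search terminates, and derivability of A ` B is decidable. Here is a minimal Haskell sketch of that search over the LFormula type above (the function name and strategy are my own; it relies on cut elimination, so that searching without Cut loses nothing):

-- Is the sequent a ` b derivable (equivalently, cut-free derivable)?
-- Try every rule, read bottom-up; each premise is strictly smaller,
-- so the search terminates.
derivable :: LFormula -> LFormula -> Bool
derivable a b = identity || conjR || disjL || conjL || disjR
  where
    identity = a == b && isAtom a
    conjR = case b of { b1 :/\: b2 -> derivable a b1 && derivable a b2; _ -> False }
    disjL = case a of { a1 :\/: a2 -> derivable a1 b && derivable a2 b; _ -> False }
    conjL = case a of { a1 :/\: a2 -> derivable a1 b || derivable a2 b; _ -> False }
    disjR = case b of { b1 :\/: b2 -> derivable a b1 || derivable a b2; _ -> False }
    isAtom (LAtom _) = True
    isAtom _         = False

-- For example, this search reports that p ∧ (q ∨ r) ` (p ∧ q) ∨ r is not
-- derivable, matching the failure of distribution in this system.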
The elimination of cut is useful for more than just limiting the search
for derivations. The fact that any derivable sequent has a cut-free de-
rivation has other consequences. One consequence is the fact of inter-
polation.
corollary 2.2.15 [interpolation for lattice sequents] If a sequent
A ` B is derivable, then there is a formula C containing only atoms
present in both A and B such that A ` C and C ` B are derivable.
This result tells us that if the sequent A ` B is derivable then that
consequence “factors through” a statement in the vocabulary shared
between A and B. This means that the consequence A ` B not only
relies only upon the material in A and B and nothing else (that is due
to the availability of a cut-free derivation) but also in some sense the
derivation ‘factors through’ the material in common between A and B.
The result is a straightforward consequence of the cut-elimination the-
orem. A cut-free derivation of A ` B provides us with an interpolant.
p `p p  [Id]

A `C R
------------ ∧L1
A ∧ B `C R

A `C R
------------ ∧L2
B ∧ A `C R

L `C1 A      L `C2 B
---------------------- ∧R
L `C1∧C2 A ∧ B

A `C1 R      B `C2 R
---------------------- ∨L
A ∨ B `C1∨C2 R

L `C A
------------ ∨R1
L `C A ∨ B

L `C A
------------ ∨R2
L `C B ∨ A
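Read bottom-up, these annotated rules are effectively an algorithm: chase a cut-free derivation of A ` B and an interpolant C falls out of it. A Haskell sketch along those lines, extending the derivable search above (again, the names and search strategy are my own):

import Control.Applicative ((<|>))

-- Search for a cut-free derivation of a ` b; if one is found, return the
-- interpolant read off the annotated rules above.
interpolant :: LFormula -> LFormula -> Maybe LFormula
interpolant a b = identity <|> conjR <|> disjL <|> conjL <|> disjR
  where
    identity = case (a, b) of
      (LAtom p, LAtom q) | p == q -> Just (LAtom p)         -- [Id]: interpolant p
      _                           -> Nothing
    conjR = case b of                                       -- ∧R: C1 ∧ C2
      b1 :/\: b2 -> (:/\:) <$> interpolant a b1 <*> interpolant a b2
      _          -> Nothing
    disjL = case a of                                       -- ∨L: C1 ∨ C2
      a1 :\/: a2 -> (:\/:) <$> interpolant a1 b <*> interpolant a2 b
      _          -> Nothing
    conjL = case a of                                       -- ∧L1 / ∧L2: same C
      a1 :/\: a2 -> interpolant a1 b <|> interpolant a2 b
      _          -> Nothing
    disjR = case b of                                       -- ∨R1 / ∨R2: same C
      b1 :\/: b2 -> interpolant a b1 <|> interpolant a b2
      _          -> Nothing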
p ∧ (q ∨ (r1 ∧ r2 )) ` (q ∨ r1 ) ∧ p, (q ∨ r1 ) ∧ p ` (q ∨ r1 ) ∧ (p ∨ r2 )
p ∧ (q ∨ (r1 ∧ r2 )) ` (q ∨ r1 ) ∧ (p ∨ r3 )
2.2.4 | history
[To be written.]
2.2.5 | exercises
basic exercises
q1 Show that there is no cut-free derivation of the following sequents
1 : p ∨ (q ∧ r) ` p ∧ (q ∨ r)
2 : p ∧ (q ∨ r) ` (p ∧ q) ∨ r
3 : p ∧ (q ∨ (p ∧ r)) ` (p ∧ q) ∨ (p ∧ r)
q2 Suppose that there is a derivation of A ` B. Let C(A) be a formula
containing A as a subformula, and let C(B) be that formula with the
subformula A replaced by B. Show that there is a derivation of C(A) `
C(B). Furthermore, show that a derivation of C(A) ` C(B) may be
systematically constructed from the derivation of A ` B together with
the context C(−) (the shape of the formula C(A) with a ‘hole’ in the
place of the subformula A).
q3 Find a derivation of p ∧ (q ∧ r) ` (p ∧ q) ∧ r. Find a derivation of
(p ∧ q) ∧ r ` p ∧ (q ∧ r). Put these two derivations together, with a
Cut, to show that p ∧ (q ∧ r) ` p ∧ (q ∧ r). Then eliminate the cuts
from this derivation. What do you get?
q4 Do the same thing with derivations of p ` (p∧q)∨p and (p∧q)∨p ` p.
What is the result when you eliminate this cut?
q5 Show that (1) A ` B ∧ C is derivable if and only if A ` B and A ` C is
derivable, and that (2) A ∨ B ` C is derivable if and only if A ` C and
B ` C are derivable. Finally, (3) when is A ∨ B ` C ∧ D derivable, in
terms of the derivability relations between A, B, C and D.
q6 Under what conditions do we have a derivation of A ` B when A con-
tains only propositional atoms and disjunctions and B contains only
propositional atoms and conjunctions.
q7 Expand the system with the following rules for the propositional con-
stants ⊥ and >.
A ` > [>R] ⊥ ` A [⊥L]
Show that Cut is eliminable from the new system. (You can think of
⊥ and > as zero-place connectives. In fact, there is a sense in which >
is a zero-place conjunction and ⊥ is a zero-place disjunction. Can you
see why?)
q8 Show that lattice sequents including > and ⊥ are decidable, follow-
ing Corollary 2.2.10 and the results of the previous question.
q9 Show that every formula composed of just >, ⊥, ∧ and ∨ is equivalent
to either > or ⊥. (What does this result remind you of?)
q10 Prove the interpolation theorem (Corollary 2.2.15) for derivations in-
volving ∧, ∨, > and ⊥.
q11 Expand the system with rules for a propositional connective with the
following rules:
A`R L`B
tonk L tonk R
A tonk B ` R L ` A tonk B
What new things can you derive using tonk? Can you derive A tonk B ` A tonk B? Is Cut eliminable for formulas involving tonk? (See Arthur Prior's "The Runabout Inference-Ticket" [70] for tonk's first appearance in print.)
q12 Expand the system with rules for a propositional connective with the
following rules:
A`R L`A L`B
honk L honk R
A honk B ` R L ` A honk B
What new things can you derive using honk? Can you derive A honk
B ` A honk B? Is Cut eliminable for formulas involving honk?
q13 Expand the system with rules for a propositional connective with the
following rules:
A`R B`R L`B
plonk L plonk R
A plonk B ` R L ` A plonk B
What new things can you derive using plonk? Can you derive A plonk
B ` A plonk B? Is Cut eliminable for formulas involving plonk?
intermediate exercises
q14 Give a formal, recursive definition of the dual of a sequent, and the
dual of a derivation, in such a way that the dual of the sequent p1 ∧
(q1 ∨ r1 ) ` (p2 ∨ q2 ) ∧ r2 is the sequent (p2 ∧ q2 ) ∨ r2 ` p1 ∨ (q1 ∧ r1 ).
And then use this definition to prove the following theorem.
theorem 2.2.17 [duality for derivations] A sequent A ` B is deriv-
able if and only if its dual (A ` B)d is derivable. Furthermore, the dual
of the derivation of A ` B is a derivation of the dual of A ` B.
q15 Even though the distribution sequent p ∧ (q ∨ r) ` (p ∧ q) ∨ r is not
derivable (Example 2.2.11), some sequents of the form A ∧ (B ∨ C) `
(A ∧ B) ∨ C are derivable. Give an independent characterisation of the
triples hA, B, Ci such that A ∧ (B ∨ C) ` (A ∧ B) ∨ C is derivable.
A → B, A ` B
p ` p [Id]
X`A B, Y ` R X, A ` B
→L →R
A → B, X, Y ` R X`A→B
X`C C, Y ` R
Cut
X, Y ` R
many formulas may appear on the left hand side of the sequent. In
the rules in Figure 2.5, the formulas appearing in the spots filled by p, A, B, A → B, or C are active, and the formulas in the other positions —
filled by X, Y and R — are passive.
As we’ve seen, these rules can be understood as “talking about” the
natural deduction system. We can think of a derivation of the sequent
X ` A as a recipe for constructing a proof from X to A. We may define
a mapping, giving us for each derivation δ of X ` A a proof nd(δ) from
X to A.
definition 2.3.1 [nd : derivations → proofs] For any sequent deriv-
ation δ of X ` A, there is a natural deduction proof nd(δ) from the
premises X to the conclusion A. It is defined recursively by first choos-
ing nd of an identity derivation, and then, given nd of simpler deriva-
tions, we define nd of a derivation extending those derivations by →L,
→I, or Cut:
» If δ is an identity sequent p ` p, then nd(δ) is the proof with the
sole assumption p. This is a proof from p to p.
» If δ is a derivation
· 0
·δ
·
X, A ` B
→R
X`A→B
then we already have the proof nd(δ 0 ) from X, A to B. The proof
nd(δ), from X to A → B is the following:
X, [A](i)
·
· nd(δ 0 )
·
B
→I,i
A→B
» If δ is a derivation
· ·
· δ1 · δ2
· ·
X`A B, Y ` R
→L
A → B, X, Y ` R
X
·
· nd(δ3 )
·
C Y
·
· nd(δ4 )
·
R
δ? : p → p ` p → p nd(δ? ) : p → p
A ` A [Id+ ]
From now on, we will focus on liberal derivations, with the understand-
ing that we may “strictify” our derivations if the need or desire arises.
So, we have nd : derivations → proofs. This transformation also
sends cut-free derivations to normal proofs. This lends some support
to the view that derivations without cut and normal proofs are closely
p`p q`q
→L
p → q ` p → q p → q, p ` q
Cut
p → q, p ` q r`r
→L
q→r`q→r p → r, p → q, p ` r
Cut
p → q, q → r, p ` r
→R
p → q, q → r ` p → r
→R
p → q ` (q → r) → (p → r)
This contains redundant Cut steps (we applied Cuts to identity se-
quents, and these can be done away with). We can eliminate these,
to get a much simpler cut-free derivation:
p`p q`q
→L
p → q, p ` q r`r
→L
q → r, p → q, p ` r
→R
p → q, q → r ` p → r
→R
p → q ` (q → r) → (p → r)
You can check for yourself that when you apply nd to this derivation,
you construct the original proof.
So, we have transformed the proof π into a derivation δ, which
contained Cuts, and in this case, we eliminated them. Is there a way
to construct a cut-free derivation in the first place? It turns out that
there is. We need to construct the proof in a more subtle way than
unravelling it from the bottom.
lemma 2.3.5 [normal proof structure] Any normal proof, using the
rules →I and →E alone, is either an assumption, or ends in an →I step,
or contains an undischarged assumption that is the major premise of
an →E step.
Now we may define the different map sqp (“p” for “perimeter”) ac-
cording to which we strip each →I off the bottom of the proof π, until
we have no more to take, and then, instead of dealing with the →E at
the bottom of the proof, we deal with the leftmost undischarged
major premise of an →E step, unless there is none.
Z
·
· π2
·
C→D C
→E
D Y
·
· π3
·
A
· p · p
· sq (π2 ) · sq (π3 )
· ·
Z`C Y, D ` A
→L
C → D, Z, Y ` A
theorem 2.3.7 For each natural deduction proof π from X to A, sqp (π)
is a derivation of the sequent X ` A. Furthermore, if π is normal,
sqp (π) is cut-free.
Now, the Cut on A → B may be traded in for two simpler cuts: one on
A and the other on B.
· ·
· δ2 · δ1
· ·
Y`A X, A ` B · 0
· δ2
Cut ·
X, Y ` B B, Z ` C
Cut
X, Y, Z ` C
cut formula passive on one side: There are more cases to consider
here, as there are more ways the cut formula can be passive in a deriva-
tion. The cut formula can be passive by occurring in X, Y, or R in either
[→L] or [→R]:
X`A B, Y ` R X, A ` B
→L →R
A → B, X, Y ` R X`A→B
So, let’s mark all of the different places that a cut formula could occur
passively in each of these inferences. The inferences in Figure 2.6 mark
the four different locations of a cut formula with C.
X 0, C ` A B, Y ` R X`A B, Y 0 , C ` R X`A B, Y ` C X 0 , C, A ` B
→L →L →L →R
A → B, X 0 , C, Y ` R A → B, X, Y 0 , C ` R A → B, X, Y ` C X 0, C ` A → B
In just the same way, we can motivate the structural rule of contraction
X, A, A ` B
W
X, A ` B
by going through A → B in just the same way. (Why "W" for contraction and "K" for weakening? It is due to Schönfinkel's original notation for combinators [82].)

·
· δ
·
X, A, A ` B A`A B`B
→R− →L
X`A→B A → B, A ` B
Cut
X, A ` B
With K and W we may use the old →R rule and ‘factor out’ the differ-
ent behaviour of discharge:
p`p q`q q`q
→L K
p`p p → q, p ` q q, r ` q
→L →R
p → (p → q), p, p ` q p`p q`r→q
W →L
p → (p → q), p ` q p → q, p ` r → q
→R →R
p → (p → q) ` p → q p → q ` p → (r → q)
· ·
· · δ1 · δ2
· δ2 · ·
· · X`A Y, A, A ` B
· · δ1
· δ1 Y, A, A ` B · Cut
· W X`A X, Y, A ` B
X`A Y, A ` B Cut
Cut X, X, Y ` B
X, Y ` B W, repeated
X, Y ` B
In this case, the new proof is not less complex than the old one. The
depth of the second cut in the new proof (2|δ1 | + |δ2 | + 1) is greater than
in the old one (|δ1 | + |δ2 |). The old proof of cut elimination no longer
works in the presence of contraction. There are a number of options one
might take here. Gentzen’s own approach is to prove the elimination
of multiple applications of cut.
· ·
· δ1 · δ2
· ·
X`A Y, A, A ` B
Multicut
X, X, Y ` B
W, repeated
X, Y ` B
A    B              A ∧ B             A ∧ B
--------- ∧I        -------- ∧E       -------- ∧E
  A ∧ B                A                 B

A    B
--------- ∧I
  A ∧ B
--------- ∧E
    A
∧L1: from X, A ` R, infer X, A ∧ B ` R.    ∧L2: from X, A ` R, infer X, B ∧ A ` R.    ∧R: from X ` A and X ` B, infer X ` A ∧ B.
∨L: from X, A ` R and X, B ` R, infer X, A ∨ B ` R.    ∨R1: from X ` A, infer X ` A ∨ B.    ∨R2: from X ` A, infer X ` B ∨ A.
These rules are the generalisation of the lattice rules for conjunc-
tion seen in the previous section. Every sequent derivation in the old
system is a proof here, in which there is only one formula in the ante-
cedent multiset. We may prove many new things, given the interaction
of implication and the lattice connectives:
p`p q`q p`p r`r
→L →L
p → q, p ` q p → r, p ` r
∧L ∧L
(p → q) ∧ (p → r), p ` q (p → q) ∧ (p → r), p ` r
∧R
(p → q) ∧ (p → r), p ` q ∧ r
→R
(p → q) ∧ (p → r) ` p → (q ∧ r)
2.3.4 | negation
You can get some of the features of negation by defining it in terms
of conditionals. If we pick a particular atomic proposition (call it f for
(using contraction, we can derive A∧¬A ` too) but we must stop there
in the absence of more rules. To get from here to A ⊗ ¬A ` B, we must
somehow add B into the conclusion. But the B is not there! How can
we do this? We can come close by adding B to the left by means of a
weakening move:
A`A
¬L
A, ¬A `
K
A, ¬A, B `
¬R
A, ¬A ` ¬B
This shows us that a contradiction entails any negation. But to show
that a contradiction entails anything we need a little more. We can do
this by means of a structural rule operating on the right-hand side of
a sequent. Now that we have sequents with empty right-hand sides,
we may perhaps add things in that position, just as we can add things
on the left by means of a weakening on the right. The rule of right
weakening is just what is required to derive A, ¬A ` B.
X`
KR
X`B
Identity and Cut:  p ` p [Id];  Cut: from X ` C and C, X′ ` R, infer X, X′ ` R.
Conditional Rules:  →L: from X ` A and B, X′ ` R, infer A → B, X, X′ ` R.  →R: from X, A ` B, infer X ` A → B.
Negation Rules:  ¬L: from X ` A, infer X, ¬A ` .  ¬R: from X, A ` , infer X ` ¬A.
Conjunction Rules:  ∧L1: from X, A ` R, infer X, A ∧ B ` R.  ∧L2: from X, A ` R, infer X, B ∧ A ` R.  ∧R: from X ` A and X ` B, infer X ` A ∧ B.
Disjunction Rules:  ∨L: from X, A ` R and X, B ` R, infer X, A ∨ B ` R.  ∨R1: from X ` A, infer X ` A ∨ B.  ∨R2: from X ` A, infer X ` B ∨ A.
Structural Rules:  WL: from X, A, A ` R, infer X, A ` R.  KL: from X ` R, infer X, A ` R.  KR: from X ` , infer X ` C.
A case could be made for the claim that intuitionistic logic is the strongest and most natural logic you can motivate using inference rules on sequents of the form X ` R. It is possible to go
further and to add rules to ensure that the connectives behave as one
would expect given the rules of classical logic: we can add the rule of
double negation elimination (this is equivalent to the natural deduction rule admitting the inference from ¬¬A to A, used in many systems of natural deduction for classical logic):

X ` ¬¬A
---------- DNE
 X ` A
which strengthens the system far enough to be able to derive all clas-
sical tautologies and to derive all classically valid sequents. However,
the results are not particularly attractive on proof-theoretical consider-
ations. For example, the rule DNE does not satisfy the subformula
property: the concluding sequent X ` A is derived by way of the
p`p
KL
p, (p → q) → p ` p
→R
p ` ((p → q) → p) → p
¬L
¬(((p → q) → p) → p), p `
KR
¬(((p → q) → p) → p), p ` q
→R
¬(((p → q) → p) → p) ` p → q p`p
→L
¬(((p → q) → p) → p), (p → q) → p ` p
→R
¬(((p → q) → p) → p) ` ((p → q) → p) → p
¬L
¬(((p → q) → p) → p), ¬(((p → q) → p) → p) `
WL
¬(((p → q) → p) → p) `
¬R
` ¬¬(((p → q) → p) → p)
DNE
` ((p → q) → p) → p
Here ‘p’ is an atomic proposition, ‘A’, ‘B’ and ‘C’ are formulas, and X, X′,
Y and Y′ are multisets (possibly empty) of formulas.

Identity and Cut:

  p ` p [Id]        X ` Y, C    C, X′ ` Y′
                    ——————————————————————— Cut
                          X, X′ ` Y, Y′

Conditional Rules:

  X ` Y, A    B, X′ ` Y′               X, A ` B, Y
  ——————————————————————— →L            ———————————— →R
    X, X′, A → B ` Y, Y′                X ` A → B, Y

Negation Rules:

  X ` A, Y             X, A ` Y
  —————————— ¬L        —————————— ¬R
  X, ¬A ` Y            X ` ¬A, Y

Conjunction Rules:

  X, A ` Y              X, A ` Y              X ` A, Y    X′ ` B, Y′
  ——————————— ∧L1       ——————————— ∧L2       ——————————————————————— ∧R
  X, A ∧ B ` Y          X, B ∧ A ` Y            X, X′ ` A ∧ B, Y, Y′

Disjunction Rules:

  X, A ` Y    X, B ` Y           X ` A, Y              X ` A, Y
  ————————————————————— ∨L       ———————————— ∨R1      ———————————— ∨R2
      X, A ∨ B ` Y               X ` A ∨ B, Y          X ` B ∨ A, Y

Structural Rules:

  X, A, A ` Y          X ` A, A, Y           X ` Y              X ` Y
  ———————————— WL      ———————————— WR       ————————— KL       ————————— KR
   X, A ` Y             X ` A, Y             X, A ` Y           X ` A, Y
Using these rules, we can derive Peirce’s Law, keeping the structure
of the old derivation intact, other than the deletion of all of the steps
involving negation. Instead of having to swing the formula for Peirce’s
Law onto the left to duplicate it in a contraction step, we may keep it
on the right of the turnstile to perform the duplication. The negation
laws are eliminated, the WL step changes into a WR step, but the other
rules are unchanged. (You might think that this is what we were ‘trying’
to do in the other derivation, and we had to be sneaky with negation to do
what we wished.)

p ` p
—————————————————————— KL
p, (p → q) → p ` p
—————————————————————— KR
p, (p → q) → p ` q, p
—————————————————————— →R
p ` q, ((p → q) → p) → p
—————————————————————————— →R
` p → q, ((p → q) → p) → p          p ` p
——————————————————————————————————————————— →L
(p → q) → p ` p, ((p → q) → p) → p
——————————————————————————————————————— →R
` ((p → q) → p) → p, ((p → q) → p) → p
——————————————————————————————————————— WR
` ((p → q) → p) → p
The sequent rules for classical logic share the ‘true–false’ duality im-
plicit in the truth-table account of classical validity. But this leads on
to an important question. Intuitionistic sequents, of the form X ` A,
record a proof from X to A. What do classical sequents mean? Do they
mean anything at all about proofs? A sequent of the form A, B ` C, D
does not tell us that C and D both follow from A and B. (If it did, it
could be replaced by the two sequents A, B ` C and A, B ` D.) No, the se-
quent A, B ` C, D may be valid even when A, B ` C and A, B ` D are
not valid. The combination of the conclusions is disjunctive and not
conjunctive when read ‘positively’. We can think of a sequent X ` Y as
proclaiming that if each member of X is true then some member of Y
is true. Or to put it ‘negatively’, it tells us that it would be a mistake
to assert each member of X and to deny each member of Y .
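To fix ideas, here is a small sketch in Haskell of that ‘positive’ reading of a classical sequent: X ` Y is truth-table valid just when every valuation making each member of X true makes some member of Y true. The datatype and names are mine, for illustration only; they are not part of the system described in the text.

-- Truth-table validity for multiple-conclusion sequents X ` Y.
import Data.List (nub)

data Form = Atom String | Neg Form | Conj Form Form | Disj Form Form | Impl Form Form
  deriving (Eq, Show)

atoms :: Form -> [String]
atoms (Atom p)   = [p]
atoms (Neg a)    = atoms a
atoms (Conj a b) = atoms a ++ atoms b
atoms (Disj a b) = atoms a ++ atoms b
atoms (Impl a b) = atoms a ++ atoms b

eval :: [(String, Bool)] -> Form -> Bool
eval v (Atom p)   = maybe False id (lookup p v)
eval v (Neg a)    = not (eval v a)
eval v (Conj a b) = eval v a && eval v b
eval v (Disj a b) = eval v a || eval v b
eval v (Impl a b) = not (eval v a) || eval v b

-- Every assignment of truth values to a given list of atoms.
valuations :: [String] -> [[(String, Bool)]]
valuations = foldr (\p vs -> [(p, b) : v | b <- [True, False], v <- vs]) [[]]

-- X ` Y is valid iff no valuation makes all of X true and all of Y false.
valid :: [Form] -> [Form] -> Bool
valid xs ys = all ok (valuations (nub (concatMap atoms (xs ++ ys))))
  where ok v = not (all (eval v) xs) || any (eval v) ys

-- p ∨ q ` p, q is valid, though neither p ∨ q ` p nor p ∨ q ` q is:
example :: (Bool, Bool, Bool)
example = ( valid [Disj (Atom "p") (Atom "q")] [Atom "p", Atom "q"]
          , valid [Disj (Atom "p") (Atom "q")] [Atom "p"]
          , valid [Disj (Atom "p") (Atom "q")] [Atom "q"] )

The example makes the disjunctive reading of the right-hand side vivid: the combination of the conclusions, not each conclusion on its own, is what the sequent underwrites.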
This leaves open the important question: is there any notion of
proof appropriate for structures like these, in which premises and con-
clusions are collected in exactly the same way? Whatever is suitable
will have to be quite different from the tree-structured proofs we have
already seen.
Identity and Cut:

  X, A ` A, Y [Id]        X ` Y, C    C, X ` Y
                          ————————————————————— Cut
                                 X ` Y

Conditional Rules:

  X ` Y, A    B, X ` Y               X, A ` B, Y
  ————————————————————— →L            ———————————— →R
      A → B, X ` Y                    X ` A → B, Y

Negation Rules:

  X ` A, Y             X, A ` Y
  —————————— ¬L        —————————— ¬R
  X, ¬A ` Y            X ` ¬A, Y

Conjunction Rules:

  X, A, B ` Y            X ` A, Y    X ` B, Y
  ———————————— ∧L        ————————————————————— ∧R
  X, A ∧ B ` Y                X ` A ∧ B, Y

Disjunction Rules:

  X, A ` Y    X, B ` Y            X ` A, B, Y
  ————————————————————— ∨L         ———————————— ∨R
      X, A ∨ B ` Y                 X ` A ∨ B, Y

Structural Rules:

  X, A, A ` Y           X ` A, A, Y
  ———————————— WL       ———————————— WR
   X, A ` Y              X ` A, Y
too difficult to prove: simply show that the new identity axioms of the
system in Figure 8 may be derived using our old identity together with
instances of weakening; and that if the premises of any of the new rules
are derivable, so are the conclusions, using the corresponding rule from
the old system, and perhaps using judicious applications of contraction
to manipulate the parameters.
The new sequent system has some very interesting properties. Sup-
pose we have a sequent X ` Y that has no derivation (not using Cut)
in this system. Then we may reason in the following way:
That deals with what we might call atomic sequents. We now proceed
by induction, with the hypothesis for a sequent X ` Y being that if
it has no derivation, it is truth-table invalid. And we will show that
if the hypothesis holds for simpler sequents than X ` Y then it holds
for X ` Y too. What is a simpler sequent than X ` Y ? Let’s say that
the complexity of a sequent is the number of connectives (∧, ∨, →, ¬)
occurring in that sequent. So, we have shown that the hypothesis holds
for sequents of complexity zero.
Now to deal with sequents of greater complexity: that is, those
containing formulas with connectives.
So, the sequent rules, read backwards from bottom-to-top, can be un-
derstood as giving instructions for making a counterexample to a se-
quent. In the case of sequent rules with more than one premise, these
instructions provide alternatives which can both be explored. If a se-
quent is underivable, these instructions may be followed to the end,
and we finish with a counterexample to the sequent. If following the
instructions does not meet with success, this means that all searches
have terminated with derivable sequents. So we may play this attempt
backwards, and we have a derivation of the sequent. (The similarity to
rules for tableaux is not an accident [85]. See Exercise 10 on page 97.)
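Here is a small sketch, in Haskell, of this backwards reading, using the context-sharing classical rules. The names and the datatype are my own, and the procedure is only a sketch of the idea, not the text’s official algorithm: it applies the rules from bottom to top, branching where a rule has two premises, and returns the atomic sequents left open at the end. An empty result means every branch closed with an identity axiom (so the sequent is derivable); each open leaf gives a counterexample, by making its left-hand atoms true and its right-hand atoms false.

data Form = Atom String | Neg Form | Conj Form Form | Disj Form Form | Impl Form Form
  deriving (Eq, Show)

-- search X Y returns the open atomic leaves of the backwards search for X ` Y.
search :: [Form] -> [Form] -> [([String], [String])]
search = go [] []
  where
    -- lAts and rAts collect the atoms met so far on the left and the right.
    go lAts rAts (l : ls) rs = case l of
      Atom p   -> go (p : lAts) rAts ls rs
      Neg a    -> go lAts rAts ls (a : rs)                               -- ¬L
      Conj a b -> go lAts rAts (a : b : ls) rs                           -- ∧L
      Disj a b -> go lAts rAts (a : ls) rs ++ go lAts rAts (b : ls) rs   -- ∨L
      Impl a b -> go lAts rAts ls (a : rs) ++ go lAts rAts (b : ls) rs   -- →L
    go lAts rAts [] (r : rs) = case r of
      Atom p   -> go lAts (p : rAts) [] rs
      Neg a    -> go lAts rAts [a] rs                                    -- ¬R
      Conj a b -> go lAts rAts [] (a : rs) ++ go lAts rAts [] (b : rs)   -- ∧R
      Disj a b -> go lAts rAts [] (a : b : rs)                           -- ∨R
      Impl a b -> go lAts rAts [a] (b : rs)                              -- →R
    go lAts rAts [] []
      | any (`elem` rAts) lAts = []               -- identity axiom closes the branch
      | otherwise              = [(lAts, rAts)]   -- open leaf: a counterexample

derivable :: [Form] -> [Form] -> Bool
derivable xs ys = null (search xs ys)

For instance, search [Impl (Atom "p") (Atom "q")] [Atom "q"] returns one open leaf with p and q on the right, which is exactly the valuation (p and q both false) refuting p → q ` q, while Peirce’s law comes out derivable.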
2.3.7 | history
[To be written.]
2.3.8 | exercises
basic exercises
q1 Which of the following sequents can be proved in intuitionistic logic?
For those that can, find a derivation. For those that cannot, find a
derivation in classical sequent calculus:
1 : p → (q → p ∧ q)
2 : ¬(p ∧ ¬p)
3 : p ∨ ¬p
4 : (p → q) → ((p ∧ r) → (q ∧ r))
5 : ¬¬¬p → ¬p
6 : ¬(p ∨ q) → (¬p ∧ ¬q)
7 : (p ∧ (q → r)) → (q → (p ∧ r))
8 : p ∨ (p → q)
9 : (¬p ∨ q) → (p → q)
10 : ((p ∧ q) → r) → ((p → r) ∨ (q → r))
q2 Consider all of the formulas unprovable in q1 on page 47. Find deriva-
tions for these formulas, using classical logic if necessary.
q3 Define the dual of a classical sequent in a way generalising the result
of Exercise 14 on page 67, and show that the dual of a derivation of a
sequent is a derivation of the dual of a sequent. What is the dual of a
formula involving implication?
q4 Define A →∗ B as ¬(A ∧ ¬B). Show that any classical derivation of
X ` Y may be transformed into a classical derivation of X∗ ` Y ∗ , where
X∗ and Y ∗ are the multisets X and Y respectively, with all instances
of the connective → replaced by →∗ . Take care to explain what the
transformation does with the rules for implication. Does this work for
intuitionistic derivations?
q5 Consider the rules for classical propositional logic in Figure 2.11. De-
lete the rules for negation. What is the resulting logic like? How does
it differ from intuitionistic logic, if at all?
q6 Define the Double Negation Translation d(A) of formula A as follows:
d(p) = ¬¬p
d(¬A) = ¬d(A)
d(A ∧ B) = d(A) ∧ d(B)
d(A ∨ B) = ¬(¬d(A) ∧ ¬d(B))
d(A → B) = d(A) → d(B)
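A small sketch of this translation in Haskell, for reference; the datatype and names are mine, not part of the exercise.

data Form = Atom String | Neg Form | Conj Form Form | Disj Form Form | Impl Form Form
  deriving (Eq, Show)

-- The double negation translation d, clause by clause as defined above.
d :: Form -> Form
d (Atom p)   = Neg (Neg (Atom p))
d (Neg a)    = Neg (d a)
d (Conj a b) = Conj (d a) (d b)
d (Disj a b) = Neg (Conj (Neg (d a)) (Neg (d b)))
d (Impl a b) = Impl (d a) (d b)

-- For example, d (p ∨ q) is ¬(¬¬¬p ∧ ¬¬¬q).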
intermediate exercises
q7 Using the double negation translation d of the previous question, show
how a classical derivation of X ` Y may be transformed (with a num-
ber of intermediate steps) into an intuitionistic derivation of Xd ` Y d ,
where Xd and Y d are the multisets of the d-translations of each ele-
ment of X, and of Y respectively.
q8 Consider the alternative rules for classical logic, given in Figure 2.12.
Show that X ` Y is derivable using these rules iff it is derivable using
the old rules. Which of these new rules are invertible? What are some
distinctive properties of these rules?
q9 Construct a system of rules for intuitionistic logic as similar as
you can to the classical system in Figure 2.12. Is it quite as nice? Why,
or why not?
q10 Relate cut-free sequent derivations of X ` Y with tableaux refutations
of X, ¬Y [44, 78, 85]. Show how to transform any cut-free sequent
2.4 | circuits
In this section we will look at the kinds of proofs motivated by the two-
sided classical sequent calculus. Our aim is to “complete the square.”
Derivations of X ` A ↔ Proofs from X to A
Derivations of X ` Y ↔ ???
Just what goes in that corner? If the parallel is to work, the structure
is not a straightforward tree with premises at the top and conclusion at
the bottom, as we have in proofs for a single conclusion A. What other
structure could it be?
For our first example of proofs with multiple conclusions as well
as multiple premises, we will not look at the case of classical logic,
for the presence of the structural rules of weakening and contraction
complicates the picture somewhat. Instead, we will start with a logic
without these structural rules—linear logic.
The cut formula (here it is A) is left out, and all of the other material
remains behind. Any use of the cut rule is eliminable, in the usual
manner. Notice that this proof system has no conditional connective.
Its loss is no great thing, as we could define A → B to be ¬(A ⊗ ¬B),
or equivalently, as ¬A ⊕ B. (It is a useful exercise to verify that these
definitions are equivalent, and that they both “do the right thing” by
inducing appropriate rules [→E] and [→I].) So that is our sequent sys-
tem for the moment.
Let’s try to find a notion of proof appropriate for the derivations
in this sequent system. It is clear that the traditional many-premise
single-conclusion structure does not fit neatly. The cut free derivation
of ¬¬A ` A is no simpler and no more complex than the cut free
derivation of A ` ¬¬A.
A ` A                      A ` A
———————— ¬R                ———————— ¬L
` ¬A, A                    ¬A, A `
————————— ¬L               ————————— ¬R
¬¬A ` A                    A ` ¬¬A
The natural deduction proof from A to ¬¬A goes through a stage
where we have two premises A and ¬A and has no active conclusion
(or equivalently, it has the conclusion ⊥).
A [¬A](1)
¬E
∗
¬I,1
¬¬A
In this proof, the premise ¬A is then discharged or somehow otherwise
converted to the conclusion ¬¬A. The usual natural deduction proofs
from ¬¬A to A are either simpler (we have a primitive inference from
¬¬A to A) or more complicated. A proof that stands to the derivation
of ¬¬A ` A would require a stage at which there is no premise but
two conclusions. We can get a hint of the desired “proof” by turning
the proof for double negation introduction on its head:
[The proof above, printed upside down.]

Let’s make it easier to read by turning the formulas and labels the right
way around, and swap I labels with E labels:

      ¬¬A
     —————— ¬E,1
       ∗
     —————— ¬I
[¬A](1)      A
We are after a proof of double negation elimination at least as simple
as this. However, constructing this will require hard work. Notice that
not only does a proof have a different structure to the natural deduc-
tion proofs we have seen—there is downward branching, not upward—
there is also the kind of “reverse discharge” at the bottom of the tree
which seems difficult to interpret. Can we make out a story like this?
Can we define proofs appropriate to linear logic?
To see what is involved in answering this question in the affirmat-
ive, we will think more broadly to see what might be appropriate in
designing our proof system. Our starting point is the behaviour of
each rule in the sequent system. Think of a derivation ending in X ` Y
as having constructed a proof π with the formulas in X as premises
or inputs and the formulas in Y as conclusions, or outputs. We could
think of a proof as having a shape reminiscent of the traditional proofs
from many premises to a single conclusion:
A1 A2 ··· An
B1 B2 ··· Bm
However, chaining proofs together like this is notationally very dif-
ficult to depict. Consider the way in which the sequent rule [Cut]
corresponds to the composition of proofs. In the single-formula-right
sequent system, a Cut step like this:
X ` C    A, C, B ` D
————————————————————— Cut
    A, X, B ` D

corresponds to the composition of the proofs π1 (from X to C) and π2 (from
A, C, B to D) to form the single proof in which the conclusion C of π1 is
plugged into the C premise of π2, yielding a proof from A, X, B to D.
In the case of proofs with multiple premises and multiple conclusions,
this notation becomes difficult if not impossible. The cut rule has an
instance like this:
X ` D, C, E    A, C, B ` Y
——————————————————————————— Cut
    A, X, B ` D, Y, E

This should correspond to the composition of the proofs π1 (from X to
D, C, E) and π2 (from A, C, B to Y). If we are free to rearrange the order
of the conclusions and premises, we could manage to represent the cut:
[diagram: π1 drawn above π2, its conclusions reordered so that the C output
lines up with the C input of π2, leaving D and E as further conclusions,
A and B as further premises, and Y as the conclusions of π2].
It turns out that it is much more flexible to change our notation com-
pletely. Instead of representing proofs as consisting of characters on a
page, ordered in tree diagrams, think of proofs as taking inputs and
outputs, where we represent the inputs and outputs as wires. Wires can
be rearranged willy-nilly—we are all familiar with the tangle of cables
behind the stereo or under the computer desk—so we can exploit this
to represent cut straightforwardly. In our pictures, then, formulas will
label wires. This change of representation will afford another insight:
instead of thinking of the rules as labelling transitions between formu-
las in a proof, we will think of inference steps (instances of our rules)
as nodes with wires coming in and wires going out. Proofs are then
circuits composed of wirings of nodes. Figure 2.13 should give you the
idea.
[Figure 2.13: a proof π drawn as a node with input wires X and output
wires Y; and two proofs π1 and π2 joined by plugging the A output wire of
π1 into the A input wire of π2, giving a circuit with inputs X, X′ and
outputs Y, Y′.]
A proof π for the sequent X ` Y has premise or input wires for
each formula in X, and conclusion or output wires for each formula in
Y. Now think of the contribution of each rule to the development of
inferences. The cut rule is the simplest. Given two proofs, π1 from X
to A, Y, and π2 from X′, A to Y′, we get a new proof by chaining them
together. You can depict this by “plugging in” the A output of π1 into
the A input of π2. The remaining material stays fixed. In fact, this
picture still makes sense if the cut wire A occurs in the middle of the
output wires of π1 and in the middle of the input wires of π2.
(Draw for yourself the result of making two cuts, one after another,
inferring from the sequents X1 ` A, Y1 and X2, A ` B, Y2 and X3, B ` Y3 to
the sequent X1, X2, X3 ` Y1, Y2, Y3. You get two different possible
derivations with different intermediate steps depending on whether you cut
on A first or on B first. Does the order of the cuts matter when these
different derivations are represented as circuits?)
[Diagram: π1 with inputs X and outputs Y, A; π2 with inputs A, X′ and
outputs Y′; the A wire runs from π1 into π2 even though it sits in the
middle of each list of wires.]
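To make the wire picture concrete, here is a minimal sketch in Haskell of circuits as wirings of nodes, with cut as the operation of plugging an output wire into an input wire. The names (Node, Circuit, cut, and so on) are mine, the representation is only one of many possible, and it assumes the two circuits being cut together use disjoint wire identifiers.

type WireId = Int

data Form = Atom String | Not Form | Tensor Form Form | Plus Form Form
  deriving (Eq, Show)

data Node = Node
  { nodeKind :: String              -- e.g. "¬I", "¬E", "⊗I", "⊗E", "⊕I", "⊕E"
  , nodeIns  :: [(WireId, Form)]    -- wires coming into the node
  , nodeOuts :: [(WireId, Form)]    -- wires going out of the node
  } deriving Show

data Circuit = Circuit
  { nodes       :: [Node]
  , premises    :: [(WireId, Form)]   -- dangling input wires of the whole circuit
  , conclusions :: [(WireId, Form)]   -- dangling output wires of the whole circuit
  } deriving Show

-- Cut: plug a conclusion wire of the first circuit into a premise wire of
-- the second.  The two wires must carry the same formula; all the other
-- premises and conclusions are simply carried over.
cut :: WireId -> Circuit -> WireId -> Circuit -> Maybe Circuit
cut outW c1 inW c2 = do
  a <- lookup outW (conclusions c1)
  b <- lookup inW  (premises c2)
  if a /= b then Nothing else Just Circuit
    { nodes       = nodes c1 ++ nodes c2
    , premises    = premises c1 ++ filter ((/= inW) . fst) (premises c2)
    , conclusions = filter ((/= outW) . fst) (conclusions c1) ++ conclusions c2
    }

Nothing in the definition of cut cares where the plugged wire sits among the others, which is exactly the flexibility the tree notation lacks.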
Now consider the behaviour of the connective rules. For negation, the
behaviour is simple. An application of a negation rule turns an output
A into an input ¬A (this is ¬L), or an input A into an output ¬A (this
is ¬R). So, we can think of these steps as plugging in new nodes in
the circuit. A [¬E] node takes an input A and an input ¬A (and has no
outputs), while a [¬I] node has an output A and an output ¬A (and
has no inputs). In other words, these nodes may be represented in the
following way (ignore, for the moment, the little green squares on the
surface of the nodes, and the shade of the nodes; these features have a
significance which will be revealed in good time):

[Two nodes: a ¬e node with input wires A and ¬A and no outputs, and a
¬i node with output wires A and ¬A and no inputs.]
X ` A, Y
————————— ¬L
X, ¬A ` Y

becomes: [the circuit π, with a ¬e node plugged into its output wire A;
the other input of the ¬e node is a new input wire ¬A for the whole circuit]

X, A ` Y
————————— ¬R
X ` ¬A, Y

becomes: [the circuit π, with a ¬i node plugged into its input wire A;
the other output of the ¬i node is a new output wire ¬A for the whole circuit]
and it can be used to combine circuits in the manner of the [⊗R] se-
quent rule: [given a circuit π from X to Y, A and a circuit π′ from X′ to
B, Y′, their A and B output wires are plugged into a ⊗i node, giving a
single circuit from X, X′ to Y, A ⊗ B, Y′]
except for the notational variance and the possibility that it might be
employed in a context in which there are conclusions alongside A ∧ B.
The rule [⊗E], on the other hand, is novel. This rule takes a single
proof π with the two premises A and B and modifies it by wiring to-
gether the inputs A and B into a node which has a single input A ⊗ B.
It follows that we have a node [⊗E] with a single input A ⊗ B and two
outputs A and B.
[The circuit π, with premises A and B among its inputs, becomes a circuit
whose corresponding input is the single wire A ⊗ B: an ⊗e node is plugged
in, its two output wires A and B feeding the old inputs of π.]

In this case the relevant node has one input and two outputs (ignore, for
the moment, the different colour of this node, and the two small circles
on the surface of the node where the A and B wires join; all will be
explained in good time):

[An ⊗e node: one input wire A ⊗ B, two output wires A and B.]
This is not a mere variant of the rules [∧E] in traditional natural deduc-
tion. It is novel. It corresponds to the other kind of natural deduction
rule
             [A, B]
                ⋮
A ⊗ B           C
——————————————————
         C
in which two premises A and B are discharged, and the new premise
A ⊗ B is used in its place.
The extent of the novelty of this rule becomes apparent when you
see that the circuit for [⊕E] also has one input and two outputs, and
the two outputs are A and B, if the input is A ⊕ B. The step for [⊕L]
takes two proofs: π1 with a premise A and π2 with a premise B, and
combines them into a proof with the single premise A ⊕ B. So the node
for [⊕E] looks identical to the node for [⊗E]: it has a single input wire
(in this case, A ⊕ B), and two output wires, A and B.

[The circuits π (with premise A) and π′ (with premise B) become a single
circuit with the one premise A ⊕ B: an ⊕e node is plugged in, its output
wire A feeding π and its output wire B feeding π′.]
The same happens with the rule to introduce a disjunction. The se-
quent step [⊕R] converts the two conclusions A, B into the one conclu-
sion A ⊕ B. So, if we have a proof π with two conclusion wires A and
B, we can plug these into a [⊕I] node, which has two input wires A and
B and a single output wire A ⊕ B.
[The circuit π, with conclusion wires A and B, becomes a circuit with the
single conclusion A ⊕ B: its A and B outputs are plugged into a ⊕i node
whose output wire is A ⊕ B.]
Notice that this looks just like the node for [⊗I]. Yet ⊗ and ⊕ are very
different connectives. The difference between the two nodes is due to
the different ways that they are added to a circuit.
A ` A               B ` B
—————— ¬L           —————— ¬L
A, ¬A `             B, ¬B `
———————————————————————————— ⊕L
A, B, ¬A ⊕ ¬B `
—————————————————— ¬R
A, B ` ¬(¬A ⊕ ¬B)
——————————————————— ⊗L
A ⊗ B ` ¬(¬A ⊕ ¬B)

[The corresponding circuit: an ⊗e node splits the input A ⊗ B into wires
A and B; these feed two ¬e nodes, whose ¬A and ¬B inputs come from an ⊕e
node splitting ¬A ⊕ ¬B; that wire is in turn one output of a ¬i node whose
other output is the conclusion ¬(¬A ⊕ ¬B).]
defines exactly the same circuit. The map from derivations to circuits
is many-to-one.
Notice that the inductive construction of proof circuits provides for a
difference for ⊕ and ⊗ rules. The nodes [⊗I] and [⊕E] combine dif-
ferent proof circuits, and [⊗E] and [⊕I] attach to a single proof circuit.
This means that [⊗E] and [⊕I] are parasitic. They do not constitute
a proof by themselves. (There is no linear derivation that consists
merely of the step [⊕R], or solely of [⊗L], since all axioms are of the
form A ` A.) This is unlike [⊕L] and [⊗R] which can make fine proofs
on their own.
Not everything that you can make out of the basic nodes is a circuit
corresponding to a derivation. Not every “circuit” (in the broad sense)
is inductively generated.
[Three assemblies of nodes that are not inductively generated: an ⊕e node
whose two outputs A and B are wired straight into an ⊗i node (input A ⊕ B,
output A ⊗ B); an ⊗e node whose two outputs are wired straight into an ⊕i
node (input A ⊗ B, output A ⊕ B); and a ¬i node whose outputs A and ¬A are
wired straight into a ¬e node.]
[A sequence of diagrams: the circuit for A ⊗ B ` ¬(¬A ⊕ ¬B) is traversed
node by node, and sequent derivations are attached to the parts already
visited, beginning with derivations of ¬A ` ¬A and ¬B ` ¬B at the ¬e nodes
and of ` ¬(¬A ⊕ ¬B), ¬A ⊕ ¬B at the ⊕e and ¬i nodes, and then combining
these with cuts to derive ` ¬(¬A ⊕ ¬B), ¬A, ¬B.]
Now at last, the switched node ⊗E has both output arrows linked to
the one derivation. This means that we have a derivation of a sequent
with both A and B on the left. We can complete the derivation with a
[⊗L] step. The result is in Figure 2.16.
[diagram to go here]
[Figure: the circuit with the completed derivation of A, B ` ¬(¬A ⊕ ¬B)
attached, the two outputs of the ⊗e node now both feeding the one
derivation.]
[diagram to go here]
1. From ¬A ` ¬A and ¬B ` ¬B, by ⊕L: ¬A ⊕ ¬B ` ¬A, ¬B
2. From ¬A ⊕ ¬B ` ¬A ⊕ ¬B, by ¬R: ` ¬(¬A ⊕ ¬B), ¬A ⊕ ¬B
3. Cut (1, 2): ` ¬(¬A ⊕ ¬B), ¬A, ¬B
4. From A ` A, by ¬L: ¬A, A `
5. Cut (3, 4): A ` ¬(¬A ⊕ ¬B), ¬B
6. From B ` B, by ¬L: ¬B, B `
7. Cut (5, 6): A, B ` ¬(¬A ⊕ ¬B)
8. By ⊗L: A ⊗ B ` ¬(¬A ⊕ ¬B)
[Diagrams: an ⊗i node wired directly into an ⊗e node, a ⊕i node wired
directly into a ⊕e node, and a ¬i node wired directly into a ¬e node.]
[add an example]
[Nodes for classical circuits: a ¬E node with inputs A and ¬A; a ¬I node
with outputs A and ¬A; an ∧I node with inputs A, B and output A ∧ B; ∧E1
and ∧E2 nodes with input A ∧ B and outputs A and B respectively; ∨I1 and
∨I2 nodes with inputs A and B respectively and output A ∨ B.]
The inputs of a node are those wires pointing into the node, and
the outputs of a node are those wires pointing out.
[A circuit π from X to Y extended with a KI node, which has no inputs and
a single output wire B (so B is added to the conclusions); and a circuit π
from X to Y extended with a KE node, which has a single input wire B and
no outputs (so B is added to the premises).]
3 Using an unlinked weakening node like this makes some circuits disconnected.
It also forces a great number of different sequent derivations to be represented by
the same circuit. Any derivation of a sequent of the form X ` Y, B in which B is
weakened in at the last step will construct the same circuit as a derivation in which
B is weakened in at an earlier step. If this identification is not desired, then a more
complicated presentation of weakening, using the ‘supporting wire’ of Blute, Cockett,
Seely and Trimble [8] is possible. Here, I opt for a simple presentation of circuits
rather than a comprehensive account of “proof identity.”
[A circuit with no premises and conclusion ¬(A ∧ ¬A) ∧ ¬(A ∧ ¬A): two
copies of a circuit for ¬(A ∧ ¬A), each built from ¬I, ∧E1, ∧E2, ¬E and WI
nodes, are combined by a final ∧I node.]
` p, ¬p [Id]

` X, A, B              ` X, A    ` X′, B
——————————— ⊕R          ————————————————— ⊗R
` X, A ⊕ B               ` X, X′, A ⊗ B
The circuits are also much simpler. They only have outputs and no
inputs. These are Girard’s proofnets [35].
2.4.7 | exercises
basic exercises
q1 Construct circuits for the following sequents:
1 : ` p ⊕ ¬p
2 : p ⊗ ¬p `
3 : ¬¬p ` p
4 : p ` ¬¬p
5 : ¬(p ⊗ q) ` ¬p ⊕ ¬q
6 : ¬p ⊕ ¬q ` ¬(p ⊗ q)
7 : ¬(p ⊕ q) ` ¬p ⊗ ¬q
8 : ¬p ⊗ ¬q ` ¬(p ⊕ q)
9 : p⊗q`q⊗p
10 : p ⊕ (q ⊕ r) ` p ⊕ (q ⊕ r)
q2 Show that every formula A in the language ⊕, ⊗, ¬ is equivalent to a
formula n(A) in which the only negations are on atomic formulas.
q3 For every formula A, construct a circuit encodeA from A to n(A), and
decodeA from n(A) to A. Show that encodeA composed with decodeA
normalises to the identity arrow on A, and that decodeA composed
with encodeA normalises to the identity arrow on n(A). (If this doesn’t
work for the encode and decode circuits you chose, then try again.)
q4 Given a circuit π1 for A1 ` B1 and a circuit π2 for A2 ` B2 , show how
to construct a circuit for A1 ⊗ A2 ` B1 ⊗ B2 by adding two more nodes.
Call this new circuit π1 ⊗π2 . Now, suppose that τ1 is a proof from B1 to
C1 , and τ2 is a proof from B2 to C2 . What is the relationship between
the proof (π1 ⊗ π2 ) · (τ1 ⊗ τ2 ) (composing the two proofs π1 ⊗ π2 and
τ1 ⊗ τ2 with a cut on B1 ⊗ B2 ) from A1 ⊗ A2 to C1 ⊗ C2 and the proof
(π1 · τ1 ) ⊗ (π2 · τ2 ), also from A1 ⊗ A2 to C1 ⊗ C2 ?
Prove the same result for ⊕ in place of ⊗. Is there a corresponding
fact for negation?
q5 Re-prove the results of all of the previous questions, replacing ⊗ by ∧
and ⊕ by ∨, using the rules for classical circuits. What difference does
this make?
q6 Construct classical circuits for the following sequents
1 : q ` p ∨ ¬p
2 : p ∧ ¬p ` q
3 : p ` (p ∧ q) ∨ (p ∧ ¬q)
4 : (p ∧ q) ∨ (p ∧ ¬q) ` p
5 : (p ∧ q) ∨ r ` p ∧ (q ∨ r)
6 : p ∧ (q ∨ r) ` (p ∧ q) ∨ r
intermediate exercises
q7 The following statement is a tautology:
¬[(p1,1 ∨ p1,2 ) ∧ (p2,1 ∨ p2,2 ) ∧ (p3,1 ∨ p3,2 ) ∧
¬(p1,1 ∧ p2,1 ) ∧ ¬(p1,1 ∧ p3,1 ) ∧ ¬(p2,1 ∧ p3,1 ) ∧ ¬(p1,2 ∧ p3,2 ) ∧ ¬(p2,2 ∧ p3,2 )]
value of n. How does the proof increase in size as n gets larger? Are
there non-normal proofs of Pn that are significantly smaller than any
normal proofs of Pn ?
It can be proved systematically from the way that proofs are construc-
ted.
Proof: We first show that the simplest proofs have this property. That
is, given a proof that is just an assumption, we show that there is no
counterexample in truth tables. But this is obvious. A counterexample
for an assumption A would be a valuation such that v(A) was true and
was at the same time false. Truth tables do not allow this. So, mere
assumptions have the property of being truth table valid. Now, let’s
suppose that we have a proof whose last move is an elimination, from
A → B and A to B and let’s suppose that its constituent proofs, π1
from X to A → B and π2 from Y to A, are truth table valid. It remains
to show that our proof from X and Y to B is truth table valid. If it is
not, then we have a valuation v that makes each formula in X true, and
each formula in Y true, and that makes B false. This cannot be the case,
since v must make A either true or false. If it makes A false, then the
valuation is a counterexample to the argument from Y to A. (But we
have supposed that this argument has no truth table counterexamples.)
On the other hand, if it makes A true, then it makes A → B false (since
B is false) and so, it is a counterexample to the argument from X to
A → B. (But again, we have supposed that this argument has no truth
table counterexamples.) So, we have shown that if our proofs π1 and
π2 are valid, then the result of extending it with an →E move is also
valid.
However, the converse is not the case. Some arguments that are valid
from the perspective of truth tables cannot be supplied with proofs.
Truth tables are good for sifting out some of the invalid arguments,
and for those arguments for which this technique works, a simple
truth table counterexample is significantly more straightforward to
work with than a direct demonstration that there is no proof to be
found. Regardless, truth tables are a dull instrument. Many argu-
ments with no standard proofs are truth table valid. Here are two
examples: (A → B) → B ∴ (B → A) → A and (A → B) → A ∴ A.
Now we will look at ways to refute these arguments.
have the case for atoms. Now suppose we have A → B and the result
holds for A and B. We want to show that there is a proof for X ∴ A → B
if and only if for each Y where Y ⊩ A, we have X, Y ⊩ B. That is, we
wish to show that there is a proof for X ∴ A → B if and only if for
each Y where there’s a proof of Y ∴ A, there is also a proof of X, Y ∴ B.
From left-to-right it is straightforward. If there is a proof from X to
A → B and a proof from Y to A, then extend it by →E to form a proof
from X, Y to B. From right-to-left we may assume that for any Y, if
there’s a proof for Y ∴ A, then there is a proof X, Y ∴ B. Well, there
is a proof of A ∴ A, so it follows that there’s a proof of X, A ∴ B. Use
that proof and apply →I, to construct a proof of X ∴ A → B.
So, our structure is a model in which X ⊩ A in general if and only
if there is a proof for X ∴ A. It is an easy induction on the structure
of X to show that X ⊩ X. It follows, then, that if there is no proof for
X ∴ A, then X itself is a point at which X ⊩ X but X ⊮ A. We have a
counterexample to any invalid argument.
Consider the other discharge policies. If we allow vacuous dis-
charge, then it is straightforward to show that our model satisfies the
preservation condition. If X ∴ A is valid, so is X, B ∴ A. If X ∴ A is
valid, we may discharge a non-appearing B to find X ∴ B → A. We
may then use an assumption of B to deduce X, B ∴ A.
   X
   ⋮
   A
———————— →I,1
B → A        B
——————————————— →E
       A
So, in this model, if vacuous discharge is allowed, the preservation
condition is satisfied. So, we have an appropriate model for affine de-
ductions.
If we allow duplicate discharge, we must do a little work. Our
model we have constructed so far does not satisfy the contraction con-
dition, since the multiset A, A is not the same multiset as the singleton
A. Instead, we work simply with sets of formulas, and proceed as be-
fore. We must do a little more work when it comes to →E. We know
that if we have a proof for X ∴ A → B and one for Y ∴ A then we
have a proof from multiset union X, Y to B. Do we have one for the
set union too? We do, because for any proof from a list of premises to
a conclusion, if we allow duplicate discharges we can construct a proof
in which each premise is used only once.
X, [B, B](1)
      ⋮
      A
  ———————— →I,1
  B → A        B
  ——————————————— →E
         A
In this example, we trade in two uses of B in a proof from X, B, B to
A for one. The rest of the argument goes through just as before. Our
If x ≤ z′ and y ≤ z′ then z ≤ z′
(it’s the least of the upper bounds). If we write the z here as x ∨ y, and
if we utilise the transitivity of ≤, we could write x ≤ x ∨ y as “if v ≤ x
then v ≤ x ∨ y.” Our rules then take the form

v ≤ x              v ≤ y              x ≤ u    y ≤ u
——————————         ——————————         ————————————————
v ≤ x ∨ y          v ≤ x ∨ y            x ∨ y ≤ u

which should look rather familiar. If we think of entailment as an or-
dering among pieces of information (or propositions, or what-have-
you), then disjunction forms a least upper bound on that ordering.
Clearly the same sort of thing could be said for conjunction. Conjunc-
tion is a greatest lower bound:

x ≤ v              y ≤ v              u ≤ x    u ≤ y
——————————         ——————————         ————————————————
x ∧ y ≤ v          x ∧ y ≤ v            u ≤ x ∧ y
Ordered structures in which every pair of elements has a greatest lower
bound (or meet) and least upper bound (or join) are called lattices.
definition 2.5.13 [lattice] An ordered set ⟨P, ≤, ∧, ∨⟩ with operators
∧ and ∨ is said to be a lattice iff for each x, y ∈ P, x ∧ y is the greatest
lower bound of x and y (with respect to the ordering ≤) and x ∨ y is
the least upper bound of x and y (with respect to the ordering ≤).
Consider the two structures below. The one on the left is not a
lattice, but the one on the right is a lattice.
[Two Hasse diagrams: on the left, four elements a, b, c, d, with b and d
each below both a and c; on the right, the five-element lattice with bottom
element f, top element t, and a, b, c pairwise incomparable between them.]

On the left, b and d have no lower bound at all (nothing is below both
of them), and while they have an upper bound (both a and c are upper
bounds of b and d) they do not have a least upper bound. On the other
hand, every pair of objects in the structure on the right has a meet and
a join. They are listed in the tables below:

∧ | f a b c t        ∨ | f a b c t
——+——————————        ——+——————————
f | f f f f f        f | f a b c t
a | f a f f a        a | a a t t t
b | f f b f b        b | b t b t t
c | f f f c c        c | c t t c t
t | f a b c t        t | t t t t t
Notice that in this lattice, the distribution law fails in the following
way:
a ∧ (b ∨ c) = a ∧ t = a ≰ c = f ∨ c = (a ∧ b) ∨ c
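To make the example concrete, here is a small sketch of this five-element lattice in Haskell, together with the valuations discussed in the next paragraph. The names (El, leq, meet, join, value) are mine, chosen for illustration; leq plays the role of the ordering ≤ in the text.

data El = F | A | B | C | T deriving (Eq, Show)

-- f is the bottom, t is the top, and a, b, c are pairwise incomparable.
leq :: El -> El -> Bool
leq F _ = True
leq _ T = True
leq x y = x == y

meet, join :: El -> El -> El
meet x y | leq x y   = x
         | leq y x   = y
         | otherwise = F      -- distinct elements among a, b, c meet at f
join x y | leq x y   = y
         | leq y x   = x
         | otherwise = T      -- ... and join at t

-- The failure of distribution noted above:
--   a ∧ (b ∨ c) = a,  (a ∧ b) ∨ c = c,  and a is not below c.
distributionFails :: Bool
distributionFails = not (meet A (join B C) `leq` join (meet A B) C)

-- A valuation sends atoms to lattice elements and is extended using meet
-- and join; on this reading A ` B holds when value v A `leq` value v B
-- for every valuation v on every lattice.
data Form = Atom String | Conj Form Form | Disj Form Form

value :: (String -> El) -> Form -> El
value v (Atom p)   = v p
value v (Conj x y) = meet (value v x) (value v y)
value v (Disj x y) = join (value v x) (value v y)

Evaluating distributionFails returns True, which is just the counterexample displayed above expressed in code.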
Lattices stand to our logic of conjunction and disjunction in the same
sort of way that truth tables stand to traditional classical propositional
logic. Given a lattice ⟨P, ≤, ∧, ∨⟩ we can define a valuation v on for-
mulas in the standard way.
Proof: The proof takes two parts, “if” and “only if.” For “only if” we
need to ensure that if A ` B has a proof, then for any valuation on any
lattice, v(A) ≤ v(B). For this, we proceed by induction on the construction
of the derivation of A ` B. If the proof is simply the axiom of identity,
then A ` B is p ` p, and v(p) ≤ v(p). Now suppose that the proof is
more complicated, and that the hypothesis holds for the prior steps in
the proof. We inspect the rules one-by-one. Consider ∧L: from A ` R
to A ∧ B ` R. If we have a proof of A ` R, we know that v(A) ≤ v(R).
We also know that v(A ∧ B) = v(A) ∧ v(B) ≤ v(A) (since ∧ is a lower
bound), so v(A ∧ B) ≤ v(R) as desired. Similarly for ∨R: from L ` A
to L ` A ∨ B. If we have a proof of L ` A we know that v(L) ≤ v(A).
This is one example of the way that we can think of our logic in an
algebraic manner. We will see many more later. Before going on to
the next section, let us use a little of what we have seen in order to
reflect more deeply on cut and identity.
the-right is “less” true than ]-on-the-right, but it will not work if the
mismatch is in the other direction.
2.5.4 | exercises
propositional logic:
applications
3
As promised, we can move from technical details to applications of
these results. With rather a lot of proof theory under our collective
belts, we can turn our attention to philosophical issues. In this chapter,
we will look at questions such as these: How are we to understand
the distinctive necessity of logical deduction? What is the distinctively
logical ‘must’? In what way are logical rules to be thought of as defini-
tions? What can we say about the epistemology of logical truths? Can
there be genuine disagreement between rival logical theories, or are all
such discussions a dialogue in which the participants speak different
languages and talk past one another?
In later chapters we will examine other topics, such as generality,
predication, objectivity, modality, and truth. For those, we require a
little more logical sophistication than we have covered to this point.
What we have done so far suffices to equip us to tackle the topics at
hand.
The techniques and systems of logic can be used for many different
things — we can design electronic circuits using simple boolean lo-
gic [97]. We can control washing machines with fuzzy logic [45]. We
can use substructural logics to understand syntax [16, 54, 55] — but
beyond any of those interesting applications, we can use the techniques
of logic to construct arguments, to evaluate them, and to tell us some-
thing about how beliefs, conjectures, theories and statements, fit to-
gether. It is this role of logic that is our topic.
Suppose that A entails B, that there is a proof of B from A. What
can we say about assertions of A and of B? If an agent accepts A, then
it is tempting to say that the agent also ought to accept B, because
B follows from A. But this is far too strong a requirement to take
seriously. Let’s consider why not:
(1) The requirement, as I have expressed it, has many counterexamples.
The requirement has the following form:
Notice that I have a proof from A to A. (It is a very small proof: the
identity proof.) It would follow, if I took this requirement seriously,
that if I accept A, then I ought to accept A. But there are many things
– presumably – that I accept that I ought not accept. My beliefs extend
beyond my entitled beliefs. The mere fact that I believe A does not in
and of itself, give me an entitlement, let alone, an obligation to believe
A. So, the requirement that you ought to accept the consequences of
your beliefs is altogether too strong.
This error in the requirement is corrected with a straightforward
scope distinction. Instead of saying that if A entails B and if you accept
A then you ought to accept B, we should perhaps weaken the condition
as follows:
by holding that those claims to which I am committed are all and only
the consequences of those things I accept. (I will slightly revise this
notion later, but it will do for now.) There are good reasons to
think of commitment in this way. However, it remains that for the
consideration one: Parents of small children are aware that the abil-
ity to refuse, deny and reject arrives very early in life. Considering
whether or not something is the case – whether to accept that some-
thing is the case or to reject it – at least appears to be an ability children
acquire quite readily. At face value, it seems that the ability to assert
and to deny, to say yes or no to simple questions, arrives earlier than
any ability the child has to form sentences featuring negation as an op-
erator. It is one thing to consider whether or not A is the case, and it is
another to take the negation ¬A as a further item for consideration and
reflection, to be combined with others, or to be supposed, questioned,
addressed or refuted in its own right. The case of early development
lends credence to the claim that the ability to deny can occur prior to
the ability to form negations. If this is the case, the denial of A, in the
mouth of a child, is perhaps best not analysed as the assertion of ¬A.
So, we might say that denial may be acquisitionally prior to neg-
ation. One can acquire the ability to deny before the ability to form
negations.
identity: [A : A] is incoherent.
A position consisting of the solitary assertion of A (whatever claim A
might be) together with its denial, is incoherent.
To grasp the import of calling a position incoherent, it is vital to neither
understate it, nor to overstate it. First, we should not overstate the
claim by taking incoherent positions to be impossible. While it might
be very difficult for someone to sincerely assert and deny the same
statement in the same breath, it is by no means impossible. For ex-
ample, if we wish to refute a claim, we may proceed by means of a
reductio ad absurdum by asserting (under an assumption) that claim,
deriving others from it, and perhaps leading on to the denial of some-
thing we have already asserted. Once we find ourselves in this posi-
tion (including the assertion and the denial of the one and the same
claim) we withdraw the supposition. We may have good reasons to put
ourselves in incoherent positions, in order to manage the assertions
and denials we wish to make. To call a position incoherent is not to say
that the combination of assertions and denials cannot be made.
Conversely, it is important to not understate the claim of incoher-
ence. To call the position [X : Y] incoherent is not merely to say that it
is irrational to assert X and deny Y , or that it is some kind of bad idea.
It is much more than that. Consider the case of the position [A : A].
This position is seriously self-defeating in that to take you to assert
A is to take you to rule out denials of A (pending a retraction of that
assertion), to take you to deny A is to take you to rule out assertions of
A (pending a retraction of that denial). The incoherence in the position
is due to the connection between assertion and denial, in that to make
the one is to preclude the other. The incoherence is not simply due to
any external feature of the content of that assertion. As a matter of
X ` A, Y              X, A ` Y
—————————— ¬L          —————————— ¬R
X, ¬A ` Y              X ` ¬A, Y

With conjunction, asserting each member of X is equivalent to asserting
∧X, the conjunction of each member of X. Denying each member of Y is
equivalent to denying ∨Y. (Choose your own favourite way of finding a
conjunction of each member of a finite set. For an n-membered set there
are at least n! ways of doing this.) So, we have X ` Y if and only if
∧X ` ∨Y, if and only if ∧X ∧ ¬∨Y `, if and only if ` ¬∧X ∨ ∨Y.
For each position [X : Y] we have a complex statement ∧X ∧ ¬∨Y
into triviality. (This is not to deny that Roy Cook’s results about logics
in which tonk rules define a connective are interesting [18]. How-
ever, they have little to do with consequence relations as defined here,
as they rely on a definition of logical consequence that is essentially
not transitive—that is, they do not satisfy strengthening.)
Now consider a much more interesting case of nonconservative exten-
sion. Suppose that our language contains a negationlike connective ∼
satisfying the following rules
X`A X, A `
∼L ∼R
X, ∼A ` X ` ∼A
A ` A
—————— ∼L
∼A, A `
————————— ∼R
∼A ` ∼A

A ` A                          A ` A
—————— ¬L                      ————————— ¬R
A, ¬A `                        ` A, ¬A
————————— ∼R                   ————————— ¬L
¬A ` ∼A                        ¬¬A ` A
————————————— ∼L
∼∼A, ¬A `
————————————— ¬R
∼∼A ` ¬¬A
` B, C    X, B, A `
———————————————————— Cut
     X, A ` C
     —————————— ∼L??
     X ` ∼A, C
3.4 | meaning
The idea that the rules of inference confer meaning on the logical con-
nectives is a compelling one. What can we say about this idea? We
have shown that we can introduce the logical constants into a practice
of asserting and denial in such a way that assertions and denials fea-
turing those connectives can be governed for coherence along with the
original assertions and denials. We do not have to say anything more
about the circumstances in which a negation or a conjunction or a dis-
junction is true or false, or to find any thing to which the connectives
‘correspond.’ What does this tell us about the meanings of the items
that we have introduced?
rules and use: The account of the connectives given here, as operat-
ors introduced by means of well-behaved rules concerning coherence,
gives a clear connection to the way that these connectives are used. To
be sure, it is not a descriptive account of the use of the connectives.
Instead, it is a normative account of the use of the connectives, giving
us a particular normative category (coherence) with which to judge the
practice of assertions and denials in that vocabulary. The rules for the
connectives are intimately connected to use in this way. If giving an
account of the meaning of a fragment of our vocabulary involves giv-
ing a normative account of the proprieties of the use of that vocabulary,
the rules for the connectives can at the very least be viewed as a part
of the larger story of the meaning of that vocabulary.
A, B ` A ∧ B A∧B`A A∧B`B
These do not say that a conjunction is true if and only if the conjuncts
are both true, but they come very close to doing so, given that we are
not yet using the truth predicate. These rules tell us that it is never
permissible to assert both conjuncts, and to deny the conjunction. This
is not expressed in terms of truth conditions, but for many purposes it
will have the same consequences. Furthermore, using the completeness
proofs of Section ??? (well, I need to write that bit up), they tell us that
there is no model in which A and
B are satisfied and A ∧ B is not; and that there is no model in which
A ∧ B is satisfied, and in which A or B is not. If we think of satisfaction
in a model as a model for truth, then the truth-conditional account
of the meaning of connectives is a consequence of the rules. We may
think of the reification or idealisation of a coherent position (as given
to us in the completeness proofs) as a model for what is true. We do
not need to reject truth-conditional accounts of the meanings of the
connectives. They are consequences of the definitions that we have
given. Whether or not we take this reading of the truth-conditional
account as satisfying or not will depend, of course, on what we need
the concept of truth to do. We will examine this in more detail later.
rules and translation: What can we say about the connection between
the rules for connectives and how we interpret the assertions and deni-
als of others? Here are some elementary truisms: (a) people do not
have to endorse the rules I use for negation for me to take them to
mean negation by ‘not.’ It does not seem that we settle every question
of translation by looking at these rules. Nonetheless, (b) we can use
these rules as a way of making meaning more precise. We can clarify
meaning by proposing and adopting rules for connectives. Not every
use of ‘and’ fits the rules for ‘∧’. Adopting a precisely delineated co-
herence relation for an item aids communication, when it is achievable.
The rules we have seen are a very good way to be precise about the
behaviour of the connectives. (c) Most crucially, what we say about
translation depends on the status of claims of coherence. If I take what
someone says to be assertoric, I relate what they say and what I am
committed to in the one coherence relation. I keep score by keeping
track of what (by my lights) you are saying. And you do this for me.
I use a coherence relation to keep score of your judgements, and you
do the same for me. There can be disputes over the relation: you can
take some position to be coherent that I do not. This is made explicit
by our logical vocabulary. If you and I agree about the rules for clas-
sical connectives, then if we disagree over whether or not X, A ` B, Y
is coherent, then we disagree (in the context [X : Y]) over the coher-
ence of asserting A → B. Connectives are a way of articulating this
disagreement. Similarly, ¬ is a way of making explicit incompatibility
judgements of the form A, B ` or exhaustiveness judgements of the
form ` A, B.
3.6 | warrant
Discussion of warrant preservation in proof. In what way is classical
inference apt for preservation of warrant, and what is the sense in
which intuitionistic logic is appropriate. [Preservation of warrant in the
case of an argument from X to A. These are verificationally transparent
arguments. Preservation of diswarrant in the case of an argument
from B to Y . These are the falsificationally transparent arguments. Both
are nice. But neither is enough.]
3.8 | realism
Discussion of the status of models and truth in this account. Are we real-
ists? Are these merely useful tools, or something more? (Discussion
of Blackburn’s quasi-realism and its difficulties with logic here. The
Frege/Geach problem is discussed at this point, if not before.)
quantifiers:
tools & techniques
4
4.1 | predicate logic
4.1.1 | rules
4.1.2 | what do the rules mean?
Mark Lance makes the point in his paper “Quantification, Substitu-
tion, and Conceptual Content” [48] that an inference-first account of
quantifiers is sensible, but it isn’t the kind of “substitutional” account
oft mentioned. If we take the meaning of (∀x)A to be given by what
one might infer from it, and what one might use to infer to it, then one can infer from
it to each of its instances (and furthermore, to any other instances we
might get as we might further add to the language). What one might
use to infer to (∀x)A is not just each of the instances A[x/n1 ], A[x/n2 ],
etc. (Though that might work in some restricted circumstances, clearly
it is unwieldy at best and wrong-headed at worst.) The idea behind
the rules is that to infer to (∀x)A you need not just an instance (or
all instances). You need something else. You need to have derived an
instance in a general way. That is, you need to have a derivation of
A[x/n] that applies independently of any information “about n.” (It
would be nice to make some comment about how this sidesteps all of
the talk about “general facts” in arguments about what you need to
know to know that (∀x)A apart from knowing that A[x/n] for each par-
ticular n, but to make that case I would need more space and time than
I have at hand at present.)
                              [Xa](i)
                                 ⋮
φ(a)    a = b                   Xb
—————————————— =E            ———————— =I,i
     φ(b)                     a = b
rules.
[Xa](1)    a = b
————————————————— =E
       Xb           b = c
       —————————————————— =E
              Xc
          ————————— =I,1
             a = c
The sequent rules are these:
Γ ` φ(a), ∆    Γ′, φ(b) ` ∆′              Γ, Xa ` Xb, ∆
————————————————————————————— =L           ——————————————— =R
    Γ, Γ′, a = b ` ∆, ∆′                    Γ ` a = b, ∆
The side condition in =R is that X does not appear in Γ or ∆.
Notice that we need to modify the subformula property further,
since the predicate variable does not appear in the conclusion of =R,
and more severely, the predicate φ does not appear in the conclusion
of =L. Here is an example derivation:
Xa ` Xa                Xb ` Xb
—————————— ¬R          —————————— ¬L
` ¬Xa, Xa              Xb, ¬Xb `
———————————————————————————————— =L
a = b, Xb ` Xa
————————————————— =R
a = b ` b = a
If we have a cut on a formula a = b which is active in both premises of
that rule:
   δX                               δ′                     δ″
   ⋮                                ⋮                      ⋮
Γ, Xa ` Xb, ∆                 Γ′ ` φ(a), ∆′       Γ″, φ(b) ` ∆″
—————————————— =R             —————————————————————————————————— =L
Γ ` a = b, ∆                        Γ′, Γ″, a = b ` ∆′, ∆″
———————————————————————————————————————————————————————————— Cut
              Γ, Γ′, Γ″ ` ∆, ∆′, ∆″
we can eliminate it in favour of two cuts on the formulas φ(a) and φ(b).
To do this, we modify the derivation δX to conclude Γ, φ(a) ` φ(b), ∆,
which we can do by globally replacing Xx by φ(x). (A small sketch of this
replacement, in code, follows the displayed derivation below.) The result
is still a derivation. We call it δφ . Then we may reason as follows:
   δφ                           δ′
   ⋮                            ⋮
Γ, φ(a) ` φ(b), ∆          Γ′ ` φ(a), ∆′
————————————————————————————————————————— Cut          δ″
                                                        ⋮
     Γ, Γ′ ` φ(b), ∆, ∆′                         Γ″, φ(b) ` ∆″
—————————————————————————————————————————————————————————————— Cut
               Γ, Γ′, Γ″ ` ∆, ∆′, ∆″
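Here is the promised sketch of the global replacement, in Haskell. The datatype and the function names are mine, chosen only to illustrate the step: every atomic formula built from the predicate variable X is replaced by the corresponding instance of φ, and mapping this substitution over every sequent of δX yields δφ.

type Term = String

data Form
  = PredVar String Term     -- X a : a predicate variable applied to a term
  | Equals Term Term        -- a = b
  | Neg Form
  | Impl Form Form
  deriving (Eq, Show)

-- Replace every occurrence of the predicate variable x applied to a term t
-- by phi t, leaving every other formula untouched.
substPred :: String -> (Term -> Form) -> Form -> Form
substPred x phi f = case f of
  PredVar y t | y == x -> phi t
  Neg g                -> Neg (substPred x phi g)
  Impl g h             -> Impl (substPred x phi g) (substPred x phi h)
  other                -> other

-- For example, replacing X by the predicate "is not identical to a"
-- turns X b into ¬(b = a).
example :: Form
example = substPred "X" (\t -> Neg (Equals t "a")) (PredVar "X" "b")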
4.3 | models
Traditional Tarski models for classical logic and Kripke intuitionistic
logic motivated on the basis of the proof rules we have introduced. A
presentation of more “modest” finitist semantics in which the domain
is finite at each stage of evaluation, given by the sequent system. A
context of evaluation, in this kind of model, is a finite entity, including
information about “how to go on.”
4.4 | arithmetic
Peano and Heyting arithmetic are introduced as a simple example of
a rigorous system with enough complexity to be truly interesting. Dis-
cussion of the consistency proof for arithmetic. I will point to Gödel’s
incompleteness results, and show how pa + Con(pa) can be seen as
adding to the stock of arithmetic inference principles, in the same way
that adding stronger induction principles does. [32]
quantifiers: applications
5
5.1 | objectivity
Substitutional and objectual quantification and objectivity. The ac-
count of quantification given here isn’t first-and-foremost objectual in
the usual sense, but it can be seen as a semantically anti-realist (that is,
not truth-first) reading of standard, objectual quantification. A defence
of this analysis, and a discussion of the sense in which this provides
properly universal quantification, independently of any consideration
of whether the class of “everything” is a set or can constitute a model.
5.2 | explanation
How do we prove a universal claim? By deriving it. Explanation of
the reasons why people like “universal facts” and why this is better
understood in terms prior to commitment to fact-like entities.
5.3 | relativity
A discussion of ontological relativity, Quine’s criterion for commit-
ment to objects. (We discuss the sense in which logic alone does
not force the existence of any number of things, and why the choice
of ontology depends on the behaviour of names and variables in your
theory.)
5.4 | existence
A discussion of a neo-Carnapian view that to adopt inference prin-
ciples concerning numbers, say, is free. Relating to current discussion
of structuralism, plenitudinous platonism and fictionalism in mathem-
atics.
5.5 | consistency
The essential incompleteness and extendibility of our inference prin-
ciples.
the range of the quantifiers in second order logic as an ideal endpoint
of conceptual expansion.) A discussion of why standard Second Or-
der Logic, so construed, is essentially non-axiomatisable.
references

This bibliography is also available online at http://citeulike.org/user/greg_restall/tag/ptp. citeulike.org is an interesting collaborative site for sharing pointers to the academic literature.

[1] alan ross anderson and nuel d. belnap. Entailment: The Logic of Relevance and Necessity, volume 1. Princeton University Press, Princeton, 1975.

[2] alan ross anderson, nuel d. belnap, and j. michael dunn. Entailment: The Logic of Relevance and Necessity, volume 2. Princeton University Press, Princeton, 1992.
[6] nuel d. belnap. “Tonk, Plonk and Plink”. Analysis, 22:130–134, 1962.
[14] a. carbone. “Interpolants, Cut Elimination and Flow graphs for the
Propositional Calculus”. Annals of Pure and Applied Logic, 83:249–
299, 1997.
[15] a. carbone. “Duplication of directed graphs and exponential blow up of
proofs”. Annals of Pure and Applied Logic, 100:1–67, 1999.
[20] dirk van dalen. “Intuitionistic Logic”. In dov m. gabbay and franz
günthner, editors, Handbook of Philosophical Logic, volume III.
Reidel, Dordrecht, 1986.
[30] kit fine. “Vagueness, Truth and Logic”. Synthese, 30:265–300, 1975.
Reprinted in Vagueness: A Reader [46].
[37] jean-yves girard, yves lafont, and paul taylor. Proofs and Types,
volume 7 of Cambridge Tracts in Theoretical Computer Science. Cam-
bridge University Press, 1989.
[41] jean van heijenoort. From Frege to Gödel: a source book in math-
ematical logic, 1879–1931. Harvard University Press, Cambridge,
Mass., 1967.
[45] rolf isermann. “On Fuzzy Logic Applications for Automatic Control,
Supervision, and Fault Diagnosis”. IEEE Transactions on Systems,
Man, and Cybernetics—Part A: Systems and Humans, 28:221–235,
1998.
[50] paola mancosu. From Brouwer to Hilbert. Oxford University Press,
1998.
[53] peter milne. “Classical Harmony: rules of inference and the meaning
of the logical constants”. Synthese, 100:49–94, 1994.
[56] sara negri and jan von plato. Structural Proof Theory. Cambridge
University Press, Cambridge, 2001. With an appendix by Aarne Ranta.
[60] terence parsons. “Assertion, Denial, and the Liar Paradox”. Journal
of Philosophical Logic, 13:137–152, 1984.
[65] dag prawitz. “Proofs and the Meaning and Completeness of the Logical
Constants”. In j. hintikka, i. niiniluoto, and e. saarinen, editors, Essays
on Mathematical and Philosophical Logic, pages 25–40. D. Reidel,
1979.
[69] graham priest, richard sylvan, and jean norman, editors. Paracon-
sistent Logic: Essays on the Inconsistent. Philosophia Verlag, 1989.
[74] greg restall. “Deviant Logic and the Paradoxes of Self Reference”.
Philosophical Studies, 70:279–303, 1993.
[80] edmund robinson. “Proof Nets for Classical Logic”. Journal of Logic
and Computation, 13(5):777–797, 2003.
[87] neil tennant. Natural Logic. Edinburgh University Press, Edinburgh,
1978.
[89] neil tennant. The Taming of the True. Clarendon Press, Oxford,
1997.