Turing's Revolution
The Impact of His Ideas about Computability

Giovanni Sommaruga • Thomas Strahm
Editors

Giovanni Sommaruga
Department of Humanities, Social and Political Sciences
ETH Zurich
Zurich, Switzerland

Thomas Strahm
Institute of Computer Science and Applied Mathematics
University of Bern
Bern, Switzerland

Preface
June 23, 2012 was Alan Turing’s 100th birthday. In the months preceding and
following this date, there were widespread events commemorating his life and
work. Those of us who had been writing about his work received more invitations
to participate than we could accommodate. Speakers at one of the many Turing
conferences were often invited to contribute to an associated volume of essays. This
book developed from such a meeting, sponsored by the Swiss Logic Society, in
Zurich in October 2012. The table of contents hints at the breadth of developments
arising from Turing’s insights. However, there was little public understanding of this
at the time of Turing’s tragic early death in 1954.
Thinking back on my own 65 years of involvement with Turing’s contributions,
I see my own dawning appreciation of the full significance of work I had been
viewing in a narrow technical way, as part of a gradual shift in the understanding
of the crucial role of his contributions by the general educated public. In 1951,
learning to program the new ORDVAC computer shortly after I had taught a course
on computability theory based on Turing machines, it became clear to me that the
craft of making computer programs was the same as that of constructing Turing
machines. But I was still far from realizing how deep the connection is. In the
preface to my 1958 Computability & Unsolvability, I wrote:
The existence of universal Turing machines confirms the belief . . . that it is possible to
construct a single “all purpose” digital computer on which can be programmed (subject of
course to limitations of time and memory capacity) any problem that could be programmed
for any conceivable deterministic digital computer. ... I was very pleased when [it was]
suggested the book be included in [the] Information Processing and Computers Series.
My next publication that commented explicitly on Turing’s work and its implica-
tions was for a collection of essays on various aspects of current mathematical work
that appeared in 1978. My essay [3] began:
. . . during the Second World War . . . the British were able to systematically decipher [the
German] secret codes. These codes were based on a special machine, the “Enigma” . . . Alan
Turing had designed a special machine for . . . decoding messages enciphered using the
Enigma.
. . . around 1936 [Turing gave] a cogent and complete logical analysis of the notion of
“computation.” . . . [This] led to the conclusion that it should be possible to construct
“universal” computers which could be programmed to carry out any possible computation.
The absolute secrecy surrounding information about the remarkable British effort
to break the German codes had evidently been at least partially lifted. It began
to be widely realized that Turing had made an important contribution to the war
effort. For myself, I had come to understand that the relationship between Turing’s
mathematical construction that he called a “universal machine” and the actual
computers being built in the postwar years was much more than a suggestive
analogy. Rather it was the purely mathematical abstraction that suggested that,
given appropriate underlying technology, an “all-purpose” computing device could
be built.
The pieces began to come together when the paper [1], published in 1977,
explained that Turing had been involved in a serious effort to build a working gen-
eral purpose stored-program computer, his Automatic Computing Engine (ACE).
Andrew Hodges’s wonderful biography (1983) [5] filled in some of the gaps. Finally
Turing’s remarkably prescient view of the future role of computers became clear
when, in [2], Turing’s actual design for the ACE as well as the text of an address
he delivered to the London Mathematical Society on the future role of digital
computers were published. It was in this new heady atmosphere that I wrote my
essay [4] in which I insisted that it was Turing’s abstract universal machine, which
itself had developed in the context of decades of theoretical research in mathe-
matical logic, that had provided the underlying model for the all-purpose stored
program digital computer. This was very much against the views then dominant
in publications about the history of computers, at least in the USA. Meanwhile,
noting that 1986 marked the 50th anniversary of Turing’s universal machine, Rolf
Herken organized a collection of essays in commemoration [6]. There were 28 essays
by mathematicians, physicists, computer scientists, and philosophers, reflecting the
breadth and significance of Turing’s work.
June 28, 2002, 5 days after Turing’s 90th birthday, was Turing Day in Lausanne.
Turing Day was celebrated with a conference at the Swiss Federal Institute of
Technology organized by Christof Teuscher, at the time still a graduate student
there. There were nine speakers at this one-day conference with over 200 attendees.
Teuscher also edited the follow-up volume of essays [7] to which there were 25
contributors from remarkably diverse fields of interest. Of note were two essays
on generally neglected aspects of Turing’s work. One, written by Teuscher himself,
was about Turing’s 1948 work on neural nets. The other recalled Turing’s
work during the final months of his life on mathematical biology, particularly
on morphogenesis: Turing had provided a model showing how patterns such as
the markings on a cow’s hide could be produced by the interactions of two
chemicals. A more bizarre topic, discussed in a number of the essays, was what had
developed at the time under the name hypercomputation: the search for ways to compute
the uncomputable, in effect proposing to carry out infinitely many actions in a finite
time. This was particularly ironic because computer scientists were finding that mere
Turing computability was insufficient for practical computation and were seeking to
classify algorithms according to criteria for feasibility.
As Turing’s centenary approached, there was a crescendo of events bringing
Turing’s importance to the attention of the public. An apology by the British
government for his barbaric mistreatment along with a commemorative postage
stamp was a belated attempt to rectify the unrectifiable. I myself spoke about
Turing at various places in the USA and Britain and also in Mexico, Peru, Belgium,
Switzerland, and Italy, crossing and recrossing the Atlantic Ocean several times
in a few months’ time. A high point was attending a banquet at King’s College,
Cambridge (Turing’s college), with members of Turing’s family present, celebrating
his actual 100th birthday. I’m very fond of Zurich and was delighted to speak
at the conference there from which this volume derives. The perhaps inevitable
consequence of Alan Turing’s developing fame was Hollywood providing in The
Imitation Game its utterly distorted version of his life and work.
Considering the authors whose contributions the editors have managed to gather
for this volume, I feel honored to be included among them. When it turned out
that my good friend Wilfried Sieg and I were writing on very similar subjects,
we agreed to join forces, providing for me a very enjoyable experience. Among
the varied contributions, I particularly enjoyed reading the essay by Jack Copeland
and Giovanni Sommaruga on Zuse’s early work on computers. I had been properly
chastised for omitting mention of Zuse in my The Universal Computer, and I did
add a paragraph about him in the updated version of the book for Turing’s centenary.
However, I had never really studied his work and I have learned a good deal about it
from this excellent essay. The writings of Sol Feferman can always be depended on
to show deep and careful thought. I very much enjoyed hearing Stewart Shapiro’s
talk at the conference in Zurich, and I look forward to reading his essay. Barry
Cooper has, of course, been a driving force in the 2012 celebrations of Alan Turing
and in the “Computability in Europe” movement. Finally, I want to thank Giovanni
Sommaruga and Thomas Strahm for bringing together these authors and editing this
appealing volume.
References
1. B.E. Carpenter, R.W. Doran, The other Turing machine. Comput. J. 20, 269–279 (1977)
2. B.E. Carpenter, R.W. Doran (eds.), A.M. Turing’s ACE Report of 1946 and Other Papers (MIT
Press, New York, 1986)
3. M. Davis, What is a computation?, in Mathematics Today: Twelve Informal Essays, ed. by L.A.
Steen (Springer, New York, 1978), pp. 241–267
4. M. Davis, Mathematical logic and the origin of modern computers, in Studies in the History
of Mathematics (Mathematical Association of America, Washington D.C., 1987), pp. 137–165.
Reprinted in [6], pp. 149–174
5. A. Hodges, Alan Turing: The Enigma (Simon and Schuster, New York, 1983)
6. R. Herken (ed.), The Universal Turing Machine: A Half-Century Survey (Verlag Kemmerer &
Unverzagt/Oxford University Press, Hamburg, Berlin/Oxford, 1988)
7. C. Teuscher (ed.), Alan Turing: Life and Legacy of a Great Thinker (Springer, Berlin, 2004)
Introduction
During roughly the first half of Turing’s short life as a scientist and
philosopher, the notions of computation and computability dazzled the minds of
many of the most brilliant logicians and mathematicians. Among the most prominent
figures of this group were Alonzo Church, Kurt Gödel, Jacques Herbrand,
Stephen Kleene, and Emil Post—and of course Alan Turing. The second quarter
of the twentieth century was such a melting pot of ideas, conceptions, theses, and
theories that it is worthwhile to come back to it and to dive into it over and over
again. That’s what this volume’s first part is about.
What is the point of looking back, of turning to the past? What is the good
of looking at the history, e.g., of the notion of computability? The look backwards
serves to better understand the present. One gets to understand the origin, the core
ideas, or notions at the beginning of a whole development. The systematic value
of the historical perspective may also lie in the insight that there is more to the past
than being the origin of the present: that there is a potential there for alternative
developments, that things could have developed differently, and even that things
could still develop differently. The past, and especially a look at the past, may thereby
become a source of inspiration for new developments. Looking at the past may
contribute to planning the future, if one starts to wonder: Where do we come from?
Where are we going? And where do we want to go?
In the second half of the last century and up to this very day, these ideas
and conceptions, and in particular Turing’s, gave rise to a plethora of new logical
and mathematical theories and fields of research, some of them extensions or variations,
others novel applications of Turing’s conceptions and theories. This volume’s
second part aims at presenting a survey of a considerable part of subsequent
and contemporary logical and mathematical research influenced and sparked off,
directly or indirectly, by Turing’s logical and mathematical ideas, ideas concerning
computability and computation.
These generalizations can take on different shapes and go in different directions.
It is possible to generalize concepts, theses, or theories of Turing’s (e.g., generalizing
his thesis to other structures, or his machine concept to also account for
the infinite). But it is equally possible to take Turing’s fulcrum and, by varying it, [...]

1 The quotations in this and the following sections refer to the article summarized in the respective section.

[...] methodologies are required in computer science in order to study the rich landscape
of computer systems.
After presenting a biographical sketch of Konrad Zuse, Copeland and Som-
maruga “outline the early history of the stored-program concept in the UK and the
US,” and “compare and contrast Turing’s and John von Neumann’s contributions
to the development of the concept.” They go on to argue that, contrary to recent
prominent suggestions, the stored-program concept played a key role in computing’s
pioneering days, and they provide a logical analysis of the concept, distinguishing
four different layers (or “onion skins”) that make up the concept. This layered model
allows them to classify the contributions made by Turing, von Neumann, Eckert
and Mauchly, Clippinger, and others, and especially Zuse. Furthermore, Copeland
and Sommaruga discuss whether Zuse developed a universal computer (as he
himself claimed) or rather a general-purpose computer. In their concluding remarks,
Copeland and Sommaruga “reprise the main events in the stored-program concept’s
early history.” The history they relate begins in 1936, the date of publication of
Turing’s famous article “On Computable Numbers,” and also of Zuse’s first patent
application for a calculating machine, and runs through to 1948, when the first
electronic stored-program computer—the Manchester “Baby”—began working.
Feferman starts with a detailed summary and discussion of versions of the
Church-Turing Thesis (CT) on concrete structures given by sets of finite symbolic
configurations. He addresses the works of Gandy and Sieg as well as Dershowitz and
Gurevich, which all have in common that they “proceed by isolating basic properties
of the informal notion of effective calculability or computation in axiomatic form
and proving that any function computed according to those axioms is Turing com-
putable.” Feferman goes on to note that generalizations of CT to abstract structures
must be considered as theses for algorithms. He starts by reviewing Friedman’s
approach for a general theory of computation on first order structures and the work
of Tucker and Zucker on “While” schemata over abstract algebras. Special emphasis
is put on the structure of the real numbers; in this connection, Feferman also
discusses the Blum, Shub, and Smale model of computation. A notable proposal of a
generalization of CT to abstract structures is the Tucker–Zucker thesis for algebraic
computability. The general notion of algorithm it uses leads to the fundamental
question “What is an algorithm?,” which has been addressed under this title by
Moschovakis and Gurevich in very different ways. Towards the end of his article,
Feferman deals with a specific approach to generalized computability, which has its
roots in Platek’s thesis and uses a form of least fixed point recursion on abstract
structures. Feferman concludes with a sensible proposal, the so-called “Recursion
thesis,” saying that recursion on abstract first order structures (with Booleans {T,F})
belongs to the class he calls Abstract Recursion Procedures, ARP (formerly Abstract
Computation Procedures, ACP).
Tucker and Zucker survey their work over the last few decades on generalizing
computability theory to various forms of abstract algebras. They start with the
fundamental distinction between abstract and concrete computability theory and
emphasize their working principle that “any computability theory should be focused
equally on the data types and the algorithms.” The first fundamental notion is
the notion of While computation over standard many-sorted algebras, given by a
high-level imperative programming language interpreted over many-sorted signatures. After
a precise syntactical and semantical account of this language, Tucker and Zucker
address notions of universality. An interesting question is to consider various
possible definitions of semi-computability in the proposed setting. As it turns out,
different characterizations, extensionally equivalent in basic computability theory
over the natural numbers, are different in the Tucker and Zucker setting. A further
emphasis in the paper is laid on data types with continuous operations like the reals,
which leads to a general theory of many sorted partial topological algebras. Tucker
and Zucker conclude by comparing their models with related abstract models of
computation and by proposing various versions of a generalized Church–Turing
thesis for algebraic computability.
Welch gives a broad survey of models of infinitary computation, many of which
are rooted in infinite time Turing machines (ITTMs). It is the latter model that
has sparked a renewed interest in generalized computability in the last decade.
After explaining the crucial notion of “computation in the limit,” various important
properties of ITTMs are reviewed, in particular, their relationship to Kleene’s higher
type recursion. Furthermore, Welch elaborates on degree theory and the complexity
of ITTM computations as well as on a close relationship between ITTMs, Burgess’
quasi-inductive definitions, and the revision theory of truth. He then considers
variants of the ITTM model that have longer tapes than the standard model.
Afterwards Welch turns to transfinite generalizations of register machines as devised
by Shepherdson and Sturgis, resulting in infinite time register machines (ITRMs) and
ordinal register machines (ORMs); the latter model also has registers for ordinal
values. The last models he explains are possible transfinite versions of the Blum–
Shub–Smale machine, the so-called IBSSMs. Welch concludes by mentioning the
extensional equivalence (on omega strings) of continuous IBSSMs, polynomial time
ITTMs, and the safe recursive set functions due to Beckmann, Buss, and Friedman.
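The “computation in the limit” rule just mentioned can be stated precisely. In the standard ITTM model of Hamkins and Lewis, the content C_i of the i-th tape cell at a limit stage λ is the lim sup of its earlier values; this formalization is added here for reference and is not taken from the chapter:

```latex
\[
C_i(\lambda) \;=\; \limsup_{\alpha \to \lambda} C_i(\alpha) \;=\;
\begin{cases}
1 & \text{if } C_i(\alpha) = 1 \text{ for cofinally many } \alpha < \lambda,\\[2pt]
0 & \text{otherwise,}
\end{cases}
\]
```

while the head is reset to the leftmost cell and the machine enters a distinguished limit state.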
Gurevich reviews various semantics-to-syntax analyses of the so-called species
of algorithms, i.e., particular classes of algorithms given by semantical constraints.
In order to find a syntactic definition of such a species, e.g., a machine model, one
often needs a fulcrum, i.e., a particular viewpoint to narrow down the definition of
the particular species in order to make a computational analysis possible. Gurevich
starts with Turing’s fundamental analysis of sequential algorithms performed by
idealized human computers. According to Gurevich, Turing’s fulcrum was to
“ignore what a human computer has in mind and concentrate on what the computer
does and what the observable behavior of the computer is.” Next, Gurevich turns
to the analysis of digital algorithms by Kolmogorov in terms of Kolmogorov–
Uspenski machines and identifies its fulcrum, namely that computation is thought
of as “a physical process developing in space and time.” Then Gurevich discusses
Gandy’s analysis of computation by discrete, deterministic mechanical devices and
identifies its fulcrum in Gandy’s Principle I, according to which the representation
and working of mechanical devices must be expressible in the framework of
hereditarily finite sets. The fourth example discussed is the author’s own analysis
of the species of sequential algorithms using abstract state machines. Its fulcrum
is: “Every sequential algorithm has its native level of abstraction. On that level, the
states can be faithfully represented by first-order structures of fixed vocabulary in
such a way that the transitions can be expressed naturally in the language of that
fixed vocabulary.”
Turing (implicitly) introduced the notion of Turing reducibility in 1939 by
making use of oracle Turing machines. This preorder induces an equivalence
relation on the continuum, which identifies reals with the same information content
and whose equivalence classes are the so-called Turing degrees. Barmpalias and
Lewis survey order-theoretic properties of degrees of typical reals, whereby sensible
notions of typicality are derived from measure and category. Barmpalias and Lewis
present a detailed history of measure and category arguments in the Turing degrees
as well as recent results in this area of research. Their main purpose is to provide
“an explicit proposal for a systematic analysis of the order theoretically definable
properties satisfied by the typical Turing degree.” Barmpalias and Lewis identify
three very basic questions which remain open, namely: (1) Are the random degrees
dense? (2) What is the measure of minimal covers? (3) Which null classes of degrees
have null upward closure?
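For reference, the notions invoked at the start of this paragraph have standard formal definitions; the following formulation is ours, not the chapter’s:

```latex
\[
A \le_T B \;\iff\; A \text{ is computable by a Turing machine with oracle } B,
\]
\[
A \equiv_T B \;\iff\; A \le_T B \text{ and } B \le_T A,
\qquad
\deg_T(A) \;=\; \{\, B : B \equiv_T A \,\}.
\]
```

The Turing degrees are then the equivalence classes deg_T(A), partially ordered by the relation induced by ≤_T.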
Beklemishev’s paper relates to Turing’s just mentioned paper entitled “Systems
of logic based on ordinals,” which is very well known for its highly influential
concepts of the oracle Turing machine and of relative computability. The main
body of Turing’s 1939 paper, however, belongs to a different area of logic, namely
proof theory. It deals with transfinite recursive progressions of theories in order
to overcome Gödelian incompleteness. Turing obtained a completeness result for
Π₂ statements by iterating the local reflection principle, whereas Feferman in
1962 established completeness for arbitrary arithmetic statements by iteration of
the uniform reflection principle. Turing’s and Feferman’s results have the serious
drawback that the ordinal logics are not invariant under the choice of ordinal repre-
sentations, and their completeness results depend on such artificial representations.
Beklemishev’s approach is to sacrifice completeness in favor of natural choices of
ordinals. He obtains a proof-theoretic analysis of the most prominent fragments
of first order arithmetic by using the so-called smooth progressions of iterated
reflection principles.
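The two reflection principles contrasted here have standard formulations, reproduced for orientation (our addition; Prov_T denotes an arithmetized provability predicate for the theory T):

```latex
\[
\mathrm{Rfn}(T):\quad \mathrm{Prov}_T(\ulcorner \varphi \urcorner) \rightarrow \varphi
\quad \text{for each sentence } \varphi \text{ (local reflection)},
\]
\[
\mathrm{RFN}(T):\quad \forall x\, \bigl(\mathrm{Prov}_T(\ulcorner \varphi(\dot{x}) \urcorner) \rightarrow \varphi(x)\bigr)
\quad \text{for each formula } \varphi(x) \text{ (uniform reflection)}.
\]
```

Local reflection is a schema over sentences, while uniform reflection quantifies over the numerical instances of each formula.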
Juraj Hromkovic’s main claim is that to properly understand Alan Turing’s
contribution to science, one ought to understand what mathematics is and what
the role of mathematics in science is. Hromkovic answers these latter questions
by comparing mathematics with a new, somewhat artificial language: “one creates a
vocabulary, word by word, and uses this vocabulary to study objects, relationships”
and whatever is accessible to the language of mathematics at a certain stage of
its development. For Leibniz, mathematics offered an instrument for automating
the intellectual work of humans. One expresses part of reality in the language of
mathematics or in a mathematical model, and then one calculates by means of
arithmetic. The result of this calculation is again a truth about the investigated part of
reality. Leibniz’ “dream was to achieve the same for reasoning.” He was striving for
a formal system of reasoning, a formal system of logic, analogous to arithmetic, the
formal system for the calculation with numbers. Hilbert’s dream of the perfection [...]

[...] provable do not thereby mean provable by a ZFC-proof, nor could they possibly
mean provable by a formal proof. Thus, he reaches the conclusion that even though
CT is not entirely a formal or ZFC matter, this doesn’t preclude it from being
any less mathematical or from being provable. In order to defend this line of
argument, he opposes the foundationalist model of mathematical knowledge to a
holistic one. According to the latter model, it is possible to explain why Sieg’s proof
of CT might justifiably be called so, although it is neither a purely formal nor a
ZFC proof. It can be called a proof because it is formally sound and because its
premises are sufficiently evident. This, according to Shapiro, is not to say that the
proof does not beg any questions. But he concludes that at least “Sieg’s discourse
and, for that matter, Turing’s are no more question-begging than any other deductive
argumentation. The present theme is more to highlight the holistic elements that go
into the choice of premises, both in deductions generally and in discourses that are
rightly called “proofs,” at least in certain intellectual contexts.”
Soare sets off by explaining what Turing’s Thesis (TT) and Gandy’s Thesis M
are. After characterizing the concept of a thesis as opposed to that of a statement of
facts or a theorem, Soare considers the question whether Turing proved his assertion
(that a function on the integers is effectively calculable iff it is computable by a
so-called Turing machine) “beyond any reasonable doubt or whether it is merely
a thesis, in need of continual verification.” In his sketch of Turing’s
1936 paper, Soare points out that in Turing’s analysis of the notion of a mechanical
procedure, Turing broke up the steps of such a procedure into the smallest steps,
which could not be further subdivided. And when going through Turing’s analysis,
one is left with something very close to a Turing machine designed to carry out those
elementary steps. Soare then relates Gödel’s, Church’s, and Kleene’s reactions to
Turing’s 1936 paper. In 1936, Post independently formulated an assertion analogous
to Turing’s, and he called it a working hypothesis. Somewhat in the same vein,
Kleene in 1943 and especially in 1952 called Turing’s assertion a thesis, which in
the sequel led to the standard usage of “Turing’s Thesis.” Already in 1937, Church
objected to Post’s working hypothesis and in 1980 and 1988 Gandy challenged
Kleene’s claim that Turing’s assertion is a thesis (i.e., could not be proved). Soare
emphasizes that not only Gandy, but later on also Sieg, Dershowitz, and Gurevich
as well as Kripke have presented proofs of Turing’s assertion. Since Soare endorses
those proofs, he comes to the conclusion that “Turing’s Thesis” (TT) should no longer
be called a thesis; it should rather be called Turing’s Theorem.
Barry Cooper’s aim is “to make clearer the relationship between the typing of
information—a framework basic to all of Turing’s work—and the computability
theoretic character of emergent structures in the real universe.” Cooper starts off
with the question of where incomputability comes from. He notes that there is no
notion of incomputability without an underlying model of computation, which is
here provided by the classical Turing machine model. He then observes that whereas
from a logical point of view, a Turing machine is fairly simple, any embodiment of
a particular universal Turing machine as an actual machine is highly non-trivial.
“The physical complexities have been packaged in a logical structure, digitally
coded,” and “the logical view has reduced the type of the information embodied in
the computer.” Cooper first considers the relationship between incomputability and
randomness. The notion of randomness used to describe the lack of predictability
is sometimes taken to be more intuitive than the one of incomputability. Cooper
argues that randomness turns out to be a far more complicated notion, and that
all that could be substantiated “building on generally accepted assumptions about
physics, was incomputability.” He then proceeds from the observation that “global
patterns can often be clearly observed as emergent patterns in nature and in social
environments, with hard to identify global connection to the underlying computa-
tional causal context.” Cooper discusses this gap by reflecting on the relationship
between the halting problem (one of the classical paradigms of incomputability)
and the Mandelbrot set. For him, the Mandelbrot set “provides an illuminating link
between the pure abstraction of the halting problem, and the strikingly embodied
examples of emergence in nature.” Cooper generalizes his observations about this
link by introducing the mathematics of definability. The higher properties of a
structure (those important to understand in the real world) are the large-scale,
emergent relations of the structure, and the connection between these and their
underlying local structure is mathematically formalized in terms of definability.
“Such definability can be viewed as computation over higher type data.” However,
as Cooper explains, “computation over higher-type information cannot be expected
to have the reliability or precision of the classical model.” And he uses the example
of the human brain and its hosting of complex mentality to illustrate this. The
mathematics to be used for this new type of computation is based on the theory of
degrees of incomputability (the so-called Turing degrees) and the characterization
of the Turing definable relations over the structure of the Turing degrees.
We have to close this introduction with the very sad news that Barry Cooper, the
author of the last article of this volume, after a brief illness passed away on October
26, 2015. Barry was the driving force behind the Turing Centenary celebrations
in 2012 and behind the Computability in Europe movement since 2005. His impact and
work as a supporter of Alan Turing are ubiquitous. We dedicate our volume to Barry.
His gentle and enthusiastic personality will be greatly missed.
Acknowledgements
This is our opportunity to thank those people who made a special contribution to this
volume or who contributed to making this a special volume. We’d like to express our
great gratitude to Martin Davis who graciously took upon himself the task of writing
a wonderful preface. We’d also like to express this gratitude to Jack Copeland who
contributed a couple of brilliant ideas the specific nature of which remains a secret
between him and us. Moreover, we are very grateful to the referees of all the articles
for their refereeing job, and especially to those who did a particularly thorough job.
And we’d finally like to thank Dr. Barbara Hellriegel, Clemens Heine and Katherina
Steinmetz from Birkhäuser/Springer Basel, and Venkatachalam Anand from SPi
Content Solutions/SPi Global very much for not despairing about the delay with
which we delivered the goods and for making up for this delay with a very efficient
and very friendly high-speed production process.
Contents

[...]
4 Gandy
   4.1 Gandy’s Species of Algorithms
   4.2 Gandy’s Fulcrum
   4.3 Comments
5 Sequential Algorithms
   5.1 Motivation
   5.2 The Species
   5.3 The Fulcrum
6 Final Remarks
References

The Information Content of Typical Reals
George Barmpalias and Andy Lewis-Pye
1 Introduction
   1.1 The Algorithmic View of the Continuum
   1.2 Properties of Degrees
   1.3 A History of Measure and Category Arguments in the Turing Degrees
   1.4 Overview
2 Typical Degrees and Calibration of Typicality
   2.1 Large Sets and Typical Reals
   2.2 Properties of Degrees and Definability
   2.3 Very Basic Questions Remain Open
3 Properties of the Typical Degrees and Their Predecessors
   3.1 Some Properties of the Typical Degrees
   3.2 Properties of Typical Degrees Are Inherited by the Non-zero Degrees They Compute
4 Genericity and Randomness
References

Proof Theoretic Analysis by Iterated Reflection
L.D. Beklemishev
1 Preliminary Notes
2 Introduction
3 Constructing Iterated Reflection Principles
4 Iterated Π₂-Reflection and the Fast Growing Hierarchy
5 Uniform Reflection Is Not Much Stronger Than Local Reflection
6 Extending Conservation Results to Iterated Reflection Principles
7 Schmerl’s Formula
8 Ordinal Analysis of Fragments
9 Conclusion and Further Work
Appendix 1
Appendix 2
Appendix 3
References
Conceptual Confluence in 1936: Post and Turing

Martin Davis and Wilfried Sieg
Abstract In 1936, Post and Turing independently proposed two models of com-
putation that are virtually identical. Turing refers back to these models in his (The
word problem in semi-groups with cancellation. Ann. Math. 52, 491–505) and calls
them “the logical computing machines introduced by Post and the author”. The
virtual identity is not to be viewed as a surprising coincidence, but rather as a
natural consequence of the way in which Post and Turing conceived of the steps
in mechanical procedures on finite strings. To support our view of the underlying
conceptual confluence, we discuss the two 1936 papers, but explore also Post’s work
in the 1920s and Turing’s paper (Solvable and unsolvable problems. Sci. News 31,
7–23). In addition, we consider their overlapping mathematical work on the word
problem for semigroups (with cancellation) in Post’s (Recursive unsolvability of a
problem of Thue. J. Symb. Log. 12, 1–11) and Turing’s (The word problem in semi-
groups with cancellation. Ann. Math. 52, 491–505). We argue that the unity of their
approach is of deep significance for the theory of computability.
M. Davis (✉)
Department of Computer Science, Courant Institute of Mathematical Sciences, New York
University, 3360 Dwight Way, Berkeley, CA 94704-2523, USA
e-mail: martin@eipye.com
W. Sieg
Department of Philosophy, Carnegie Mellon University, 5000 Forbes Avenue, Pittsburgh, PA
15213, USA
1 Introduction
Princeton was an exciting place for logicians in the mid-1930s. Church had been
appointed as assistant professor of mathematics in 1929 and, together with his
students Stephen C. Kleene and J. Barkley Rosser, started to create a new subject
soon to be called “recursive function theory”.1 Their work was propelled by von
Neumann’s presentation of Gödel’s incompleteness theorems [27] in the fall of
1931 and Gödel’s lectures “On undecidable propositions of formal mathematical
systems” in the spring of 1934 [28]. Paul Bernays, the collaborator of David Hilbert
on proof theory and the related foundational program, visited for the academic year
1935/1936 and gave the lectures [2]. Even before that visit, Bernays and Church
had extensive correspondence concerning the ongoing work in recursion theory; see
[59]. From September 1936 to July 1938, Alan Turing was studying in Princeton as
Church’s Ph.D. student; his dissertation “Systems of logic based on ordinals” was
published in [73].
In 1936, Church founded a new quarterly devoted to contemporary research in
logic: The Journal of Symbolic Logic. Its first volume contained a three-page paper
by Emil L. Post, “Finite combinatory processes: Formulation I”. The editors added
a footnote to the paper that read:
Received October 7, 1936. The reader should compare an article by A.M. Turing, On
computable numbers, shortly forthcoming in Proceedings of the London Mathematical
Society. The present article, however, although bearing a later date, was written entirely
independently of Turing’s.
Post was at the time teaching at City College in New York City and had also contact
with Church, both personal and through correspondence.
As to the remark concerning Post’s paper, it would indeed be readily apparent
to any reader of Turing’s and Post’s articles that their basic idea of “computation”
was the same. Turing presented a machine model; his machines were “supplied
with a ‘tape’ ... divided into sections, called ‘squares’, each capable of bearing a
‘symbol’ ” from a finite list, possibly just two different ones. In the latter case,
we refer to the machine as a two-letter-machine.2 Instead of a tape divided into
squares, Post wrote of a “symbol space ... to consist of a two way infinite sequence
of spaces or boxes”; each box could be either “marked” or “unmarked”. In both
conceptions, at a given instant, one particular square/box was in play, and the
permitted basic operations were to change the symbol in the square/box and/or
to move to an adjacent square/box. Turing imagined these steps carried out by a
simple mechanism with a finite number of states where each step included a possible
change of state. In Post’s formulation, the computation is carried out by a human
“worker” who is following a fixed sequence of instructions including what we would
call a conditional jump instruction. The jump instruction is also part of a program for
Turing’s machine: depending on the symbol in the square and its state, the machine
may follow a different next instruction.

1 The broader history of computability has been described in a number of publications, partly by the participants in the early development in Princeton, for example, [43, 57]. Good discussions are found in, among others, [9, 14, 26, 58, 62, 68]. There are many excellent books on recursion/computability theory, but not many that take Post’s approach as fundamental. We just mention [13, 47, 56, 67].

2 Machines that are deterministic are Turing’s a-machines; if a-machines operate only on 0s and 1s they are called computing machines; see [71], p. 232.
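To make the structural identity of the two formulations concrete, here is a minimal Python sketch of such a machine. The table format, the use of None as a halting state, and the toy example are ours, not Post’s or Turing’s; the point is only that both conceptions reduce to one step function from (state, symbol) to (symbol written, move, next state):

```python
from collections import defaultdict

def run(table, state, steps=50):
    """Step a two-letter machine: (state, symbol) -> (write, move, next_state)."""
    tape = defaultdict(int)          # two-way infinite tape of 0s and 1s
    pos = 0
    for _ in range(steps):
        if state is None:            # None plays the role of a halting state
            break
        write, move, state = table[(state, tape[pos])]
        tape[pos] = write            # change the symbol in the current square
        pos += move                  # move to an adjacent square (or stay put)
    return tape, pos, state

# A toy table (our invention): write a 1, skip a square, repeat; halt on a 1.
table = {
    ("a", 0): (1, 1, "b"),
    ("b", 0): (0, 1, "a"),
    ("a", 1): (1, 0, None),
    ("b", 1): (1, 0, None),
}
tape, pos, state = run(table, "a", steps=6)
print(sorted(i for i, s in tape.items() if s == 1))   # -> [0, 2, 4]
```

Post’s worker corresponds to reading the state component as the number of the instruction currently being followed; the symbol-dependent lookup in the table plays the role of the conditional jump.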
It seems astonishing that, on opposite sides of the Atlantic, two models of
computation were proposed that are virtually identical. We suggest that this virtual
identity is not to be viewed as a surprising coincidence, but rather as a natural
consequence of the way in which Post and Turing conceived of steps in mechanical
procedures on finite strings. To support our view of the underlying conceptual
confluence, we discuss the two 1936 papers, but explore also Post’s work in
the 1920s and Turing’s paper [75]. We consider in addition their overlapping
mathematical work on the word problem for semigroups [54] and for semigroups
with cancellation [74]. We argue that the unity of their approaches is of deep
significance for the theory of computability.3
3 There are other commonalities in their approaches, e.g., in connection with relative computability, which first appeared in Turing’s dissertation and which played such a key role in Post’s later work. However, here we are focusing on their fundamental conceptual analysis.

[...] The models of computation Post and Turing introduced in 1936 are essentially the same. Here is
Turing’s retrospective assertion from the beginning of his [74], p. 491:
The method [of proof] depends on reducing the unsolvability of the problem in question to
a known unsolvable problem connected with the logical computing machines introduced by
Post [49] and the author [71].
Turing points, in a dramatic way, to the structural identity of the computations of his
two-letter machine and those of the worker in [49].
Post formulated no intrinsic reason for the model he presented in this paper, but
he conjectured it to be equivalent to the “Gödel-Church development”. If one con-
siders also his work on canonical systems from the 1920s (discussed below), then
there is an indication of a reason: the model may be viewed as a playful description
of a simple production system to which canonical systems can be reduced. Turing’s
unambiguous substitution puzzles include these simple production systems. In order
to specify such puzzles one is given an unlimited supply of “counters”, possibly of
only two distinct kinds. A finite sequence of counters is an initial configuration,
and the puzzle task is to transform the given configuration into another one using
substitutions from a fixed finite list of rules. Such a puzzle, though not by this
name, is obtained in Sect. 9.I of Turing’s [71] at the very end of his analysis of
mechanical procedures; it can be carried out by a suitably generalized machine that
operates on strings, a string machine, as presented in Sect. 5. A good example of a
substitution puzzle, Turing asserts in [75], is “the task of proving a mathematical
theorem within an axiomatic system”. The abstract investigation of just this task for
parts of Whitehead and Russell’s Principia Mathematica was the starting point of
Post’s work in the 1920s, as we discuss in Sect. 4.
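What such a substitution puzzle amounts to computationally can be sketched in a few lines of Python; the rules below are invented for illustration and nothing here is an algorithm from the paper. Given a finite list of substitution rules, one searches the configurations reachable from the initial one:

```python
from collections import deque

def solvable(start, target, rules, limit=10000):
    """Breadth-first search over the configurations reachable by substitutions."""
    seen, queue = {start}, deque([start])
    while queue and len(seen) < limit:
        s = queue.popleft()
        if s == target:
            return True
        for lhs, rhs in rules:
            i = s.find(lhs)
            while i != -1:                           # every occurrence of lhs
                t = s[:i] + rhs + s[i + len(lhs):]   # one substitution step
                if t not in seen:
                    seen.add(t)
                    queue.append(t)
                i = s.find(lhs, i + 1)
    return False   # no derivation found within the search bound

# Hypothetical rules: a counter 'a' may be doubled, and 'ab' may be swapped.
rules = [("a", "aa"), ("ab", "ba")]
print(solvable("ab", "aaab", rules))   # True: ab -> aab -> aaab
```

The search bound is essential: in general no bound suffices, and whether a given puzzle can be solved at all is exactly the kind of question that the unsolvability results discussed below show to be undecidable.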
The “theorem-proving-puzzle” for first-order logic was not only Post’s problem,
but was also one of the central issues in mathematical logic during the 1920s: is
it decidable by a mechanical procedure whether a particular given statement can
be proved from some assumptions? Commenting on this problem, best known as
Hilbert’s Entscheidungsproblem, von Neumann conjectured in 1926 that it must
have a negative solution and added, “we have no idea how to prove this”. Ten years
later Turing showed that the problem has indeed a negative solution, after having
addressed first the key conceptual issue, namely, to give a precise explication of
“mechanical procedure”.4 The explication by his machine model of computation is
grounded in the analysis that leads, as mentioned, to a substitution puzzle. In parallel
to what we said above about Post’s 1936 model, Turing’s two-letter machine may be
viewed as a playful formulation of a machine to which string machines can provably
be reduced and to which, in turn, the mechanical processes of a human computing
agent can be reduced, if these processes (on strings or other concrete symbolic
configurations) are constrained by finiteness and locality conditions; see Sect. 5.
4 In the same year, Church established the unsolvability of the Entscheidungsproblem—having identified λ-definability and general recursiveness with effective calculability; [6, 7].
The conceptual confluence of Turing’s work with Post’s is rather stunning: Post’s
worker model and Turing’s two-letter machine characterize exactly the same class
of computations. The crucial link is provided by the simple substitution puzzles (with just
two counters) that are uniquely connected to each instance of both models. The
wider confluence indicated above is discussed in detail in Sects. 4 and 5; its very
special character is seen perhaps even more vividly when comparing it with the
contemporaneous attempts to analyze directly the effective calculability of number
theoretic functions.
5 In the early evolution of recursion theory, Gödel’s definition was viewed as being a modification
of a proposal of Herbrand’s—because Gödel presented it that way in his Princeton Lectures. In a
letter to Jean van Heijenoort in 1964, Gödel reasserted that Herbrand had suggested, in a letter, a
definition very close to the one actually presented in [28]. However, the connection of Gödel’s
definition to Herbrand’s work is much less direct; that is clear from the two letters that were
exchanged between Gödel and Herbrand in 1931. John Dawson found the letters in the Gödel
Nachlass in 1986; see [17]. The letters are published in [36]; their intellectual context is discussed
in [61].
Church’s Thesis is, even today, mostly supported by two kinds of reasons. There is, first of all,
quasi-empirical evidence through the fact that in roughly 80 years of investigating
effectively calculable functions we have not found a single one that is not recursive;
that is supported further by the practical experience of hundreds of thousands of
computer programmers for whom it is a mundane matter of everyday experience that
even extremely complex algorithms can be carried out by an appropriate sequence
of very basic operations. There is, secondly, the provable equivalence of different
notions, which was already important for Church in 1935, having just proved with
Kleene the equivalence of recursiveness and λ-definability. This is the “argument
by confluence”, highlighted in [26], and brought out clearly already by Church in
footnote 3 of Church [6]:
The fact, however, that two such widely different and (in the opinion of the author) equally
natural definitions of effective calculability turn out to be equivalent adds to the strength of
the reason adduced below for believing that they constitute as general a characterization of
this notion as is consistent with the usual intuitive understanding of it.
The farther apart the sharply defined notions are, the more strongly does their
equivalence support the claim of having obtained a most general characterization
of the informal notion. In contrast, the direct conceptual confluence of Turing’s and
Post’s work emphasizes the unity of their analyses of mechanical procedures.
The above two central reasons for supporting Church’s Thesis, quasi-empirical
evidence and the argument by confluence, have been complemented by sustained
arguments that attempt to analyze the effective calculability of number theoretic
functions. All the early attempts, from 1936 to 1946, do that in terms of a single core
concept, namely, calculation in a logic or determination of the value of a function in
a calculus or deductive system (in essence generalizing Gödel’s equation calculus).
Church, in Sect. 9 of his classical [6], used the “step-by-step argument” to “prove”
the thesis. Calculations of function values for particular arguments are to be carried
out in a logic and may use only “elementary” steps. These elementary steps are then
taken to be recursive; thus, with subtle circularity, the claim has been established.
Let us call the identification of “elementary” step with “recursive” step Church’s
Central Thesis. The subtle circularity is brought into the open by Hilbert and
Bernays in Supplement II of the book [39]: they explicitly formulate recursiveness
conditions for “deductive formalisms”, define the concept of a reckonable function
(regelrecht auswertbare Funktion) and then show that the functions calculable in
a deductive formalism that satisfies the recursiveness conditions are exactly the
general recursive functions.6 (In fact, Hilbert and Bernays formulate primitive
recursiveness conditions.)
6 How far removed the considerations of Turing (and Post) were from those of Bernays, who
actually wrote Supplement II of Hilbert and Bernays [39], should be clear from two facts: (i)
In a letter to Church of 22 April 1937, Bernays judged Turing as “very talented” and his concept
of computability as “very suggestive”; (ii) Bernays did not mention Turing’s paper in Supplement
II, though he knew Turing’s work very well. Indeed, in his letter to Church Bernays pointed out a
few errors in [71]; Church communicated the errors to Turing and, in a letter of 22 May 1937 to
Bernays, Turing acknowledged them and suggested that he would write a Correction, [72]!
Church and Hilbert & Bernays used a form of argument that underlies the
proof of Kleene’s Normal Form Theorem. This same form of argument also easily
establishes Gödel’s observation in the “Remark added in proof” to his short note
[29]; namely, functions that are calculable in systems of higher-order arithmetic
(even of transfinite order) are already calculable in elementary arithmetic. This
absoluteness phenomenon allowed him to think for the first time, as he mentions
in a letter to Kreisel of 1 May 1968, that his concept of general recursiveness
is, after all, adequate to characterize computability and thus formal theories in
full generality. Ten years after the 1936 absoluteness remark Gödel asserted in
his contribution to the Princeton Bicentennial [31] that kind of absoluteness for
any formal theory extending number theory; the extending theories can now also
be systems of axiomatic set theory. As in Church’s step-by-step argument, the
absoluteness cannot be proved, unless the formality of the extending formal theory
is articulated in a rigorous way, for example by requiring that the proof predicate be
(primitive) recursive.7
Sometime in the late 1930s, Gödel wrote the manuscript [30], apparently notes
for a possible lecture. He formulated a beautifully simplified equation calculus
whose postulates have two characteristics: (I) Each of them is an equation between
terms (built up in the standard way from variables, numerals, and function symbols),
and (II) “. . . the recursive postulates . . . allow [one] to calculate the values of the
function defined”. For the calculation one needs only two rules; (R1) replaces
variables by numerals, and (R2) replaces a term by a term that has been shown
to be equal (to the first term). One gets to the two rules “by analyzing in which
manner this calculation [according to (II)] proceeds”. The two characteristics of
the postulates are, Gödel claims, “exactly those that give the correct definition of
a computable function”. Gödel then goes on to define when a number theoretic
function f is computable: “. . . if there exists a finite number of admissible postulates
in f and perhaps some auxiliary functions g₁, …, gₙ such that every true elementary
equation for f can be derived from these postulates by a finite number of applications
of the rules R1 and R2 and no false elementary equation for f can thus be derived.”
In short, Gödel views his analysis of calculating the value of a function as ultimately
leading to the correct definition of a computable function.
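To illustrate the two rules with a standard example (ours, not from Gödel’s manuscript), take the postulates defining addition:

```latex
\[
f(x, 0) = x, \qquad f(x, S(y)) = S(f(x, y)).
\]
```

Using R1 to replace variables by numerals and R2 to replace a term by one shown equal to it, one calculates, e.g., f(SS0, S0) = S(f(SS0, 0)) = S(SS0), that is, 2 + 1 = 3.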
It seems quite clear that Gödel’s considerations sketched in the last paragraph
are a full endorsement of Church’s Thesis for the concept of general recursive
functions, here called computable functions. At the beginning of Gödel [30],
pp. 166–167, Gödel had emphasized the importance of characterizing “uniform
mechanical procedures” on finite expressions and finite classes of expressions⁸; now [...]

7 In a very informative letter Church wrote on 8 June 1937 to the Polish logician Pepis, the absoluteness of general recursive functions is indirectly argued for. (Church’s letter and its analysis are found in [59].) Gödel’s claim, with “formality” sharpened in the way we indicated, is an almost immediate consequence of the considerations in Supplement II of Hilbert and Bernays [39].

8 Gödel argues at the bottom of p. 166 that the expressions and finite classes of expressions can be mapped to integers (“Gödel numbering”). Thus, he asserts, a “procedure in the sense we want is nothing else but a function f(x₁, …, xᵣ) whose arguments as well as its values are integers and which is such that for any system of integers n₁, …, nᵣ the value can actually be calculated”. So a “satisfactory definition of calculable functions” is needed, and that’s what the definition of computable function yields for Gödel.

9 In [10], p. 10 and [69], p. 214, Gödel’s remark “That this really is the correct definition of mechanical [our emphasis] computability was established beyond any doubt by Turing.” is taken as showing that computability of functions is defined here by reference to Turing machines, i.e., that Gödel at this point had already taken Turing’s perspective. That view can be sustained only if the context we sketched is left completely out of consideration.
The story has been told frequently, especially during the centenary year of 2012,
of how Turing learned from Max Newman’s lectures in Cambridge that although
it was widely believed that there was no algorithm for provability in first-order
logic, no one had proved this, and how he went on to develop his concept of
computability as a key step in demonstrating that this is indeed the case. The story
of how Post came to write [49] begins in the early 1920s and is not nearly so
widely known.10 After completing his undergraduate studies at City College in New
York, he began graduate work at Columbia University, where he participated in a
seminar on Whitehead and Russell’s Principia Mathematica (PM). The seminar was
directed by Cassius J. Keyser (Post’s eventual thesis advisor) and was presumably
devoted to studying the proofs within PM. Post decided for his doctoral dissertation
to study PM from the outside, proving theorems, as he wrote, that are “about the
logic of propositions but are not included therein”. This was a conception totally
foreign to Whitehead and Russell, although it was close to what Hilbert was to call
“metamathematics” and which he had practiced already with great success in his
Foundations of Geometry of 1899, [38].
Post began with the first part of PM, i.e., [81], the subsystem we now call
the “propositional calculus”. Post proved that its derivable propositions were
precisely the tautologies, thus showing that the propositional calculus was complete
and algorithmically decidable. Independently, Bernays had already proved, in his
1918 Göttingen Habilitationsschrift [1], the completeness and decidability of the
propositional logic of PM.11 Next Post wanted to get similar results for first-order
logic (as formulated in PM, Sects. 10 and 11). Thus he began an attack on what
amounted to the same Entscheidungsproblem that was Turing’s starting point. But
working in 1921 Post could still think of finding a decision algorithm. His main
idea was to hide the messy complexities of quantificational logic by seeing it as a
special case of a more general type of combinatorial structure, and then to study
the decision problem for such structures in their full generality. (Post had been
influenced by C.I. Lewis in viewing the system of PM from a purely formal point of
view as a “combinatorial structure”; see Post’s reference in footnote 3 of Post [48]
to [45], Chap. VI, Sect. III.)
Already in his dissertation, Post had introduced such a generalization that he now
made the starting point of his investigation. He called this formulation Canonical
Form A and proved that first-order logic could be expressed in a related form that
he called Canonical Form B. He also showed that Canonical Form A was reducible
to Canonical Form B in the sense that a decision procedure for the latter would lead
to one for the former. However, it was a third formulation Post called Canonical
Form C that has survived and plays a key role in our discussion. This formulation
is defined in terms of so-called Post canonical productions which we now explain.
10 As to Post’s biography, see [15].
The mathematical and philosophical part of Post’s contributions is discussed with great clarity
in [26], pp. 92–98. The unity of their approaches is, however, not recognized; it is symptomatic
that neither [75] nor the overlapping work in [54, 74] is even mentioned.
A very comprehensive and illuminating account of Post’s work is found in [79].
11 His work was published belatedly and only partially in [3]. In addition to the completeness
question, Bernays also investigated the independence of the axioms of PM. He discovered that
the associativity of disjunction is actually provable from the remaining axioms (that were then
shown to be independent of each other).
Let Σ be a given finite alphabet. As usual, write Σ* for the set of finite strings on Σ.
A canonical production on Σ has the form:

g1,0 P1,1 g1,1 P1,2 g1,2 … P1,n1 g1,n1
g2,0 P2,1 g2,1 P2,2 g2,2 … P2,n2 g2,n2
⋮
gk,0 Pk,1 gk,1 Pk,2 gk,2 … Pk,nk gk,nk
⇓
g0 P1 g1 P2 g2 … Pm gm

Here the g’s are given strings on the alphabet Σ, the P’s are variables over strings,
and each of the P’s in the line following the ⇓ also occurs as one of the P’s above the
⇓. A system S in Canonical Form C consists of a finite set of strings on Σ, initial
assertions, together with a finite set of canonical productions. Iteratively applying
the productions to the initial assertions and to the strings so obtained etc., a subset
of Σ* is obtained; Post called this subset the set generated by the system S. The
decision problem for a system S in Canonical Form C was to determine of a given
s ∈ Σ* whether it is in the set generated by S. Post was able to prove that the
decision problem for Canonical Form B is reducible to that for Canonical Form C.
Thus, to solve the decision problem for first-order logic, it remained to solve the
decision problem for systems in Canonical Form C. As a step in that direction, Post
showed how to reduce this decision problem to that for normal systems, a special
kind of system in Canonical Form C with only one initial assertion and productions
of the simple kind

gP ⇒ Pḡ
Finally, “closing the circle”, Post showed that the decision problem for normal
systems is reducible to that for systems in Canonical Form A, thus showing that
the decision problems for Forms A, B, and C are all equivalent.
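For readers who want to experiment, here is a minimal Python sketch (our own illustration; the function names and the toy example are ours, not Post's) that enumerates part of the set generated by a normal system, i.e., a system with one initial assertion and productions of the form gP ⇒ Pḡ.

```python
from collections import deque

# A normal system: one initial assertion and productions g P => P g',
# each rewriting a string that begins with g into its remainder P
# followed by g'. We enumerate (a finite part of) the generated set.

def generate(initial, productions, limit=20):
    generated, seen, queue = [], {initial}, deque([initial])
    while queue and len(generated) < limit:
        w = queue.popleft()
        generated.append(w)
        for g, g2 in productions:
            if w.startswith(g):
                nxt = w[len(g):] + g2        # g P => P g'
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return generated

# Example: move an 'a' or a 'b' from the front to the back,
# doubling each 'b' on the way.
print(generate("ab", [("a", "a"), ("b", "bb")]))
```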
In 1941 Post submitted an article to the American Journal of Mathematics in
which he told the story of his work two decades earlier on unsolvability. The
article was rejected as being inappropriate for a journal devoted to current research.
However, the editor noted that the work on reducing canonical systems to normal
form was new and suggested that Post resubmit just that part. Post followed that
suggestion and the paper [51] was the result. The longer paper was eventually
published in [12] when Post was no longer alive.
The story has been told elsewhere of how Post exhausted himself trying to
solve the decision problem for a special kind of normal system, the so-called tag
systems.12 The intractability of this problem that Post had thought of as merely
an initial step in his project apparently played a significant role in reversing his
thinking [20]. Imagining an enumeration of all normal systems and associating with
each such system the set of all strings on a distinguished symbol a, Post saw that
he could diagonalize, i.e., form the set which contains the string aa…a with n
occurrences of a just in case that string is not generated by the nth normal system;
clearly this set cannot be the set of strings on the letter a generated by any normal
system. On the other hand his whole development argued for the full generating
power of normal systems: beginning with the reducibility of first-order logic to
Canonical Form B which Post believed could be extended to all of PM, taking
into account the power of PM to encapsulate ordinary mathematical reasoning, and
finally noting the reductions of Canonical Form B to Canonical Form C and then
to normal systems. Since the enumeration of normal systems could easily be made
quite explicit, it appeared that, in contradiction to the above, the set obtained
by diagonalization was generated by a straightforward process.
Post saw that the problem lay in the tacit assumption that there was a process
for determining whether a given string is generated by a given normal system. To
escape this dilemma he had to either give up on the generality of normal systems
or accept the conclusion that there is no such process. He chose the latter. Since the
properties of normal systems could readily be formalized within PM, this led to the
heady conclusion that there is no decision procedure for PM itself. Finally, if PM
were complete with respect to propositions asserting that a given string is generated
by a given normal system, then by using PM to generate all propositions that assert
or deny that some particular string is generated by some particular normal system, a
decision procedure for normal systems would be obtained. It followed that PM was
incomplete, even for such propositions. Thus this work at least partially anticipated
what Gödel, Church, and Turing were to accomplish a decade later.13
The very conclusion of the incompleteness of PM argued against accepting its
capabilities in an argument for the generality of the generating power of normal
systems. Post concluded that “. . . a complete analysis would have to be made of
all the possible ways in which the human mind could set up finite processes for
generating sequences”.14
12 Tag systems may be characterized as normal systems in which, for their productions gP ⇒ Pḡ:
1. all of the g’s are of the same length;
2. the ḡ’s corresponding to a given g depend only on its initial symbol;
3. if a given g occurs on the left in one of the productions, then so do all other strings of the same
length having the same initial symbol as that g.
Post discusses Tag in the introductory section of Post [51] and in Sect. 3 of Post [50].
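The behavior this footnote describes is easy to observe concretely. Here is a minimal Python sketch (ours) of the tag system usually associated with Post's investigations: deletion number 3, with rules 0 ↦ 00 and 1 ↦ 1101, whose long-term behavior Post found intractable.

```python
# A tag system: at each step, read the first symbol of the word, append
# the string it maps to, and delete a fixed number of symbols (the
# "deletion number") from the front; halt when the word gets too short.

RULES = {"0": "00", "1": "1101"}
DELETION_NUMBER = 3

def tag_step(word):
    """One step of the tag process; returns None when the word dies out."""
    if len(word) < DELETION_NUMBER:
        return None
    return word[DELETION_NUMBER:] + RULES[word[0]]

def run_tag(word, max_steps=20):
    for _ in range(max_steps):
        print(word)
        nxt = tag_step(word)
        if nxt is None:
            return
        word = nxt

run_tag("1101")
```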
13 Davis [15], DeMol [20], DeMol [21], and Post [50].
Turing emphasizes in [71] at the very outset, in Sect. 1 and referring to Sect. 9,
that he is concerned with mechanical operations on symbolic configurations—
carried out by humans. Indeed, he uses computer to refer to human computing
agents who proceed mechanically; his machines, our Turing machines, are referred
to as machines! Gandy suggested calling a computer in Turing’s sense computor
and a computing machine, as we actually do, computer. In Sect. 9 of Turing [71],
computors operate on symbolic configurations that have been written on paper; the
paper is divided into squares “like a child’s arithmetic book”. However, the two-
dimensional character of the paper is not viewed to be an “essential of computation”,
and the one-dimensional tape divided into squares is taken, without any argument,
as the basic computing space.
Striving then to isolate computor-operations that are “so elementary that it is
not easy to imagine them further divided”, Turing formulates a crucial requirement:
symbolic configurations relevant for a computor’s actions have to be recognized
immediately or at a glance. Because of the reductive step from a two-dimensional
14 Post [50]: p. 408 in [12]; p. 387 in [16]; p. 422 in [55].
15 Gödel [36], pp. 169, 171. The notes that were exchanged between Gödel and Post are in this
volume of Gödel’s Collected Works.
grid to a linear tape, one has to be concerned only with immediately recognizing
sequences of symbols. Turing appeals now to a crude empirical fact concerning
human sensory capacities: it is impossible for a computor to determine at a
glance whether 9889995496789998769 is identical with 98899954967899998769.
This sensory limitation of computors leads directly to boundedness and locality
conditions: (B) there is a bound on the number of symbol sequences a computor can
recognize at a glance, and (L) the operations of a computor must locally modify a
recognized configuration.16
Given that the analysis of a computor’s steps leads to these restrictive conditions,
it is evident that Turing machines operating on strings, string machines, simu-
late computors. Indeed, Turing, having completed the analysis of the computor’s
calculation, asserts: “We may now construct a machine to do the work of this
computer [i.e., computor].” The machine that is constructed is a string machine.
Thus, the general connection of Turing with Post is clear: one just has to notice that
(deterministic) string machines are (unambiguous) substitution puzzles, and that the
latter are a species of Post’s production systems! With respect to string machines
Turing remarks,
The machines just described [string machines] do not differ very essentially from computing
machines as described in §2, and corresponding to any machine of this type a computing
machine can be constructed to compute the same sequence, that is to say the sequence
computed by the computer [i.e., computor].
16 We neglect in our discussion the “states of mind” of the computor. Here is the reason why. Turing
argues in Sect. 9.I that the number of these states is bounded, and Post calls this Turing’s “finite
number of mental states” assumption. However, in Sect. 9.III Turing argues that the computor’s
state of mind [“mental state”] can be replaced in favor of “a more physical and definite counterpart
of it”. In a sense, then, the essential components of a computation have been fully externalized; see
[13], p. 6, for how this is accomplished through the concept of an “instantaneous description”.
Turing’s attitude is certainly much less definite than is Gödel’s view in [32], where
Turing’s work is seen as giving “a precise and unquestionably adequate definition
of the general concept of formal system”; such an adequate definition is provided,
as Turing presents “an analysis of the concept of ‘mechanical procedure’ . . . This
concept is shown to be equivalent with that of a ‘Turing machine’.” In the footnote
attached to this remark Gödel suggests consulting not only Turing’s 1936 paper but
also “the almost simultaneous paper by E.L. Post (1936)”.
Post’s perspective on the openness of the concept “finite combinatory process”,
contrary to Gödel’s, is strikingly indicated in the paper Gödel recommended. Post
envisions there a research program that considers wider and wider formulations
of such processes and has the goal of logically reducing them to formulation 1.
Clearly, that is in the spirit of the investigations he had pursued in the early 1920s.
(What other symbolic configurations and processes he had in mind is discussed
at the beginning of our last section, entitled Concluding Remarks.) Post expresses
in the 1936 paper his expectation that formulation 1 will “turn out to be logically
equivalent to recursiveness in the sense of the Gödel–Church development”.
6 Word Problems
Alonzo Church was struck by the short paper [53] in which Post used the
unsolvability of the decision problem for normal systems to prove the unsolvability
of a kind of string matching problem that Post had called the Correspondence Problem.19
17 In footnote 8, p. 105 of Post [49], he criticizes masking the identification of recursiveness and
effective calculability under a definition as Church had done. This, Post continues, “hides the fact
that a fundamental discovery in the limitations of the mathematizing power of Homo Sapiens has
been made and blinds us to the need of its continual verification.”
18 Post pointed this out in a number of different places: (1) Urquhart on p. 643 of Urquhart [79]
quotes from Post’s 1938 notebook and discusses, with great sensitivity, “an internal reason for
Post’s failure to stake his claim to the incompleteness and undecidability results in time” on
pp. 630–633; (2) in [50], p. 377, Post refers to the last paragraph of [49] and writes: “However,
should Turing’s finite number of mental states hypothesis ... bear under adverse criticism, and an
equally persuasive analysis be found for all humanly possible modes of symbolization, then the
writer’s position, while still tenable in an absolute sense, would become largely academic.”
In [54] Post turned to a problem that Axel Thue had posed in 1914 [70], concerning
finite sets of pairs of productions

P g Q ⇒ P ḡ Q,  P ḡ Q ⇒ P g Q

for strings on an alphabet Σ. Post called such a set of productions a Thue system.
Each such pair of productions enables the substitution in a given string of an
occurrence of ḡ for g or vice versa. For u, v ∈ Σ* we write u ∼ v to indicate
that v can be obtained from u by a finite number of such substitutions; this is clearly
an equivalence relation. Thue sought an algorithm that would determine for a given
Thue system and a given pair of strings u, v ∈ Σ* whether u ∼ v. In [54] it is
proved that there is no such algorithm.21
Rather than beginning with the unsolvability of a problem concerning normal
systems, as he had done in [53], Post made use of Turing machines to deal with
Thue’s problem. In Post’s formulation, a Turing machine was defined in terms of
a finite number of symbols S0, S1, …, Sm and a finite number of states or internal
configurations q1, …, qn. The machine was to act on a “tape” consisting of a linear
two-way-infinite array of cells or squares, each capable of holding a single symbol.
In each of the successive operations of the machine it is in one of the given states
and a single cell is distinguished as the scanned cell. The behavior of the machine
is controlled by a finite number of quadruples, each of which is of one of the three
types:

qi Sj Sk qℓ    qi Sj R qℓ    qi Sj L qℓ
Each quadruple specifies what the machine will do when the machine is in state qi
and the scanned cell contains the symbol Sj . For a quadruple of the first type the
action is to replace Sj by Sk in the scanned cell. For a quadruple of the second type it
is for the machine to move to the cell immediately to the right of the current one, the
new scanned square, while the tape contents remain unchanged. For a quadruple of
the third type, the new scanned square, similarly, will be the one to the left. In all
19 This problem later turned out to be a useful tool for obtaining unsolvability theorems regarding
Noam Chomsky’s hierarchy of formal languages, which by the way, was itself based quite
explicitly on Post production systems. See also the extended discussion of the correspondence
problem in [79], p. 648.
20 It is interesting that Markov’s proof [46] of the unsolvability of Thue’s problem, which was quite
independent of Post’s, did use normal systems.
21 Of course this is to be understood in relation to the Church–Turing Thesis.
three cases, the new state of the machine is to be qℓ. The deterministic behavior of
the machine is enforced by requiring that no two of the quadruples are permitted to
begin with the same pair qi Sj.22 The machine is to halt when it arrives at a state qi
scanning a symbol Sj for which no quadruple beginning qi Sj is present.
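The quadruple formulation translates directly into a few lines of Python. The following sketch is ours (the example machine is ad hoc); it represents the two-way-infinite tape as a dictionary and halts exactly when no quadruple applies.

```python
# Simulate a Turing machine in Post's quadruple formulation.
# A quadruple (qi, Sj, act, ql) means: in state qi scanning Sj,
# perform act (a symbol Sk, or "R", or "L") and enter state ql.

def run(quadruples, max_steps=100):
    table = {(q, s): (act, q2) for (q, s, act, q2) in quadruples}
    tape, pos, state = {}, 0, "q1"          # blank tape: every cell holds S0
    for _ in range(max_steps):
        scanned = tape.get(pos, "S0")
        if (state, scanned) not in table:   # no quadruple: the machine halts
            return tape, pos, state
        act, state = table[(state, scanned)]
        if act == "R":
            pos += 1
        elif act == "L":
            pos -= 1
        else:
            tape[pos] = act                 # overwrite the scanned cell
    raise RuntimeError("no halt within step bound")

# Example: write S1, move right, write S1, then halt (no quadruple applies).
example = [("q1", "S0", "S1", "q2"),
           ("q2", "S1", "R",  "q3"),
           ("q3", "S0", "S1", "q4")]
print(run(example))
```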
The machine is to begin in state q1 with all cells containing the symbol S0 ,
thought of as a blank. Hence at all stages in the machine’s computation, all but
a finite number of the cells will still contain S0, as shown below:

… S0 S0 Si1 Si2 … Sj … Sir S0 S0 …
In the diagram, the cell containing Sj is intended to represent the square currently
scanned in the course of a machine computation. To represent the tape contents by
a finite string, Post introduced a special symbol h to serve as beginning and end
markers delimiting a region of the tape beyond which, in both directions, all cells
contain S0 . To represent a situation in which the tape is as depicted and the machine
is in state qi, Post used the string:

h Si1 Si2 … qi Sj … Sir h

The indicated initial situation can thus be represented by the string h q1 S0 h, and a
machine’s computation can be represented as a sequence of finite strings of this
form, showing the situation at its successive stages. Post provided productions
corresponding to each of the quadruples whose indicated action the machine is
carrying out and that have the effect of producing the transition from one term of
this sequence to the next. Thus, corresponding to each quadruple of the machine of
the form qi Sj Sk qℓ, Post used the corresponding production

P qi Sj Q ⇒ P qℓ Sk Q
The quadruples that call for the machine to move to the left or the right require
special productions to deal with the marker h and to lengthen the string to include
one additional (blank) symbol from the tape. So, for quadruples of the form qi Sj R qℓ
calling for motion to the right, Post introduced the productions:

P qi Sj Sk Q ⇒ P Sj qℓ Sk Q  (one for each symbol Sk),
P qi Sj h ⇒ P Sj qℓ S0 h.
Likewise for quadruples qi Sj L qℓ calling for motion to the left, Post used the
productions:

P Sk qi Sj Q ⇒ P qℓ Sk Sj Q  (one for each symbol Sk),
h qi Sj Q ⇒ h qℓ S0 Sj Q.

Finally, for each pair qi Sj for which no quadruple is present (so that the machine
halts), there is a production introducing a new symbol q̄, together with “cleanup”
productions by which q̄ successively erases the adjacent tape symbols.
22 In the formulation of Turing [71], the tape is infinite in only one direction and a machine’s
operations are specified by quintuples allowing for a change of symbol together with a motion to
the left or right as a single step. Of course this difference is not significant.
So beginning with the initial string h q1 S0 h and applying these productions, a string
containing q̄ will occur precisely when the given Turing machine has halted. In
such a case the “cleanup” productions lead to the string h q̄ h. Thus we see that a
given Turing machine beginning with a blank tape will eventually halt if and only
if, beginning with the initial string h q1 S0 h and applying this system of productions,
the string h q̄ h is eventually reached. But in fact, and crucially, we can claim more:
namely, a Turing machine beginning with a blank tape will eventually halt if and
only if h q1 S0 h ∼ h q̄ h in the Thue system obtained by adding the productions
obtained by interchanging the left and right sides of each of the above productions.
Thus an algorithm to solve Thue’s problem could be used to determine whether the
given Turing machine will ever halt, and hence, there can be no such algorithm.
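To make the reduction tangible, here is a small Python sketch (ours, with ad hoc names; the halting and cleanup productions for q̄ are omitted for brevity) that builds the forward productions for a given set of quadruples and replays a computation as a sequence of strings. Adding the interchanged (reverse) productions would yield the corresponding Thue system.

```python
# Build Post's forward productions for a quadruple Turing machine and
# run them from the initial string "h q1 S0 h". Strings are tuples of
# symbol names; a production (lhs, rhs) rewrites one occurrence of lhs.

SYMS = ["S0", "S1"]

def productions(quadruples):
    prods = []
    for (qi, sj, act, ql) in quadruples:
        if act == "R":
            for sk in SYMS:
                prods.append(((qi, sj, sk), (sj, ql, sk)))
            prods.append(((qi, sj, "h"), (sj, ql, "S0", "h")))
        elif act == "L":
            for sk in SYMS:
                prods.append(((sk, qi, sj), (ql, sk, sj)))
            prods.append((("h", qi, sj), ("h", ql, "S0", sj)))
        else:
            prods.append(((qi, sj), (ql, act)))
    return prods

def step(word, prods):
    for lhs, rhs in prods:
        for i in range(len(word) - len(lhs) + 1):
            if word[i:i + len(lhs)] == lhs:
                return word[:i] + rhs + word[i + len(lhs):]
    return None  # no production applies

word = ("h", "q1", "S0", "h")
prods = productions([("q1", "S0", "S1", "q2"), ("q2", "S1", "R", "q3")])
while word is not None:
    print(" ".join(word))
    word = step(word, prods)
```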
It will be helpful in discussing the above claim to refer to the original productions
as the forward productions and the new productions we have added as the reverse
productions. So, suppose that h q1 S0 h ∼ h q̄ h, and let the sequence of strings

h q1 S0 h = u1, u2, …, un = h q̄ h

be the successive steps applying a mix of forward and reverse productions that
demonstrates that this is the case. Post showed how to eliminate the use of reverse
productions; namely, let the transition from us−1 to us be the last occurrence of a
use of a reverse production. Then s < n because no reverse production can lead to
h q̄ h. So the transition from us to us+1 is via a forward production. But us−1 can also
23 Post works with the closely related unsolvability of the problem of whether a particular
distinguished symbol will ever appear on the tape, because unlike the halting problem, it appears
explicitly in [71]. But dissatisfied with the rigor of Turing’s treatment, Post outlines his own proof
of that fact. He also includes a careful critique pointing out how a convention that Turing had
adopted for convenience in constructing examples had been permitted to undermine some of the
key proofs. Anyhow, for the present application to Thue’s problem, Post begins by deleting all
quadruples for which the distinguished symbol is the third symbol of the four, thus
turning the unsolvability result into one about halting.
be obtained from us via the forward production from which the reverse production
was formed that had enabled the transition from us−1 to us. And by the construction,
corresponding to the deterministic character of Turing machines, only one forward
production is applicable to us. Hence us−1 = us+1, and nothing is lost if us and us+1
are simply deleted from the sequence. Repeating this process, all uses of reverse
productions are successively eliminated.
Thue’s problem is also known as the word problem for semigroups. This is
because the concatenation of strings is an associative operation. The word problem
for cancellation semigroups is obtained if one adds the conditions

uv ∼ uw implies v ∼ w,    vu ∼ wu implies v ∼ w,

where u, v, w are any strings. Adding these conditions makes the word problem much
more complicated, and it is therefore far more difficult to prove its unsolvability.
The proof of this unsolvability given in [74] is too intricate for us to say much here
about the detailed constructions.24 But what is very relevant to the theme of this paper
is the extent to which Turing placed himself in the same tradition as Post. To
begin with he acknowledged the formulation in [49] as making Post an equal co-
originator of what we (and Post) call the Turing machine. He adopted much of
Post’s methodology including a two-way-infinite tape and the representation of a
stage in a computation by a finite string that begins and ends with the same special
symbol and includes a representation of the scanned square, the machine’s state, and
the tape content. In fact he makes a point of acknowledging Post’s invention of this
device as a significant contribution. One complication in Turing’s proof that might
be mentioned is that strings representing a particular stage of a computation now
come in two flavors: in addition to Post’s strings in which the symbol representing
a machine state is placed to the left of the scanned symbol, Turing also uses such
a string in which the state symbol is to the right of the scanned symbol. Turing did
acknowledge the validity of the critique in [54] referred to above, but clearly did not
attach the importance to it that Post apparently did.
We have seen earlier that in [75] we find a remarkable confluence in the
conceptual apparatus employed by Post and Turing. The technical mathematics
discussed in this section indicates how this apparatus allowed the intelligible and
vivid framing of specific mathematical problems and helped to make fruitful this
seeming synchronicity. In [75] (discussed extensively above) Turing had argued
that if there were a general method for determining whether the goal of any
given substitution puzzle is reachable, that method would itself take the form of
a substitution puzzle. Then using the Cantor diagonal method he concluded that
there is no such method, that the problem of finding such a method is unsolvable.
Combining all this with Turing’s aside to the effect that the task of finding a proof
of a given assertion in a formal system can itself be viewed as such a puzzle, one
arrives once more at the unsolvability of the Entscheidungsproblem.
24 Turing’s proof was published in the prestigious Annals of Mathematics. Eight years later an
analysis and critique of Turing’s paper [4] was published in the same journal. Boone found the
proof to be essentially correct, but needing corrections and expansions in a number of the details.
7 Concluding Remarks
In the previous section, we saw that important unsolvability results were obtained
by reducing Turing machine computations to moves in suitable “unambiguous
substitution puzzles” and by exploiting the unsolvability of the halting or printing
problem. The joining of complementary techniques, perspectives and results is truly
amazing. Let us briefly return to the foundational problem that was addressed
in Sect. 5. The issue of Turing’s Central Thesis, associating with the informal
notion “finite symbolic configuration” the precise mathematical notion of “finite string of
symbols”, was not resolved. In their 1936 papers, both Turing and Post consider or
suggest considering larger classes of mathematical configurations; that move is to
make Turing’s Central Thesis inductively more convincing and turn Post’s working
hypothesis into a natural law. In [79], p. 643, it is asserted that “Post evidently had
plans to continue in a series of articles on the topic . . . showing how ‘Formulation
1’ could be extended to broader computational models.” Urquhart reports that some
work on a “Formulation 2” is contained in Post’s notebooks in Philadelphia. Post
considers there in particular “rules operating in a two-dimensional symbol space,
somewhat reminiscent of later work on cellular automata, such as Conway’s Game
of Life . . . ”.
Let us mention some later concrete work that involves such more general classes
of configurations. Kolmogorov and Uspensky considered in their [44] particular
kinds of graphs. Sieg and Byrnes [65] generalized the K&U-graphs to K-graphs and
conceived operations on them as being given by generalized Post production rules.
Finally, Gandy in [25] introduced discrete dynamical systems that also permitted,
ironically, modeling the parallel computations of Conway’s Game of Life and
other cellular automata. However, in these examples an appeal to the appropriate
Central Thesis cannot be avoided if one wants to argue for the full adequacy of
the mathematically rigorous notion. The open-endedness of considering ever more
encompassing classes of configurations may have been the reason why Turing in
[75] thought that the variant of his Thesis must remain indefinite and that this very
statement is one “which one does not attempt to prove”.25 This may also have
25 Gandy, in [25], uses particular discrete dynamical systems to analyze “machines”, i.e., discrete
mechanical devices, and “proves” a mechanical thesis (M) corresponding to Turing’s thesis.
Dershowitz and Gurevich in [22] give an axiomatization and claim to prove Church’s Thesis. (For
been the reason for Post’s call for the “continual verification” of the natural law
that has been discovered. For Post this “continual verification” took the form of an
implied duty that any process claimed on intuitive grounds to be mechanically
calculable be accompanied by a rigorous proof that the process falls under one of
the various equivalent explications that have been set forth. In [52]
this duty is not fulfilled in the printed account based on an invited address to the
American Mathematical Society. However, Post assures readers that “. . . with a few
exceptions . . . we have obtained formal proofs of all the consequently mathematical
theorems here developed informally.” Post goes on to say, “Yet the real mathematics
involved must lie in the informal development. For . . . transforming [the informal
proof] into the formal proof turned out to be a routine task.” Researchers have
long since given up any pretense that they subject their complex arguments to this
“continual verification” and no one suggests that this raises any doubt about the
validity of the results obtained.
Can one avoid the appeal to a “Central Thesis”? Sieg suggested in [60] a positive
answer, namely, by introducing a more abstract concept of computable processes;
that concept is rooted in Post and Turing’s way of thinking about mechanical, local
manipulations of finite symbolic configurations. Analogous abstraction steps were
taken in nineteenth century mathematics; a pertinent example is found already in
[18]. Call a set O an “ordered system” if and only if there is a relation R on O
such that (i) R is transitive, (ii) if for two elements x and y of O the relation R
holds, then there are infinitely many elements between x and y, and (iii) every
element x of O determines a cut.26 Consider the rational numbers with their ordinary
“x < y” relation and the geometric line with the relation “p is to the left of q”.
Both the specific sets with their relations fall under this abstract concept. There is
an abundance of such structural definitions throughout modern mathematics; for
example, groups, fields, topological spaces.
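One possible formal rendering of the three conditions on an ordered system (ours; clause (ii) is stated as density, from which infinitely many intermediate elements follow by iterating it, and clause (iii) spells out the cut condition of footnote 26):

```latex
\begin{align*}
\text{(i)}\;&  \forall x\,\forall y\,\forall z\ \bigl(R(x,y)\wedge R(y,z)\rightarrow R(x,z)\bigr)\\
\text{(ii)}\;& \forall x\,\forall y\ \bigl(R(x,y)\rightarrow \exists z\,(R(x,z)\wedge R(z,y))\bigr)\\
\text{(iii)}\;& \forall x\,\exists O_1\,\exists O_2\ \bigl(O_1\cup O_2=O \;\wedge\; O_1\cap O_2=\emptyset \;\wedge\; O_1\neq\emptyset \;\wedge\; O_2\neq\emptyset\\
&\qquad\;\wedge\; x\in O_1 \;\wedge\; \forall u\in O_1\,\forall v\in O_2\ R(u,v)\bigr)
\end{align*}
```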
For some of the abstract notions representation theorems can be established
stating that every model of the abstract notion is isomorphic to a “concrete” model.
Here are two well known examples: every group is isomorphic to a permutation
group; every Boolean algebra is isomorphic to a Boolean algebra of sets. A suitable
representation theorem can also be proved for computable discrete dynamical
systems. In [60] the abstract notion of a Turing computor was introduced as a
computable discrete dynamical system (over hereditarily finite sets with a countably
infinite set of urelements), and it was established that the computations of any model
of the abstract notion can be reduced to computations of a Turing machine. What has
been achieved? Hilbert called the characteristic defining conditions for structures
a discussion of this claim, see [63].) In the context of our discussion here, one can say that Gandy
and Dershowitz & Gurevich introduce very general models of computations—“Gandy Machines”
in Gandy’s case, “Abstract State Machines” in Dershowitz and Gurevich’s case—and reduce them
to Turing machine computations.
26 A cut in O (determined by x) is a partition of O into two non-empty parts O1 and O2, such that
all the elements of O1 stand in the relation R to all the elements of O2 (and x is taken to be in O1).
“axioms”, and so do we when talking about the axioms for groups or rings. In that
sense, an axiomatic analysis of “mechanical procedures that can be carried out by
computors” can be given—building on natural boundedness and locality conditions.
The methodological problems have not been removed, but they have been deeply
transformed: they concern axioms, no longer statements whose status is somewhere
between a definition and a theorem; they are no longer unusual, but rather common
and difficult, as they ask us to assess the correctness and appropriateness of axioms
for an intended, albeit abstract concept. The central role Turing machines and
Post systems play for the theory of computability is secured by the representation
theorem.
References
21. L. DeMol, Generating, solving and the mathematics of Homo Sapiens. Emil Post’s views on
computation, in A Computable Universe: Understanding and Exploring Nature as Computa-
tion, ed. by H. Zenil, World Scientific Publishing, Singapore, 2013, pp. 45–62
22. N. Dershowitz, Y. Gurevich, A natural axiomatization of computability and proof of Church’s
thesis. Bull. Symb. Log. 14, 299–350
23. W.B. Ewald (ed.), From Kant to Hilbert: A Source Book in the Foundations of Mathematics,
vol. II (Oxford University Press, Oxford)
24. W.B. Ewald, W. Sieg (eds.), David Hilbert’s Lectures on the Foundations of Arithmetic and
Logic, 1917–1933 (Springer, Berlin)
25. R. Gandy, Church’s thesis and principles for mechanisms, in The Kleene Symposium, ed. by J.
Barwise, H.J. Keisler, K. Kunen (North-Holland Publishing Company, Amsterdam), pp. 123–
148
26. R. Gandy, The confluence of ideas in 1936, in The Universal Turing Machine: A Half-Century
Survey, ed. by R. Herken (Oxford University Press, Oxford), pp. 55–111
27. K. Gödel, Über formal unentscheidbare Sätze der Principia Mathematica und verwandter
Systeme I. Monatshefte für Mathematik und Physik 38, 173–198. [Reprinted [33] pp. 144–194
(even numbered pages). Translated in: [80] pp. 596–628; [12, 16] pp. 5–38; [33] pp. 145–195
(odd numbered pages)]
28. K. Gödel, On Undecidable Propositions of Formal Mathematical Systems. Notes on Lectures at
the Institute for Advanced Study, Princeton, ed. by S.C. Kleene, J.B. Rosser [Reprinted [12, 16]
pp. 41–74; [33] pp. 346–369]
29. K. Gödel, Über die Länge von Beweisen. Ergebnisse eines mathematischen Kolloquiums,
Heft 7, pp. 23–24 [Reprinted [33] pp. 396–398 (even numbered pages). Translated in: [12, 16]
pp. 82–83; [33] pp. 397–399 (odd numbered pages)]
30. K. Gödel, Undecidable diophantine propositions, in [35], pp. 164–175
31. K. Gödel, Remarks before the Princeton bicentennial conference on problems in mathematics,
in [34], pp. 150–153
32. K. Gödel, Postscriptum to [28], in [12, 16], pp. 71–73; [33] pp. 369–371
33. K. Gödel, Collected Works, vol. I, ed. by S. Feferman et al. (Oxford University Press, Oxford)
34. K. Gödel, Collected Works, vol. II, ed. by S. Feferman et al. (Oxford University Press, Oxford)
35. K. Gödel, Collected Works, vol. III, ed. by S. Feferman et al. (Oxford University Press, Oxford)
36. K. Gödel, Collected Works, vol. V, ed. by S. Feferman et al. (Oxford University Press, Oxford)
37. J. Herbrand, Sur la non-contradiction de l’arithmétique. Crelles Journal für die reine und
angewandte Mathematik 166, 1–8 [Translated in [80], pp. 618–628]
38. D. Hilbert, Grundlagen der Geometrie, in Festschrift zur Feier der Enthüllung des Gauss-
Weber-Denkmals in Göttingen (Teubner, Leipzig), pp. 1–92
39. D. Hilbert, P. Bernays, Grundlagen der Mathematik, vol. II (Springer, Berlin)
40. S.C. Kleene, General recursive functions of natural numbers. Mathematische Annalen 112,
727–742 [Reprinted [12, 16], pp. 236–253]
41. S.C. Kleene, Recursive predicates and quantifiers. Trans. Am. Math. Soc. 53, 41–73
42. S.C. Kleene, Introduction to Metamathematics (Wolters-Noordhoff Publishing, Amsterdam)
43. S.C. Kleene, Origins of recursive function theory. Ann. Hist. Comput. 3, 52–67
44. A. Kolmogorov, V. Uspensky, On the definition of an algorithm. AMS Transl. 21(2), 217–245
45. C.I. Lewis, A Survey of Symbolic Logic (University of California Press, Berkeley)
46. A.A. Markov, On the impossibility of certain algorithms in the theory of associative systems.
Doklady Akademii Nauk S.S.S.R., n.s., 1951, 77, 19–20 (Russian); C. R. Acad. Sci. de
l’U.R.S.S., n.s., 55, 583–586 (English translation)
47. P. Martin-Löf, Notes on Constructive Mathematics (Almqvist & Wiksell, Stockholm)
48. E.L. Post, Introduction to a general theory of elementary propositions. Am. J. Math. 43, 163–
165 [Reprinted [80], pp. 265–283. Reprinted [55], pp. 21–43]
49. E.L. Post, Finite combinatory processes. Formulation I. J. Symb. Log. 1, 103–105 [Reprinted
[12, 16], pp. 289–291. Reprinted [55], pp. 103–105]
50. E.L. Post, Absolutely unsolvable problems and relatively undecidable propositions: account of
an anticipation, in [12], pp. 340–433, [16], pp. 340–406, [55], pp. 375–441
51. E.L. Post, Formal reductions of the general combinatorial decision problem. Am. J. Math. 65,
197–215 [Reprinted [55], pp. 442–460]
52. E.L. Post, Recursively enumerable sets of positive integers and their decision problems. Bull.
Am. Math. Soc. 50, 284–316 [Reprinted [12, 16], pp. 305–337; [55], pp. 461–494]
53. E.L. Post, A variant of a recursively unsolvable problem. Bull. Am. Math. Soc. 52, 264–268
[Reprinted [55], pp. 495–500]
54. E.L. Post, Recursive unsolvability of a problem of Thue. J. Symb. Log. 12, 1–11 [Reprinted
[12, 16], pp. 293–303; [55], pp. 503–513]
55. E.L. Post, Solvability, Provability, Definability: The Collected Works of Emil L. Post, ed. by D.
Martin (Birkhäuser, Basel)
56. P. Rosenbloom, The Elements of Mathematical Logic (Dover Publications, New York)
57. J.B. Rosser, Highlights of the history of the Lambda-Calculus. Ann. Hist. Comput. 6(4), 337–
349
58. W. Sieg, Mechanical procedures and mathematical experience, in Mathematics and Mind, ed.
by A. George (Oxford University Press, Oxford), pp. 71–117
59. W. Sieg, Step by recursive step: Church’s analysis of effective calculability. Bull. Symb. Log.
3, 154–180 [Reprinted (with a long Postscriptum) in Turing’s Legacy: Developments from
Turing’s Ideas in Logic, ed. by R. Downey. Lecture Notes in Logic (Cambridge University
Press), to appear in 2013]
60. W. Sieg, Calculations by man and machine: mathematical presentation, in In the Scope of
Logic, Methodology and Philosophy of Science. Volume one of the 11th International Congress
of Logic, Methodology and Philosophy of Science, Cracow, August 1999; P. Gärdenfors, J.
Wolenski, K. Kijania-Placek (eds.), Synthese Library, vol. 315 (Kluwer), pp. 247–262
61. W. Sieg, Only two letters: the correspondence between Herbrand and Gödel. Bull. Symb. Log.
11, 172–184 [Reprinted in K. Gödel - Essays for His Centennial, ed. by S. Feferman, C.
Parsons, S.G. Simpson. Lecture Notes in Logic (Cambridge University Press, 2010) pp. 61–73]
62. W. Sieg, On computability, in Philosophy of Mathematics, ed. by A. Irvine (Elsevier,
Amsterdam), pp. 535–630
63. W. Sieg, Axioms for computability: do they allow a proof of Church’s thesis? in A Computable
Universe: Understanding and Exploring Nature as Computation, ed. by H. Zenil, World
Scientific Publishing, Singapore, 2013, pp. 99–123
64. W. Sieg, Normal forms for puzzles: a variant of Turing’s thesis, in [8], pp. 332–339
65. W. Sieg, J. Byrnes, K-graph machines: generalizing Turing’s machines and arguments, in
Gödel ‘96, ed. by P. Hajek. Lecture Notes in Logic, vol. 6 (A.K. Peters, Natick), pp. 98–119
66. T. Skolem, Begründung der elementaren Arithmetik durch die rekurrierende Denkweise
ohne Anwendung scheinbarer Veränderlichen mit unendlichem Ausdehnungsbereich. Viden-
skapsselkapets sktifter, I. Matematisk-naturvidenskabelig klass, no. 6 [Translated in [80]
pp. 302–333]
67. R. Smullyan, Theory of Formal Systems. Annals of Mathematics Studies, vol. 47 (Princeton
University Press, Princeton)
68. R. Soare, Computability and recursion. Bull. Symb. Log. 2(3), 284–321
69. R. Soare, Interactive computing and relativized computability, in [11], pp. 203–260
70. A. Thue, Probleme über Veränderungen von Zeichenreihen nach gegebenen Regeln. Skrifter
utgit av Videnskapsselskapet i Kristiania, I. Matematisk-naturvidenskabelig klasse, no. 10
71. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. 42(2), 230–267 [Reprinted [12, 16] pp. 116–154. Reprinted [78] pp. 18–56.
Reprinted [9] pp. 58–90; 94–96. Reprinted [8] pp. 16–43]
72. A.M. Turing, Correction to [71]. Proc. Lond. Math. Soc. 43(2), 544–546
73. A.M. Turing, Systems of logic based on ordinals. Proc. Lond. Math. Soc. 45(2), 161–228
[Reprinted [12, 16], pp.154–222]
74. A.M. Turing, The word problem in semi-groups with cancellation. Ann. Math. 52, 491–505
[Reprinted [77], pp. 63–78. Reprinted [8], pp. 345–357]
75. A.M. Turing, Solvable and unsolvable problems. Sci. News 31, 7–23 [Reprinted [76], pp. 187–
203. Reprinted [9], pp. 582–595. Reprinted [8], pp. 322–331]
76. A.M. Turing, Mechanical Intelligence: Collected Works of A.M. Turing, ed. by D.C. Ince
(North-Holland, Amsterdam)
77. A.M. Turing, Pure Mathematics: Collected Works of A.M. Turing, ed. by J.L. Britton (North-
Holland, Amsterdam)
78. A.M. Turing, Mathematical Logic: Collected Works of A.M. Turing, ed. by R.O. Gandy, C.E.M.
Yates (North-Holland, Amsterdam)
79. A. Urquhart, Emil Post, in Handbook of the History of Logic, ed. by D.M. Gabbay, J. Woods.
Logic from Russell to Church, vol. 5 (Elsevier, Amsterdam), pp. 617–666
80. J. van Heijenoort (ed.), From Frege to Gödel: A Sourcebook in Mathematical Logic, 1879–1931
(Harvard, Cambridge, 1967)
81. A.N. Whitehead, B. Russell, Principia Mathematica, vol. 1 (Cambridge University Press,
Cambridge)
Algorithms: From Al-Khwarizmi to Turing
and Beyond
Wolfgang Thomas
1 Prologue
1 On the occasion of the Alan Turing Year 2012, a new presentation of Turing’s work, including
hitherto unpublished papers, is available in the volume [5].
W. Thomas
RWTH Aachen, Lehrstuhl Informatik 7, 52056 Aachen, Germany
e-mail: thomas@informatik.rwth-aachen.de
In the work of Turing and his contemporaries, the terms “procedure”, “finite
process”, and (as mostly used by Turing) “machine” occur more often than
“algorithm”. All these terms, however, point to the same idea: a process of symbolic
computation fixed by an unambiguous and finite description.
The word “algorithm” originates in the medieval “algorism” as a recipe to
perform calculations with numbers, originally just natural numbers. “Algorism”
goes back to one of the most brilliant scientists of Islamic culture, Al-Khwarizmi
(around 780–850), who worked in the “House of Wisdom” of the Caliph of Baghdad.3
In this academy, founded by the Caliph Harun Al-Rashid and brought to culmination
by his son Al-Mamun, scientists were employed for a wide spectrum of activities,
among them translations (e.g. from Greek and Persian to Arabic), construction of
scientific instruments, expeditions, and—quite important—advice to the Caliph. Al-
Khwarizmi must have been a leading member. His full name (adding together all
name ingredients we know of) was Muhammad Abu-Abdullah Abu-Jafar ibn Musa
Al-Khwarizmi Al-Majusi Al-Qutrubbulli. The attribute “Al-Khwarizmi” points to
the province of Choresmia, located in today’s Uzbekistan, where he probably
was born and grew up. He was sent by the Caliph to Egypt for an exploration of
the Giza pyramids, he undertook measurements (e.g., executing the experiment of
Eratosthenes to determine the diameter of the earth), and he wrote treatises.4
2 Among the more comprehensive sources we mention [6].
3 For an interesting account of the “House of Wisdom”, we recommend [11].
4 For a more detailed summary of Al-Khwarizmi’s life see, e.g., [27].
The most influential ones were his book on algebra (“Kitāb al-mukhtasar fi
hisab al-jabr wa’l-muqabala”) and his text “Computing with the Indian Numbers”
(“Kitāb al-Jamʿ wa-l-tafrīq bi-ḥisāb al-Hind”). We concentrate here on the latter,
in which he describes the execution of the basic operations of arithmetic (addition,
multiplication, and others) in the decimal number system. The Indian sources he
used are not known. Also the original text of Al-Khwarizmi seems to be lost. We
have translations into Latin, for example the famous manuscript of the thirteenth
century kept at the library of the University of Cambridge (England).5 This text,
however, is a bad example of scientific literature: Citations and comments are mixed
into a conglomerate, and also many places where the decimal ciphers should appear
remain empty. Probably the monk who wrote this text was eager to put everything
into a solid theological context, and he was not comfortable with writing down these
strange decimal symbols. Thus one has to guess at several places how the missing
example computations that should clarify the textual descriptions would look.
It is amusing to read the phrase “but now let us return to the book”, indicating that
the author comes back to Al-Khwarizmi. And thus, many paragraphs start with the
repetitive phrase “Dixit Algorizmi”—which motivated the term “algorism” for the
procedures described in this work.
It is noteworthy that this concept of “algorithm” clearly refers to a process of
symbol manipulation, in contrast to calculations performed on the abacus. The
arrangement of pieces on the abacus also reflects the decimal system, but the
computation process there is not symbolic in the proper sense of the word.
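To illustrate what “symbolic” means here, the following minimal Python sketch (our illustration, not Al-Khwarizmi's procedure verbatim) performs digit-by-digit addition purely as manipulation of symbols: the only arithmetical knowledge is a finite lookup table, as a scribe following the algorism would use.

```python
# Digit-by-digit addition as pure symbol manipulation: digits are characters,
# and all "knowledge" is a finite table for adding two digits and a carry.

DIGITS = "0123456789"
ADD = {(a, b, c): (DIGITS[(i + j + c) % 10], (i + j + c) // 10)
       for i, a in enumerate(DIGITS)
       for j, b in enumerate(DIGITS)
       for c in (0, 1)}

def add(x, y):
    n = max(len(x), len(y))
    x, y = x.rjust(n, "0"), y.rjust(n, "0")
    result, carry = "", 0
    for a, b in zip(reversed(x), reversed(y)):   # rightmost digits first
        digit, carry = ADD[(a, b, carry)]
        result = digit + result
    return ("1" + result) if carry else result

print(add("478", "356"))  # -> 834
```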
A new dimension to symbolic computation was added by Gottfried Wilhelm
Leibniz (1646–1716). Extending ideas of precursors (among them Ramon Llull and
Anastasius Kircher), he developed the vision of calculating truths (true statements)
and not just numerical values. This vision was partly motivated by the fierce
theological disputes of his time, a phenomenon which was not just academic but
penetrated politics. Leibniz was born at the very end of the Thirty Years’ War, which had
devastated Germany and which was partly rooted in theological conflicts between
Catholics and Protestants. Leibniz dreamed of a universal calculus that would help
philosophers in their disputes by just following the call “Calculemus!” He hints at
his concept of a “characteristica universalis” in a letter to Duke Johann Friedrich of
Braunschweig-Lüneburg6:
In philosophy I found some means to do what Descartes and others did via Algebra and
Analysis in Arithmetic and Geometry, in all sciences by a combinatorial calculus [“per
5 A full presentation in facsimile with transcription to Latin is given in [26]; a translation to English
in [2].
6 Leibniz wrote this letter [13] of 1671 to a duke and not to a colleague; hence he used German
rather than Latin, with some Latin words inserted: “In Philosophia habe ich ein Mittel funden,
dasjenige was Cartesius und andere per Algebram et Analysin in Arithmetica et Geometria gethan,
in allen scientien zuwege zu bringen per Artem Combinatoriam [ : : : ]. Dadurch alle Notiones
compositae der ganzen welt in wenig simplices als deren Alphabet reduciret, und aus solches
alphabets combination wiederumb alle dinge, samt ihren theorematibus, und was nur von ihnen
zu inventiren müglich, ordinata methodo, mit der zeit zu finden, ein weg gebahnet wird.”
Artem Combinatoriam”] [ : : : ]. By this, all composed notions of the world are reduced to
few simple parts as their Alphabet, and from the combination of such alphabet [letters] a
way is opened to find again all things, including their truths [“theorematibus”], and whatever
can be found about them, with a systematic method in due time.
Leibniz undertook only small steps in this huge project, but in a methodological
sense he was very clear about the task. As he suggests, logic should be applied by
procedures of “alphabet’s combination”, i.e., symbolic computation. And he was
very definite about his proposal to join the algorithmic procedures known from
arithmetic with logic. This idea of “arithmetization of logic” (which later Hilbert
pursued in his program to show the consistency of mathematics) is raised in two
ways:
In his paper “Non inelegans specimen demonstrandi in abstractis” of 1685 [15]
(“A not inelegant example of abstract proof method”), he develops the rudiments
of Boolean algebra, using equations such as “A + A = A” with “+” as a sign for
union. As an example, let us state the last theorem (XIII) of his note:
Si coincidentibus addendo alia fiant coincidentia, addita sunt inter se communicantia
i.e.,
If from two equal entities we get, by adjoining something, other but again equal entities,
then among the added parts there must be something in common
The second way appears in Leibniz’s proposal to assign numbers to concepts: prime
numbers to the elementary ones and products to those composed from them; in his
example7, “animal” receives the number 2 and “rational” the number 3, so that “man”
(“rational animal”) receives the number 2 · 3 = 6. We see very clearly the idea to
represent elementary concepts by prime numbers and their conjunction by products
of prime numbers, which allows one to recover the factors. This prepares the idea
of Gödel numbering that entered the stage again 250
7 “Verbi gratia quia Homo est Animal rationale (et quia Aurum est metallum ponderosissimum)
hinc si sit Animalis (metalii) numerus a ut 2 (m ut 3) Rationalis (ponderosissimi) vero numerus r
ut 3 (p ut 5) erit numerus hominis seu h idem quot ar id est in hoc exemplo 2,3 seu 6 (et numerus
auri solis s idem quot mp id est in hoc exemplo 3,5 seu 15.”
years later—using number theoretic facts to code complex objects (like statements
or proofs) by numbers—in a way that allows unique decomposition.
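A minimal Python sketch of this device (our illustration): a finite sequence of positive integers is coded as a product of prime powers, and the unique factorization of the code recovers the sequence.

```python
# Goedel-style coding in Leibniz's spirit: a sequence (a1, ..., ak) of
# positive integers is coded as p1**a1 * p2**a2 * ... * pk**ak, where
# p1, p2, ... are the primes 2, 3, 5, ...; unique factorization decodes it.

def is_prime(n):
    return n >= 2 and all(n % d for d in range(2, n))

def encode(seq):
    code, p = 1, 2
    for a in seq:                 # a must be positive for decoding to work
        while not is_prime(p):
            p += 1
        code *= p ** a
        p += 1
    return code

def decode(code):
    seq, p = [], 2
    while code > 1:
        if is_prime(p):
            a = 0
            while code % p == 0:
                code //= p
                a += 1
            seq.append(a)
        p += 1
    return seq

c = encode([3, 1, 4])             # 2**3 * 3**1 * 5**4 = 15000
print(c, decode(c))               # 15000 [3, 1, 4]
```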
It is somewhat breathtaking to see how optimistic Leibniz was about the
realization of his ideas. In a note8 of 1677 he writes:

When this language is introduced sometime by the missionaries, then the true religion,
which is united best with rationality, will be firmly founded, and one need not fear that
man will renounce it in the future, just as one need not fear a renunciation of arithmetic
and geometry.
This idea of a rational theory of ethics was shared by many of Leibniz’s contem-
poraries. As examples we just mention Spinoza’s treatise “Ethica. Ordine geometrio
demonstrata” (1677) and the dissertation “Philosophia practica universalis, methodo
mathematica conscripta” (1703) of Leibniz’s student Christian Wolff.
But more than his colleagues, Leibniz formulated rather bold promises—in a
very similar way as we do today when we apply for project money9:

I think that some selected people can do the job within five years, and that already after two
years they will reach a stage where the theories needed most urgently for life, i.e., morals
and metaphysics, are manageable by an infallible calculus.
Leibniz’s dream in its full generality remained (and remains) unrealized. Surpris-
ingly, however, it was materialized in the domain of mathematics. This process
started with George Boole who developed the vague sketch of Leibniz into a proper
theory: “Boolean algebra”. The breakthrough in devising a universal scientific cal-
culus was then achieved by Gottlob Frege. His “Begriffsschrift” (1879) introduces
a formal language in which mathematical statements can be expressed, the essential
innovation being a clarification of the role of quantifiers and quantification. His own
work on the foundations of arithmetic, and in particular the subsequent enormous
effort undertaken by Russell and Whitehead in their “Principia Mathematica”,
opened a way to capture mathematics in a formal system.
But in this development the objectives connected with formalization shifted
dramatically. The objective was no longer the Leibnizian approach to compute truths
needed in life (or just in mathematics) but one of a more methodological nature. The shift
occurred with the discovery of contradictions (“paradoxes”) in Frege’s system. The
most prominent problem was the contradiction found independently by Russell and
8 From [14]: “Nam ubi semel a Missionariis haec lingua introduci poterit, religio vera quae maxime
rationi consentanea est, stabilita erit et non magis in posterum metuenda erit Apostasia, quam ne
homines Arithmeticam et Geometriam, quam semel didicere, mox damnent.”
9 From [14]: “Aliquot selectos homines rem intra quinquennium absolvere posse puto; intra bien-
nium autem doctrinas, magis in vita frequentatas, id est Moralem et Metaphysicam, irrefragabili
calculo exhibebunt.”
Zermelo inherent in the concept of a “set of those sets that do not contain themselves
as elements”. The formalization of mathematics was now pursued as a way to
show its consistency. As Hilbert formulated in his program on the foundations of
mathematics, the task was to analyze the combinatorial processes in formal proofs
and by such an analysis arrive at the result that the arithmetical equation 0 = 1,
for example, cannot be derived. In pursuing this program, the key issues were
axiomatizations of theories, the consistency of theories, and the soundness and
completeness of proof calculi.
The fundamental results of Gödel (completeness of the first-order proof calculus
and incompleteness of any axiomatic system of arithmetic) made it clear that there
was hope to fulfill Hilbert’s program only in a fragmentary way. An essential
ingredient in Gödel’s approach was the arithmetization of logic (today called
“Gödelization”), transforming Leibniz’s hint mentioned above into a powerful
method.
However, a positive result of the foundational research of the early twentieth
century was that the “atomic ingredients” of mathematical proofs, as condensed
in the rules of the proof calculus of first-order logic, were established. Together
with the axiomatization of set theory, a framework emerged in which most of
mathematics could be formally simulated. This framework clarified to a large extent
which kind of symbolic manipulations are necessary to do logic algorithmically—as
Al-Khwarizmi had explained centuries before for numeric calculations. Most
remarkably, it was an algorithmic problem in the domain of logic (and not in the
domain of arithmetic) which motivated a general analysis of computability and
hence of “algorithm”.
This was “Hilbert’s Entscheidungsproblem”, as formulated in the monograph
“Einführung in die theoretische Logik” by Hilbert and Ackermann (1928).
Das Entscheidungsproblem ist gelöst, wenn man ein Verfahren kennt, das bei einem
vorgelegten logischen Ausdruck durch endlich viele Operationen die Entscheidung über
die Allgemeingültigkeit bzw. Erfüllbarkeit erlaubt.

(The Entscheidungsproblem is solved when one knows a procedure that, for a given
logical expression, permits the decision about its validity, respectively its satisfiability,
by finitely many operations.)
Only today, in computer science, both traditions of “formal logic” are merged
again: In computer science, formal systems are set up in many ways, for example
as programming languages or as query languages for data bases, and in this design
questions of soundness, completeness, and complexity have to be addressed. But we
see at the same time the application of these formal systems of data processing to
solve concrete problems in virtually all sciences and domains of human life, very
much in the spirit of Leibniz.
4 Turing’s Breakthrough
A second remarkable aspect in Turing’s paper is the fact that after presenting
his model of the Turing machine, he immediately exhibits a problem that is not
solvable with this model. For this, he develops the idea of a universal machine,
enters the technicalities of actually constructing one (and, as an aside, introduces
the programming technique today called “macros” for this purpose), and then
applies a diagonalization argument. This appearance of a powerful model connected
immediately with a corresponding unsolvability result should be compared with the
centuries that elapsed between the clear understanding of algebraic expressions (in
Vieta’s time) and the proof of Abel that for polynomials of degree 5 one cannot in
general find solutions in this format.
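The diagonalization step can be conveyed in a few lines of Python (a sketch of the standard argument, not Turing's original construction; `halts` is a hypothetical decider assumed only for the sake of contradiction):

```python
# Suppose, for contradiction, we had a total function deciding halting:
#   halts(src, inp) == True  iff  the program text src, run on input inp, halts.
def halts(src, inp):
    raise NotImplementedError  # hypothetical; no such total decider exists

# Diagonal program: run on its own text, it does the opposite of what
# halts predicts.
def diagonal(src):
    if halts(src, src):
        while True:      # halts says "halts" -> loop forever
            pass
    return               # halts says "loops" -> halt at once

# Feeding diagonal its own source text yields a contradiction either way,
# so no such halts can exist.
```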
In fact, the mere possibility to envisage algorithmically unsolvable problems
emerged only at a rather late stage. In 1900, in the formulation of Hilbert’s 10th
problem
Eine diophantische Gleichung mit irgendwelchen Unbekannten und mit ganzen rationalen
Zahlenkoeffizienten sei vorgelegt: Man soll ein Verfahren angeben, nach welchem sich
mittels einer endlichen Anzahl von Operationen entscheiden lässt, ob die Gleichung in
ganzen rationalen Zahlen lösbar ist.

(Given a Diophantine equation with any number of unknowns and with rational integer
coefficients: one shall devise a procedure by which it can be decided, by a finite number
of operations, whether the equation is solvable in rational integers.)
One just finds the task to develop a “procedure” (“Verfahren”). The earliest place
in mathematical literature where the certainty about algorithmic solutions is put into
doubt seems to be a most remarkable paper by Axel Thue of 1910 (“Die Lösung
eines Spezialfalles eines allgemeinen logischen Problems” [23]). He formulates the
fundamental problem of term rewriting: Given two terms s, t and a set of axioms as
equations between terms, decide whether from s one can obtain t by a finite number
of applications of the axioms. He resorts to a special case in order to provide a partial
solution. About the general case one finds the following prophetic remark:
A solution of this problem in the most general case might perhaps be connected with
unsurmountable difficulties.10
It is a pity that this brilliant paper remained unnoticed for decades; one reason
for this is perhaps its completely uninformative title. (A detailed discussion is given
in [21].)
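Thue's problem is easy to state in code. Here is a minimal Python sketch (our formulation, for strings rather than general terms) that searches breadth-first for a derivation of t from s; the search semi-decides reachability, and that it cannot in general be turned into a decision procedure is precisely the content of the later unsolvability results.

```python
from collections import deque

# Semi-decision procedure for Thue-style rewriting over strings:
# rules are pairs (l, r) that may be applied in both directions,
# anywhere in the string.

def derivable(s, t, rules, max_strings=10_000):
    seen, queue = {s}, deque([s])
    while queue:
        if len(seen) > max_strings:
            return None                         # gave up within the bound
        u = queue.popleft()
        if u == t:
            return True
        for l, r in rules:
            for a, b in ((l, r), (r, l)):       # both directions
                i = u.find(a)
                while i != -1:
                    v = u[:i] + b + u[i + len(a):]  # rewrite one occurrence
                    if v not in seen:
                        seen.add(v)
                        queue.append(v)
                    i = u.find(a, i + 1)
    return False                                # reachable set exhausted

print(derivable("abab", "bbaa", [("ab", "ba")]))  # True
```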
The work of Turing and his contemporaries Church, Kleene, Post finished a
struggle of many centuries for an understanding of “algorithm” and its horizon of
applicability, termed “computability”. This success was possible by a merge of two
traditions in symbolic computation: arithmetic and logic. The impression that an
unquestionable final point was reached with this work was underlined by Gödel,
who stated in 1946, 10 years after Turing’s breakthrough [7]:
Tarski has stressed [ : : : ] (and I think justly) the great importance of the concept of general
recursiveness (or Turing’s computability). It seems to me that this importance is largely due
to the fact that with this concept one has for the first time succeeded in giving an absolute
definition of an interesting epistemological notion, i.e., one not depending on the formalism
chosen. [ : : : ] By a kind of miracle it is not necessary to distinguish orders.
The year 1936 not only marks a point of final achievement but is at the same time
the initialization of a new and rapidly developing discipline: computer science (or
informatics, as the discipline is called in many European languages).
10 “Eine Lösung dieser Aufgabe im allgemeinsten Falle dürfte vielleicht mit unüberwindlichen
Schwierigkeiten verbunden sein.”
Turing’s analysis (as well as the parallel work of Post) refers to procedures
that work on finite words composed from symbols taken from a finite alphabet.
As noted by Turing, the algorithms of basic arithmetic can be treated in this
framework. However, a standard first-year course in computer science, usually titled
“Algorithms and data structures”, already shows several chapters that go beyond
this domain. In particular, we deal there with algorithms over trees and over graphs
rather than words. In this generalized setting, some features of algorithms arise that
are hidden when we work over words. For example, over graphs we observe the lack
of a natural ordering (e.g. for the set of vertices). This lack of order allows one to say
“Pick an element of the set V of vertices …” without the requirement (and indeed,
without the possibility) of fixing a particular element. Over words, the situation is
different: Picking a letter in a word always implies the possibility to pick a particular
(e.g., the first) letter. As Gurevich and others remarked, the Turing model working
with the substrate of words on a tape does not allow us to deal with algorithms on
the adequate level of abstraction. The machinery of coding (by words) that enables
us to make a bridge to Turing’s model spoils this adequacy. To settle this problem,
a generalized view on algorithms was developed in the model of “abstract state
machine” [8]. It has the flexibility that is needed to answer the challenge of very
11 Cited from [10, p. 117].
The deeply troubling perspective of programmed robots designed for military combat
(on the ground and in the air) was discussed with the subtitle “The moral of algorithms”
in a leading German newspaper.12 In connection with the comprehensive analysis
of data on the web (covering millions of persons) by government agencies, the term
“the tyranny of algorithms” was used in an article of the same newspaper,13 and,
finally, the controversy in this discussion was condensed by “Spiegel Online” into
the remarkable headline “Freedom against algorithms”.14
While it is clear to the experts that current implementations of computer systems
ultimately rely on small computation steps in microprocessors and are thus in
principle reducible to Turing machine computations, we see that in the public
discussion the actual understanding of “algorithms” drastically exceeds the content
of the Turing model—it is today located on a much more general level.
12. F. Rieger, Das Gesicht unserer Gegner von morgen, Frankfurter Allgemeine Zeitung, 20th Sept. 2012.
13. G. Baum, Wacht auf, es geht um die Menschenwürde, Frankfurter Allgemeine Zeitung, 16th June 2013.
14. “Freiheit gegen Algorithmen”, Spiegel Online, 21st June 2013.
15. This aspect, with a focus on the role of algorithmic game theory, is developed at length by F. Schirrmacher, a leading German journalist, in [20], a bestseller on the German book market.
References
1. L. Blum, M. Shub, S. Smale, On a theory of computation and complexity over the real numbers:
NP-completeness, recursive functions and universal machines. Bull. Am. Math. Soc. 21, 1–46
(1989)
2. J.N. Crossley, A.S. Henry, Thus spake al-Khwārizmī: a translation of the text of Cambridge
University Library Ms. Ii.vi.5. Hist. Math. 17, 103–131 (1990)
3. A. Church, A note on the Entscheidungsproblem. J. Symb. Log. 1, 40–41 (1936)
4. A. Church, Application of recursive arithmetic to the problem of circuit synthesis, in Summaries
of the Summer Institute of Symbolic Logic, vol. I (Cornell University, Ithaca, 1957), pp. 3–50
5. S.B. Cooper, J. van Leeuwen (eds.), Alan Turing: His Work and Impact (Elsevier, Amsterdam,
2013)
6. M. Davis, The Universal Computer – The Road from Leibniz to Turing. Turing Centennial
Edition (CRC Press, Boca Raton, 2012)
7. K. Gödel, Remarks before the Princeton bicentennial conference on problems in mathematics,
in Kurt Gödel, Collected Works, ed. by S. Feferman et al., vol. II (Oxford University Press,
Oxford, 1990), pp. 150–153
8. Y. Gurevich, Sequential abstract-state machines capture sequential algorithms. ACM Trans.
Comput. Log. 1, 77–111 (2000)
9. H. Herring (ed.), G.W. Leibniz Schriften zur Logik und zur philosophischen Grundlegung von
Mathematik und Naturwissenschaft (lat. u. deutsch) (Suhrkamp, Frankfurt, 1996)
10. A. Hodges, Alan Turing: The Enigma (Vintage, London, 1992)
11. J. Al-Khalili, The House of Wisdom: How Arabic Science Saved Ancient Knowledge and Gave
Us the Renaissance (Penguin Press, New York, 2011)
12. S.C. Kleene, Representation of events in nerve nets and finite automata, in Automata Studies,
ed. by C.E. Shannon, J. McCarthy (Princeton University Press, Princeton, 1956), pp. 3–41
13. G.W. Leibniz, Brief an Herzog Johann Friedrich von Braunschweig-Lüneburg (Okt. 1671),
in Philosophische Schriften von Gottfried Wilhelm Leibniz, ed. by C.I. Gerhardt, vol. 1
(Weidmannsche Buchhandlung, Berlin, 1875), pp. 57–58
14. G.W. Leibniz, Anfangsgründe einer allgemeinen Charakteristik, in [9], pp. 39–57
15. G.W. Leibniz, Ein nicht unelegantes Beispiel abstrakter Beweisführung, in [9], pp. 153–177
16. G.W. Leibniz, Elemente eines Kalküls, in [9], pp. 67–91
17. E.L. Post, Finite combinatory processes – formulation 1. J. Symb. Log. 1, 103–105 (1936)
18. E.L. Post, Recursively enumerable sets of positive integers and their decision problems. Bull.
Am. Math. Soc. 50, 284–316 (1944)
19. E.L. Post, A variant of a recursively unsolvable problem. Bull. Am. Math. Soc. 52, 264–268
(1946)
20. F. Schirrmacher, EGO: Das Spiel des Lebens (Karl Blessing-Verlag, München, 2013)
21. M. Steinby, W. Thomas, Trees and term rewriting in 1910: on a paper by Axel Thue. Bull. Eur.
Assoc. Theor. Comput. Sci. 72, 256–269 (2000)
22. W. Thomas, Infinite games and verification, in Proceedings of International Conference on
Computer Aided Verification CAV’02. Lecture Notes in Computer Science, vol. 2404 (Springer,
Berlin, Heidelberg, New York, 2002), pp. 58–64
23. A. Thue, Über die Lösung eines Spezialfalls eines allgemeinen logischen Problems. Kristiania
Videnskabs-Selskabets Skrifter. I. Mat. Nat. Kl. 1910, No. 8
24. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. 42, 230–265 (1936)
25. A.M. Turing, Computing machinery and intelligence. Mind 59, 433–460 (1950)
26. K. Vogel, Mohammed ibn Musa Alchwarizmi’s Algorismus. Das früheste Lehrbuch zum
Rechnen mit indischen Ziffern (Zeller, Aalen, 1963)
27. H. Zemanek, Dixit algorizmi: his background, his personality, his work, and his influence,
in Algorithms in Modern Mathematics and Computer Science, ed. by A. Ershov, D. Knuth.
Proceedings, Urgench, Uzbek SSR, 16–22 September 1979. Springer Lecture Notes in
Computer Science, vol. 122 (Springer, Berlin, 1981), pp. 1–81
The Stored-Program Universal Computer: Did Zuse Anticipate Turing and von Neumann?
B.J. Copeland and G. Sommaruga
Abstract This chapter sets out the early history of the stored-program concept.
The several distinct ‘onion skins’ making up the concept emerged slowly over a
ten-year period, giving rise to a number of different programming paradigms. A
notation is developed for describing different aspects of the stored-program concept.
Theoretical contributions by Turing, Zuse, Eckert, Mauchly, and von Neumann are
analysed, followed by a comparative study of the first practical implementations
of stored-programming, at the Aberdeen Ballistic Research Laboratory in the US
and the University of Manchester in the UK. Turing’s concept of universality is also
examined, and an assessment is provided of claims that various historic computers—
including Babbage’s Analytical Engine, Flowers’ Colossus and Zuse’s Z3—were
universal. The chapter begins with a discussion of the work of the great German
pioneer of computing, Konrad Zuse.
1 Introduction
To Konrad Zuse belongs the honour of having built the first working program-
controlled general-purpose digital computer. This machine, later called Z3, was
functioning in 1941.1 Zuse was also the first to hire out a computer on a commercial
basis: as Sect. 2 explains, Zuse’s Z4 was rented by the Swiss Federal Institute of
Technology (ETH Zurich) for five years, and provided the first scientific computing
service in Continental Europe.
Neither Z3 nor Z4 was an electronic computer. These machines were splendid
examples of pre-electronic relay-based computing hardware. Electromechanical
relays were used by a number of other early pioneers of computing, for example
Howard Aiken and George Stibitz in the United States, and Alan Turing at
Bletchley Park in the United Kingdom. Bletchley’s relay-based ‘Bombe’ was a
parallel, special-purpose electromechanical computing machine for codebreaking
(though some later-generation Bombes were electronic).2 Aiken’s giant relay-
based Automatic Sequence Controlled Calculator, built by IBM in New York and
subsequently installed at Harvard University (known variously as the IBM ASCC
and the Harvard Mark I) had much in common with the earlier Z3.
From an engineering point of view, the chief differences between the electromagnetic
relay and electronic components such as vacuum tubes stem from the fact
that, while the vacuum tube contains no moving parts save a beam of electrons,
the relay contains mechanical components that move under the control of an
electromagnet and a spring, in order to make and break an electrical circuit.
1. Zuse, K. ‘Some Remarks on the History of Computing in Germany’, in Metropolis, N., Howlett, J., Rota, G. C. (eds) A History of Computing in the Twentieth Century (New York: Academic Press, 1980).
2. For additional information about the Bombe, including Gordon Welchman’s contributions, and the earlier Polish Bomba, see Copeland, B. J., Valentine, J., Caughey, C. ‘Bombes’, in Copeland, B. J., Bowen, J., Sprevak, M., Wilson, R., et al., The Turing Guide (Oxford University Press), forthcoming in 2016.
Vacuum
tubes achieve very much faster digital switching rates than relays can manage.
Tubes are also inherently more reliable, since relays are prone to mechanical
wear (although tubes are more fragile). A small-scale electronic digital computer,
containing approximately 300 tubes, was constructed in Iowa during 1939–42 by
John V. Atanasoff, though Atanasoff’s machine never functioned satisfactorily. The
first large-scale electronic digital computer, Colossus, containing about 1600 tubes,
was designed and built by British engineer Thomas H. Flowers during 1943, and was
installed at Bletchley Park in January 1944, where it operated 24/7 from February
of that year.3
Zuse’s Z3 and Z4, like Aiken’s ASCC, and other relay-based computers built
just prior to, or just after, the revolutionary developments in digital electronics
that made the first electronic computers possible, were a final luxuriant flowering
of this soon-to-be-outdated computing technology (though for purposes other than
computing, relays remained in widespread use for several more decades, e.g. in
telephone exchanges and totalisators). Outmatched by the first-generation electronic
machines, Zuse’s computer in Zurich and Aiken’s at Harvard nevertheless provided
sterling service until well into the 1950s. While relay-based computers were
slower than their electronic rivals, the technology still offered superhuman speed.
Electromechanical computers carried out in minutes or hours calculations that
would take human clerks weeks or months.
It was not just the absence of digital electronics that made Z3, Z4 and ASCC
pre-modern rather than modern computers. None incorporated the stored-program
concept, widely regarded as the sine qua non of the modern computer. Instructions
were fed into the ASCC on punched tape. This programming method echoed
Charles Babbage’s nineteenth-century scheme for programming his Analytical
Engine, where instructions were to be fed into the Engine on punched cards
connected together with ribbon so as to form a continuous strip—a system that
Babbage had based on the punched-card control of the Jacquard weaving loom. If
the calculations that the ASCC was carrying out required the repetition of a block of
instructions, this was clumsily achieved by feeding the same instructions repeatedly
through the ASCC’s tape-reader, either by punching multiple copies of the relevant
block of instructions onto the tape or, if the calculation permitted it, by gluing
the tape ends together to form a loop.4 Zuse’s Z3 and Z4 also had punched tape
programming (Zuse preferred cine film to paper).5 In a stored-program computer,
on the other hand, the same instructions can be selected repeatedly and fed from
memory. [...]
3. Copeland, B. J. et al. Colossus: The Secrets of Bletchley Park’s Codebreaking Computers (Oxford: Oxford University Press, 2006, 2010).
4. Campbell, R. V. D. ‘Aiken’s First Machine: The IBM ASCC/Harvard Mark I’, in Cohen, I. B., Welch, G. W. (eds) Makin’ Numbers: Howard Aiken and the Computer (Cambridge, Mass.: MIT Press, 1999), pp. 50–51; Bloch, R. ‘Programming Mark I’, in Cohen and Welch, Makin’ Numbers, p. 89.
5. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 615; see also the photograph of a segment of Zuse’s cine film tape in the Konrad Zuse Internet Archive, http://zuse.zib.de/
Although Z3 and Z4 (like their predecessors Z1 and Z2) used punched-tape program
control, there have always been rumours in the secondary literature that Zuse
independently invented the stored-program concept, perhaps even prior to Turing’s
classic 1936 exposition of the concept and the extensive further development of
it by both Turing and John von Neumann in 1945. Nicolas Jequier, for example,
described Z3 as the first computer to have an ‘[i]nternally stored program’.6 Jürgen
Schmidhuber recently wrote in Science:
By 1941, Zuse had physically built the first working universal digital machine, years ahead
of anybody else. Thus, unlike Turing, he not only had a theoretical model but actual working
hardware.7
Computer historians Brian Carpenter and Robert Doran are more cautious, saying
only that
The stored program concept—that a computer could contain its program in its own
memory—derived ultimately from Turing’s paper On Computable Numbers, and Konrad
Zuse also developed it in Germany, in the form of his Plankalkül language, without having
read On Computable Numbers.9
9. Carpenter, B. E., Doran, R. W. ‘Turing’s Zeitgeist’, in Copeland, Bowen, Sprevak, Wilson et al., The Turing Guide.
10. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 621; Zuse interviewed by Christopher Evans in 1975 (‘The Pioneers of Computing: An Oral History of Computing’, London: Science Museum; © Board of Trustees of the Science Museum).
11. Konrad Zuse interviewed by Uta Merzbach in 1968 (Computer Oral History Collection, Archives Centre, National Museum of American History, Washington D.C.).
12. Zuse, K. Patent Application Z23139, ‘Verfahren zur selbsttätigen Durchführung von Rechnungen mit Hilfe von Rechenmaschinen’ [Procedure for the automatic execution of calculations with the aid of calculating machines], 9 April 1936, Deutsches Museum Archiv, document reference NL 207/00659; Zuse, K. Patent Application Z23624, ‘Rechenmaschine’ [Calculating machine], 21 December 1936, Deutsches Museum Archiv, NL 207/0991; Zuse, K. Patent Application Z391, ‘Rechenvorrichtung’ [Calculating device], 1941, in the Konrad Zuse Internet Archive, http://zuse.
The fact that Zuse wrote in German has always presented an obstacle to the
dissemination of his achievements among Anglophone historians. Indeed, much
of Zuse’s unpublished work is hand written, in a form of old-fashioned German
shorthand. We present English translations of key passages from Zuse’s documents
(so far as we know, for the first time).
Our conclusion will be that the truth lies somewhere between Schmidhuber’s
statements and the more cautious statement by Carpenter and Doran. Their cautious
statement is true, but there is more to be said. We cannot, however, endorse
Schmidhuber’s or Jequier’s claims.
The structure of this chapter is as follows. After a short overview of Zuse’s life
and work in Sect. 2, based largely on Zuse’s own accounts of events in tape-recorded
interviews given in 1968 and 1975, Sect. 3 goes on to provide a comparative account
of Turing’s and von Neumann’s contributions to the stored-program concept. Both
men made fundamental and far-reaching contributions to the development of this
keystone concept. In the voluminous secondary literature, however, von Neumann’s
contributions are generally exaggerated relative to Turing’s, even to the extent
that many accounts describe von Neumann as the inventor of the stored-program
concept, failing altogether to mention Turing. Section 3 explains why this von
Neumann-centric view is at odds with the historical record, and describes in detail
the respective contributions made by Turing and von Neumann to the development
of the concept during the key decade 1936–1946. Section 3 also discusses aspects
of the work of the many others who contributed, in one way or another, to the
development of this concept, including Eckert, Mauchly, Clippinger, Williams, and
Kilburn.
Section 4 offers a fresh look at the stored-program concept itself. Six programming
paradigms that existed side by side during the decade 1936–1946 are
distinguished: these are termed P1–P6. P3–P6 form four ‘onion skins’ of the
stored-program concept and are of special interest. Equipped with this logical analysis, and
also with the historical contextualization provided in Sect. 3, Sects. 5 and 6 turn to
a detailed examination of unpublished work by Zuse, from the period 1936–1941.
Finally, Sect. 7 summarizes our conclusions concerning the multifaceted origins of
the stored-program concept.
‘It was a foregone conclusion for me, even in childhood, that I was to become an
engineer’, Zuse said.13 Born in Berlin on 22 June 1910, he grew up in East Prussia
and then Silesia (now lying mostly within the borders of Poland).14 His father was a
civil servant in the German Post Office. Young Konrad initially studied mechanical
engineering at the Technical University in Berlin-Charlottenburg, but switched to
architecture and then again to civil engineering.15 Graduating from the Technical
University in 1935, with a diploma in civil engineering, he obtained a job as a
structural engineer at Henschel-Flugzeugwerke AG (Henschel Aircraft Company)
in Schönefeld, near Berlin.
A determined young man with a clear vision of his future, Zuse left Henschel
after about a year, in order to pursue his ambition of building an automatic digital
binary calculating machine.16 As a student, Zuse had become painfully aware that
engineers must perform what he called ‘big and awful calculations’.17 ‘That is really
not right for a man’, he said.18 ‘It’s beneath a man. That should be accomplished
with machines.’ He started to rough out designs for a calculating machine in 1934,
while still a student, and with his departure from Henschel set up a workshop in the
living room of his parents’ Berlin apartment.19 There Zuse began constructing his
first calculator, in 1936.20 His ‘parents at first were not very delighted’, Zuse said
drily.21 Nevertheless they and his sister helped finance the project, and Kurt Pannke,
the proprietor of a business manufacturing analogue calculating machines, helped
out as well with small amounts of money.22 Some of Zuse’s student friends chipped
in, too, and they also contributed manpower—half a dozen or more pairs of hands
assisted with the construction of Zuse’s first machine.23
13. Zuse interviewed by Merzbach.
14. Zuse interviewed by Merzbach.
15. Zuse, K. Der Computer – Mein Lebenswerk [The computer—my life’s work] (Berlin: Springer, 4th edn, 2007), p. 13.
16. Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 612.
17. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 611.
18. Zuse interviewed by Merzbach.
19. Zuse interviewed by Merzbach; Zuse interviewed by Evans; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 612.
20. Zuse, ‘Some Remarks on the History of Computing in Germany’, pp. 612–613.
21. Zuse interviewed by Evans.
22. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
23. Zuse interviewed by Merzbach.
Zuse’s Z1 computer, in a Berlin apartment belonging to Zuse’s parents. Credit: ETH Zurich
Later named Z1, his first calculator was completed in 1938, but never worked
properly.24 Z1 was purely mechanical.25 Zuse said that its storage unit functioned
successfully; the calculating unit could, he recollected, multiply binary numbers
and do floating-point arithmetic, but it was prone to errors.26 A
significant problem was that the punched-tape program control was defective and
Z1’s various units never functioned together as a whole.27
Zuse believed initially that a mechanical calculator would be more compact
than a relay-based machine.28 Nevertheless, he set out detailed ideas concerning
an electromechanical computer as early as 1936, as Sect. 6 describes. By 1938, the
difficulties with Z1’s calculating unit had convinced him of the need to follow an
electromechanical path, and he built Z2, a transitional machine.29 While the storage
unit remained mechanical, the calculating unit was constructed from relays.30
According to his son, Horst, Zuse used 800 telephone relays in the calculating unit.31
24. Zuse interviewed by Merzbach. See also Rojas, R. ‘Konrad Zuse’s Legacy: The Architecture of the Z1 and Z3’, IEEE Annals of the History of Computing, vol. 19 (1997), pp. 5–16.
25. Zuse, ‘Some Remarks on the History of Computing in Germany’, pp. 613, 615.
26. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
27. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
28. Zuse interviewed by Evans.
29. Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 613.
30. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 615.
31. Zuse, H. ‘Konrad Zuse Biographie’, www.horst-zuse.homepage.t-online.de/kz-bio.html, p. 1.
Z2 was completed in 1939, the same year that Aiken and IBM produced the first
detailed circuit drawings for the ASCC.32 Z2 was binary, controlled by punched
tape, and offered fixed-point arithmetic. Little more than an experiment in relay
technology, the tiny computer had only 16 binary digits of storage.33 It ‘didn’t work
properly’, Zuse said.34 The problem was the relays. For economy’s sake, he had
bought used ones which he attempted to refurbish, but he set the contact pressure
too low.35
When war came, in 1939, Zuse was drafted into the army.36 But he saw no
fighting, and in fact spent less than a year as a soldier.37 Herbert Wagner, head
of a department at Henschel that was developing flying bombs, urgently needed a
statistician, and managed to arrange for Zuse to be released from the military.38 Back
in Berlin, Zuse was able to start work again on his calculating machine. At first this
was only ‘at night, Saturday afternoons and Sundays’, he said, but then Henschel’s
aviation research and development group became interested.39 Suddenly Zuse was
given additional resources.
By 1941 he was able to set up a business of his own, while continuing to
work part-time as a statistician. K. Zuse Ingenieurbüro und Apparatebau (K. Zuse
Engineering and Machine-Making Firm) had a workshop in Berlin and ultimately
a staff of about twenty.40 The workshop had to be moved three or four times, as
buildings succumbed to the bombing.41 According to Zuse, his Ingenieurbüro was
the only company in wartime Germany licensed to develop calculators.42 Various
armament factories financed his work, as well as the Deutsche Versuchsanstalt für
Luftfahrt (German Institute for Aeronautical Research, DVL). According to a 2010
article in Der Spiegel, the DVL provided over 250,000 Reichsmarks for Zuse’s
calculator research (approximately 2 million US dollars in today’s terms).43 Der
Spiegel wrote: ‘The civil engineer was more deeply involved in the NS [National
Socialist] arms industry than was believed hitherto. His calculating machines were
32. Campbell, ‘Aiken’s First Machine’, p. 34. Zuse wrote that Z2 was completed in 1939 (in ‘Some Remarks on the History of Computing in Germany’, p. 615); Rojas, however, gave 1940 as the completion date (Rojas, R. ‘Zuse, Konrad’, p. 3, zuse.zib.de/item/RuvnRJScXfvdt7BA).
33. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
34. Zuse interviewed by Merzbach.
35. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
36. Zuse interviewed by Merzbach.
37. Zuse, Der Computer – Mein Lebenswerk, pp. 50, 57.
38. Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 612.
39. Zuse interviewed by Merzbach.
40. Zuse interviewed by Merzbach.
41. Zuse interviewed by Merzbach.
42. Zuse, Der Computer – Mein Lebenswerk, p. 68.
43. Schmundt, H. ‘Rassenforschung am Rechner’, Der Spiegel, Nr. 24 (2010) (14 June 2010), pp. 118–119.
considered important for the “final victory”’.44 German historian Hartmut Petzold
previously gave a lower figure, saying the DVL provided 50,000 Reichsmarks for
Zuse’s work.45
At Henschel, Zuse was involved with the Hs 293 rocket-propelled missile. He
designed two special-purpose calculating machines, named S1 and S2, to assist
with the manufacture of these weapons.46 Air-launched and radio-controlled, the
missiles were built on an assembly line (at a rate of one every ten minutes, Zuse
estimated), and then during a final stage of production, a complicated system
of sensors monitored the wings, while their geometry was fine-tuned.47 Several
hundred sensors were used to achieve the aerodynamic accuracy required for
guiding the missile by radio. Initially, calculations based on the sensor data were
done by hand, using a dozen Mercedes calculating machines and working day and
night.48 Each individual calculation ‘took hours and hours’, Zuse remembered.49 He
built S1 to automate these calculations. S1 was a relay-based binary calculator with
a wired program, set by means of rotary switches.50 He recollected completing the
prototype, containing about 800 relays, in 1942.51 Eventually there were three S1s,
‘running day and night for several years’, Zuse said.52 Operators still had to enter
the sensor data by hand, using a keyboard. The later S2 was designed to eliminate
this data-entry stage, by connecting the sensors’ outputs directly to the calculator,
via a form of analog-to-digital converter.53 Zuse explained that S2 was completed
in 1944, but never became operational, because the factory (in Sudetenland) was
dismantled just as the computer became ready.54
Zuse began building his fully electromechanical Z3 in 1940, again in his parents’
living room, and completed it in 1941.55 Z3’s speed was on average one operation
per second, Zuse recollected, and the memory unit had 64 storage cells.56 [...]
44. Schmundt, ‘Rassenforschung am Rechner’, p. 119.
45. Petzold, H. Moderne Rechenkünstler: Die Industrialisierung der Rechentechnik in Deutschland (Munich: C. H. Beck, 1992), pp. 193–4, 201 ff.
46. Zuse, Der Computer – Mein Lebenswerk, p. 54.
47. Zuse interviewed by Merzbach; Zuse interviewed by Evans; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 619.
48. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
49. Zuse interviewed by Merzbach.
50. Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 615.
51. Zuse interviewed by Evans.
52. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 615; Zuse interviewed by Merzbach.
53. Zuse interviewed by Merzbach.
54. Zuse interviewed by Evans.
55. Zuse, ‘Some Remarks on the History of Computing in Germany’, pp. 613, 615; Zuse interviewed by Merzbach; Zuse interviewed by Evans.
56. Zuse interviewed by Evans.
57. Campbell, ‘Aiken’s First Machine’, p. 55; Bashe, C. ‘Constructing the IBM ASCC (Harvard Mark I)’, in Cohen and Welch, Makin’ Numbers, p. 74.
58. Zuse, Der Computer – Mein Lebenswerk, p. 55; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 615.
59. ‘Operations of the 6812th Signal Security Detachment, ETOUSA’, 1 October 1944 (US National Archives and Records Administration, College Park, Maryland, RG 457, Entry 9032, Historic Cryptographic Collection, Pre–World War I Through World War II, Box 970, Nr. 2943), pp. 82–84; Campbell, R. V. D., Strong, P. ‘Specifications of Aiken’s Four Machines’, in Cohen and Welch, Makin’ Numbers, p. 258.
60. Zuse interviewed by Merzbach.
61. Zuse interviewed by Merzbach.
62. Zuse interviewed by Merzbach.
63. Zuse interviewed by Merzbach.
64. Zuse interviewed by Merzbach.
65. Zuse interviewed by Merzbach.
66. Zuse interviewed by Evans.
67. Zuse interviewed by Merzbach.
68. Zuse interviewed by Merzbach.
69. Zuse interviewed by Merzbach.
70. Zuse interviewed by Merzbach; Bauer, F. L. ‘Between Zuse and Rutishauser—The Early Development of Digital Computing in Central Europe’, in Metropolis, Howlett and Rota, A History of Computing in the Twentieth Century, p. 505.
71. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
72. Zuse interviewed by Merzbach.
73. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
74. Zuse interviewed by Merzbach; Zuse interviewed by Evans.
75. Horst Zuse in conversation with Copeland.
76. Zuse, K. Rough notes on the Plankalkül, no date, probably 1944 or the early months of 1945, Deutsches Museum Archiv, NL 207/0783 and NL 207/0887.
77. Zuse interviewed by Evans.
[...] remained until 1949.78 An option with the German branch of Hollerith helped tide
him over, he said—a couple in Hopferau had told an acquaintance at Hollerith of
the ‘strange inventor’ in the village.79 1946 saw the start of his Zuse-Ingenieurbüro
in Hopferau.80 Another early contract was with Remington Rand Switzerland. He
explained that the money enabled him to enlarge his company and to employ two
or three men. His small factory produced a program-controlled relay calculator for
Remington Rand. Named the M9, this was attached to punched card equipment.
According to Zuse about thirty were delivered and Remington’s clients used them
in Switzerland, Germany and Italy.81
Zuse’s computer workshop in the alpine village of Hopferau. Credit: ETH Zurich
It was while Zuse was in Hopferau that a ‘gentleman from Zurich’ visited him,
starting the chain of events that led to Z4’s delivery to ETH.82 The visitor was
Eduard Stiefel, founder of ETH’s Institute for Applied Mathematics. Stiefel wanted
a computer for his Institute and heard about Z4.83 Both Stiefel and his assistant
Heinz Rutishauser had recently visited the US and were familiar with Aiken’s
work.84 They knew the worth of the large German electromechanical computer
which the vagaries of war had delivered almost to their doorstep. ETH offered Zuse
a rental agreement.
78. Zuse interviewed by Merzbach; Zuse, Der Computer – Mein Lebenswerk, p. 96.
79. Zuse interviewed by Merzbach.
80. Zuse, ‘Konrad Zuse Biographie’, p. 2.
81. Zuse interviewed by Merzbach.
82. Zuse interviewed by Merzbach.
83. Bruderer, H. Konrad Zuse und die Schweiz. Wer hat den Computer erfunden? [Konrad Zuse and Switzerland. Who invented the computer?] (Munich: Oldenbourg, 2012), p. 5.
Shortly after Stiefel’s visit, in 1949, Zuse moved to the small town
of Neukirchen, about 50 km north of Düsseldorf, and there founded Zuse
Kommanditgesellschaft (Zuse KG), increasing his staff to five.85 Zuse KG
would supply Europe with small, relatively cheap computers. Zuse’s first task
at Neukirchen was to restore and enlarge Z4 for ETH. He related that a second
tape reader (or ‘scanner’) was attached, enabling numbers as well as programs
to be fed in on punched tape, and circuitry was added for conditional branching
(none of Z1–Z3 had been equipped with conditional branching).86 He said the
storage unit was enlarged from 16 cells to 64.87 Rented by ETH from July 1950
until April 1955, Z4 was the first large-scale computer to go into regular operation
in Continental Europe; and Stiefel’s Institute for Applied Mathematics became a
leading centre for scientific and industrial calculation. Despite assorted problems
with the relays, Z4 was reliable enough to ‘let it work through the night unattended’,
Zuse remembered.88
Now that Z4 had a home, Zuse moved on to Z5.89 The German company Leitz,
manufacturer of Leica cameras, needed a computer for optical calculations, and
commissioned Z5. According to Petzold, the computer cost 200,000 Deutschmarks
(about US$650,000 in today’s terms) and was six times faster than Z4.90 Next
came Z11, a small relay-based wired-program computer that Zuse developed for
the Géodésie company, again used mainly for optical calculations.91 About 50 Z11s
were built.92 Applications included surveying and pension calculations.93
In 1957, Zuse moved his growing company to Bad Hersfeld, 50 km south of
Kassel, and the following year embarked on Z22, his first vacuum tube computer.94
Zuse had come round to tubes just as they were becoming outmoded for computer
use—MIT’s TX-0 transistorized computer first worked in 1956.
84. Zuse interviewed by Merzbach.
85. Zuse interviewed by Merzbach.
86. Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 616.
87. Zuse interviewed by Merzbach.
88. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 619. Urs Hochstrasser, one of the leading users of Z4 at ETH, gave an account of the problems associated with Z4’s relays; see Bruderer, Konrad Zuse und die Schweiz. Wer hat den Computer erfunden?, pp. 19–27.
89. Zuse interviewed by Merzbach.
90. Petzold, Moderne Rechenkünstler, p. 216.
91. Zuse interviewed by Merzbach.
92. Zuse interviewed by Merzbach.
93. Petzold, Moderne Rechenkünstler, pp. 216–217.
94. Zuse, ‘Konrad Zuse Biographie’, p. 2.
Nevertheless, Zuse KG’s Bad Hersfeld factory turned out more than 50 of the low-priced Z22
computers.95 A transistorized version, Z23, went on the market in 1961.96 Other
electronic computers followed, the Z25 and Z64.97 Oddly, Petzold says that with
‘the step to electronic technology, Zuse KG also made the step to modifiable stored
programs and thus to the von Neumann concept’.98 As we explain, this concept is
hardly von Neumann’s, and in any case Zuse himself wrote of storing programs as
early as 1936.
According to Horst Zuse, Zuse KG produced a total of 250 computers, with
a value of more than 100 million Deutschmarks.99 In 1964, however, Zuse and
his wife relinquished ownership of the company.100 By that time, despite Zuse
KG’s rapid growth, the company was overburdened by debt, and the Zuses put
their shares on the market. German engineering company Brown Boveri purchased
Zuse KG, with Zuse staying on as a consultant.
95. ‘Zuse Computers’, Computer History Museum, www.computerhistory.org/revolution/early-computer-companies/5/108.
96. Zuse, Der Computer – Mein Lebenswerk, p. 125; Bruderer, Konrad Zuse und die Schweiz. Wer hat den Computer erfunden?, p. 63.
97. Zuse, Der Computer – Mein Lebenswerk, pp. 126, 131–132.
98. Petzold, Moderne Rechenkünstler, p. 217.
99. Zuse, ‘Konrad Zuse Biographie’, p. 2.
100. Zuse, Der Computer – Mein Lebenswerk, p. 137.
Another sale in 1967 saw Siemens
AG owning Brown Boveri.101 Following further sales, a distant successor of
Zuse’s former company still exists today on Bad Hersfeld’s Konrad-Zuse-Strasse,
ElectronicNetwork GmbH, a contract electronics manufacturer.
As Sect. 1 mentioned, Zuse applied for a number of patents on his early computer
designs (his most important patent applications, discussed in detail in Sects. 5 and 6,
were in 1936 and 1941). However, the German authorities never granted Zuse a
patent. During the war, he said, ‘nothing much’ happened regarding his patent, and
then in the postwar years ‘nothing whatever happened’: his application ‘lay around,
gathering dust in a drawer of the patent office for years’.102 When things finally
did get moving, his efforts to patent his inventions came to the attention of IBM.
Zuse explained that IBM worked through another company, Triumph Corporation,
‘who lodged the protest’.103 A ‘serious legal battle’ followed, Zuse said, and things
dragged on until 1967, when the German federal patent court finally and irrevocably
declined a patent.104 The problem, according to the judge, was the patent’s lack of
Erfindungshöhe, literally ‘invention height’. As Zuse explained matters, the judge
stated that ‘the requisite invention value has not been attained’.105
Konrad Zuse died in Hünfeld on 18 December 1995.
This section outlines the early history of the stored-program concept in the UK and
the US, and compares and contrasts Turing’s and John von Neumann’s contributions
to the development of the concept.106 Although von Neumann is routinely said
to be the originator of the stored-program concept, we find no evidence in favour
of this common view. Turing described fundamental aspects of the concept in his
1936 article ‘On Computable Numbers’, which von Neumann had read before
the war. When von Neumann arrived at the University of Pennsylvania’s Moore
School [...]
101. Zuse, H. ‘Historical Zuse-Computer Z23’, 1999, www.computerhistory.org/projects/zuse_z23/index.shtml.
102. Zuse interviewed by Merzbach.
103. Zuse interviewed by Merzbach.
104. Zuse interviewed by Merzbach; Zuse, Der Computer – Mein Lebenswerk, pp. 97–100. See also Petzold, H. Die Ermittlung des ‘Standes der Technik’ und der ‘Erfindungshöhe’ beim Patentverfahren Z391. Dokumentation nach den Zuse-Papieren [Establishing the ‘state of the technological art’ and ‘inventiveness’ in patent application Z391. Documentation from the Zuse papers] (Bonn: Selbstverlag, 1981).
105. Zuse interviewed by Merzbach.
106. Von Neumann was an alumnus of ETH Zurich, graduating as a chemical engineer in October 1926.
Turing’s 1936 paper ‘On Computable Numbers’ is the birthplace of the fundamental
logical principles of the modern computer, and in particular of the two closely
related logical ideas on which modern computing is based. We call these the ‘twin
pillars’. They are the concepts of (1) a universal computing machine, which operates
by means of (2) a program of instructions stored in the computer’s memory in the
same form as data.111 If different programs are placed on the memory-tape of the
universal Turing machine, the machine will carry out different computations. Turing
proved that the universal machine could obey any and every ‘table of instructions’—
any and every program expressed in the programming code introduced in his
1936 paper. His machine was universal in the sense that it could carry out every
mechanical (or ‘effective’) procedure, if appropriately programmed.
107. Cambridge University Reporter, 18 April 1935, p. 826.
108. Notes taken by Yorick Smythies during Newman’s Foundations of Mathematics lectures in 1934 (St John’s College Library, Cambridge).
109. Turing, A. M. ‘On Computable Numbers, with an Application to the Entscheidungsproblem’, in Copeland, B. J. (ed.) The Essential Turing: Seminal Writings in Computing, Logic, Philosophy, Artificial Intelligence, and Artificial Life (Oxford: Oxford University Press, 2004), p. 84. ‘On Computable Numbers’ was published in 1936 but in the secondary literature the date of publication is often given as 1937, e.g. by Andrew Hodges in his biography Alan Turing: The Enigma (London: Vintage, 1992). The circumstances of publication of ‘On Computable Numbers’ are described on p. 5 of The Essential Turing.
110. Newman interviewed by Christopher Evans (‘The Pioneers of Computing: An Oral History of Computing’, London: Science Museum); quoted in The Essential Turing, p. 206 (transcription by Copeland).
The stored-program universal Turing machine led ultimately to today’s archetyp-
ical electronic digital computer: the single slab of hardware, of fixed structure, that
makes use of internally stored instructions in order to become a word-processor, or
desk calculator, or chess opponent, or photo editor—or any other machine that we
have the skill to create in the form of a program. Since these electronic machines
necessarily have limited memories (unlike the universal Turing machine, with its
indefinitely extendible tape), each is what Turing called ‘a universal machine with a
given storage capacity’.112
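The two pillars can be seen in miniature in an interpreter: one fixed mechanism, the loop below, behaves as any machine whose table of instructions it is handed as ordinary data. This is only an illustrative sketch of programs-as-data, not Turing’s construction itself (in the universal machine proper the table is written on the tape, in the same alphabet as the data), and all names here are our own.

```python
def run(table, tape, state="q0", head=0, fuel=1000):
    """A fixed interpreter driven entirely by `table`, a program given
    as data: table maps (state, symbol) to (write, move, next_state).
    The state 'halt' stops the machine; `fuel` merely bounds the demo,
    since a Turing machine need not halt."""
    cells = dict(enumerate(tape))  # sparse tape; blank cells read '_'
    for _ in range(fuel):
        if state == "halt":
            break
        symbol = cells.get(head, "_")
        write, move, state = table[(state, symbol)]
        cells[head] = write
        head += 1 if move == "R" else -1
    return "".join(cells[i] for i in sorted(cells))

# A three-instruction program: invert a binary word, halt on the blank.
invert = {
    ("q0", "0"): ("1", "R", "q0"),
    ("q0", "1"): ("0", "R", "q0"),
    ("q0", "_"): ("_", "R", "halt"),
}
print(run(invert, "1011"))  # prints 0100_
```

Changing the table, and nothing else, turns the same fixed mechanism into a different machine; that is the sense in which one all-purpose device can replace many special-purpose ones.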
Turing’s universal machine has changed the world. Yet nowadays, when nearly
everyone owns a physical realization of one, his idea of a universal computer is
apt to seem as obvious as the wheel and the arch. Nevertheless, in 1936, when
engineers thought in terms of building different machines for different purposes,
Turing’s vision of a universal machine was revolutionary.
Zuse also briefly outlined a computing machine that would make use of programs
stored in memory, in a few handwritten pages and a sequence of diagrams contained
in a 1938 notebook, two years after Turing gave his extensive and detailed treatment
of the idea. There is, however, no evidence that Zuse also formulated the concept of [...]
111. The first historians to insist that the stored-program concept originated in Turing’s 1936 paper were (so far as is known) Brian Carpenter and Bob Doran, in a classic article that is one of New Zealand’s earliest and greatest contributions to the history of computing: Carpenter, B. E., Doran, R. W. ‘The Other Turing Machine’, The Computer Journal, vol. 20 (1977), pp. 269–279. They said: ‘It is reasonable to view the universal Turing machine as being programmed by the description of the machine it simulates; since this description is written on the memory tape of the universal machine, the latter is an abstract stored program computer’ (p. 270). In the United States, Martin Davis has been advocating powerfully for the same claim since 1987; see Davis, M. D. ‘Mathematical Logic and the Origin of Modern Computers’, in Herken, R. (ed.) The Universal Turing Machine: A Half-Century Survey (Oxford: Oxford University Press, 1988); and Davis, M. D. Engines of Logic: Mathematicians and the Origin of the Computer (New York: Norton, 2000). However, the proposition that the stored-program concept originated in ‘On Computable Numbers’ is far from being a historians’ reconstruction: as the present chapter explains, this was common knowledge among Turing’s post-war colleagues at the National Physical Laboratory, and it was obvious to members of Max Newman’s wartime group at Bletchley Park that digital electronics could be used to implement practical forms of Turing’s universal machine of 1936.
112. Turing, A. M. ‘Intelligent Machinery’, in The Essential Turing, p. 422.
113. Babbage, C. Passages from the Life of a Philosopher, vol. 11 of Campbell-Kelly, M. (ed.) The Works of Charles Babbage (London: William Pickering, 1989), p. 97.
114. Zuse interviewed by Merzbach.
115. Rojas, R. ‘How to Make Zuse’s Z3 a Universal Computer’, IEEE Annals of the History of Computing, vol. 20 (1998), pp. 51–54.
116. Historian Thomas Haigh, in his impassioned outburst ‘Actually, Turing Did Not Invent the Computer’ (Communications of the ACM, vol. 57 (2014), pp. 36–41), confuses the logical distinction between, on the one hand, the universal machine concept and, on the other, the concept ‘of a single machine that could do different jobs when fed different instructions’. Talking about this second concept, Haigh objects that it was not Turing but Babbage who ‘had that idea long before’ (pp. 40–41). Babbage did indeed have that idea; the point, however, is that although Babbage had the concept of a general-purpose computing machine, the universal machine concept originated with Turing. (All this is explained in Copeland’s ‘Turing and Babbage’ in The Essential Turing, pp. 27–30.)
117. Gandy, R. ‘The Confluence of Ideas in 1936’, in Herken, R. (ed.) The Universal Turing Machine: A Half-Century Survey (Oxford: Oxford University Press, 1998), p. 90. Emphasis added.
118. Newman interviewed by Evans; Newman, M. H. A. ‘Dr. A. M. Turing’, The Times, 16 June 1954, p. 10.
[...] Relays, he thought, would not be adequate.119 So, for the next few years, Turing’s
revolutionary ideas existed only on paper. A crucial moment came in 1944, when
he set eyes on Flowers’ racks of high-speed electronic code-cracking equipment,
at Bletchley Park. Colossus was neither stored-program nor general-purpose, but it
was clear to Turing (and to Newman) that the technology Flowers was pioneering,
large-scale digital electronics, was the way to build a miraculously fast universal
computer, the task to which Turing turned in 1945. Meanwhile, Zuse had pressed
ahead with relays and had built a general-purpose computer, but neither knew of the
work of the other at that time.
Did Zuse and Turing meet postwar? Probably not. Zuse said (in 1992) that
he had no knowledge of Turing’s ‘On Computable Numbers’ until 1948.120 This
recollection of Zuse’s, if correct, makes it unlikely that he and Turing met the
previous year at a colloquium in Göttingen, as German computer pioneer Heinz
Billing reported in his memoirs.121 A more likely connection is Turing’s colleague
from the National Physical Laboratory (NPL) Donald Davies, who interrogated
Zuse in England.122 Zuse was invited to London in 1948 and placed in a large
house in Hampstead, where a number of British computer experts arrived to question
him.123 Zuse remembered it as a ‘very nice trip’.124 Quite likely Davies—who,
along with Turing’s other colleagues in NPL’s ACE section, saw ‘On Computable
Numbers’ as containing the ‘key idea on which every stored-program machine was
based’—would have mentioned Turing’s paper to Zuse.125 Davies recollected that
the interview did not go particularly well: Zuse eventually ‘got pretty cross’, and
things ‘degenerated into a glowering match’. Zuse was ‘quite convinced’, Davies
said, that he could make a smallish relay machine ‘which would be the equal of any
of the electronic calculators we were developing’.
Although Turing completed his design for an electronic stored-program computer
in 1945, another four years elapsed before the first universal Turing machine
in electronic hardware ran the first stored program, on Monday 21 June 1948. It
was the first day of the modern computer age. Based on Turing’s ideas, and almost
big enough to fill a room, this distant ancestor of our mainframes, laptops, tablets
and phones was called ‘Baby’.126
119. Robin Gandy interviewed by Copeland, October 1995.
120. Zuse in conversation with Brian Carpenter at CERN on 17 June 1992; Copeland is grateful to Carpenter for sending him some brief notes on the conversation that Carpenter made at the time. See also Carpenter, B. E. Network Geeks: How They Built the Internet (London: Springer, 2013), p. 22.
121. Jänike, J., Genser, F. (eds) Ein Leben zwischen Forschung und Praxis—Heinz Billing [A Life Between Research and Practice—Heinz Billing] (Düsseldorf: Selbstverlag Friedrich Genser, 1997), p. 84; Bruderer, Konrad Zuse und die Schweiz. Wer hat den Computer erfunden?, pp. 64–66.
122. Davies interviewed by Christopher Evans in 1975 (‘The Pioneers of Computing: An Oral History of Computing’, London: Science Museum; © Board of Trustees of the Science Museum).
123. Zuse, Der Computer – Mein Lebenswerk, p. 101; Zuse interviewed by Evans; Davies interviewed by Evans.
124. Zuse interviewed by Evans.
125. Davies interviewed by Evans.
Baby was built by radar engineers F. C. Williams
and Tom Kilburn, in Newman’s Computing Machine Laboratory at the University
of Manchester, in the north of England.127
However, historians of the computer have often found Turing’s contributions
hard to place, and many histories of computing written during the six decades since
his death sadly do not so much as mention him. Even today there is still no real
consensus on Turing’s place in computing history. In 2013, an opinion piece by
the editor of the Association for Computing Machinery’s flagship journal objected
to the claim that Turing invented the stored-program concept. The article’s author,
Moshe Vardi, dismissed the claim as ‘simply ahistorical’.128 Vardi emphasized that
it was not Turing but the Hungarian-American mathematician John von Neumann
who, in 1945, ‘offered the first explicit exposition of the stored-program computer’.
This is true, but the point does not support Vardi’s charge of historical inaccuracy.
Although von Neumann did write the first paper explaining how to convert Turing’s
ideas into electronic form, the fundamental conception of the stored-program
universal computer was nevertheless Turing’s.
Von Neumann was close to the centre of the American effort to build an electronic
stored-program universal computer. He had read Turing’s ‘On Computable
Numbers’ before the war,129 and when he became acquainted with the U.S. Army’s
ENIAC project in 1944, he discovered that the stored-program concept could be
applied to electronic computation.130 ENIAC was designed by Presper Eckert and
John Mauchly at the Moore School of Electrical Engineering (part of the University
of Pennsylvania), in order to calculate the complicated tables needed by gunners to
aim artillery, and the computer first ran in 1945. Like Colossus before it, ENIAC was
programmed by means of re-routing cables and setting switches, a process that could
take as long as three weeks.131 Viewed from the modern stored-program world,
this conception of programming seems unbearably primitive. In the taxonomy of
programming paradigms developed in Sect. 4, this method of programming is P1,
the most rudimentary level in the taxonomy.
126. Tootill, G. C. ‘Digital Computer—Notes on Design & Operation’, 1948–9 (National Archive for the History of Computing, University of Manchester).
127. For more about Baby, see Copeland, B. J. ‘The Manchester Computer: A Revised History’, IEEE Annals of the History of Computing, vol. 33 (2011), pp. 4–37; Copeland, B. J. Turing, Pioneer of the Information Age (Oxford: Oxford University Press, 2012, 2015), ch. 9.
128. Vardi, M. Y. ‘Who Begat Computing?’, Communications of the ACM, vol. 56 (Jan. 2013), p. 5.
129. Stanislaw Ulam interviewed by Christopher Evans in 1976 (‘The Pioneers of Computing: an Oral History of Computing’, Science Museum: London).
130. Goldstine, H. The Computer from Pascal to von Neumann (Princeton: Princeton University Press, 1972), p. 182.
131. Campbell-Kelly, M. ‘The ACE and the Shaping of British Computing’, in Copeland, B. J. et al. Alan Turing’s Electronic Brain: The Struggle to Build the ACE, the World’s Fastest Computer (Oxford: Oxford University Press, 2012; a revised and retitled paperback edition of the 2005 hardback Alan Turing’s Automatic Computing Engine), p. 151.
Conscious of the need for a better method of programming, the brilliant engineer
Eckert had the idea of storing instructions in the form of numbers as early as 1944,
inventing a high-speed recirculating memory.132 This was based on apparatus he had
previously used for echo cancellation in radar, the mercury delay line. Instructions
and data could be stored uniformly in the mercury-filled tube, in the form of
pulses—binary digits—that were ‘remembered’ for as long as was necessary. This
provided the means to engineer the stored-program concept, and mercury delay lines
were widely employed as the memory medium of early computers—although in fact
the first functioning electronic stored-program computer used not delay lines but an
alternative form of memory, the Williams tube. Based on the cathode ray tube, the
Williams tube was invented by Williams and further developed by Kilburn.
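What ‘instructions and data stored uniformly’ amounts to can be shown schematically with a fetch-execute loop over a single store. The encoding, the opcode names and the toy program below are our own illustrative assumptions, not the order code of EDVAC, Baby or any other historical machine.

```python
# Each memory word encodes an instruction as op*100 + address, so the
# program occupies the same uniform store as its data.
LOAD, ADD, STORE, HALT = 1, 2, 3, 0

def execute(memory, pc=0, steps=100):
    """Fetch-execute loop over one memory holding program and data."""
    acc = 0
    for _ in range(steps):
        op, addr = divmod(memory[pc], 100)
        pc += 1
        if op == LOAD:
            acc = memory[addr]
        elif op == ADD:
            acc += memory[addr]
        elif op == STORE:
            memory[addr] = acc
        else:  # HALT (opcode 0)
            break
    return memory

# Program in cells 0-3, data in cells 4-6: memory[6] = memory[4] + memory[5].
mem = [LOAD * 100 + 4, ADD * 100 + 5, STORE * 100 + 6, HALT, 20, 22, 0]
print(execute(mem)[6])  # prints 42
```

Because an instruction is just a number in the common store, a program can in principle compute and overwrite its own instructions, which is what a uniform memory such as the delay line or the Williams tube made practicable.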
Along with Zuse, Eckert has a strong claim to be regarded as a co-originator
of the stored-program paradigm that in Sect. 4 is denoted ‘P3’. Eckert said that
the stored-program concept was his ‘best computer idea’—although, of course,
he arrived at the idea approximately eight years after the publication of Turing’s
‘On Computable Numbers’.133 Endorsing Eckert’s claim, Mauchly commented that
they were discussing ‘storing programs in the same storage used for computer data’
several months before von Neumann first visited their ENIAC group.134 Art Burks,
a leading member of the ENIAC group and later one of von Neumann’s principal
collaborators at the Princeton computer project, also explained that—long before
von Neumann first visited them—Eckert and Mauchly were ‘saying that they would
build a mercury memory large enough to store the program for a problem as well as
the arithmetic data’.135
Maurice Wilkes, who visited the Moore School group in 1946 and who went on
to build the Cambridge EDSAC delay-line computer (see the timeline in Fig. 3),
gave this first hand account of the roles of Eckert, Mauchly and von Neumann:
Eckert and Mauchly appreciated that the main problem was one of storage, and they
proposed ... ultrasonic delay lines. Instructions and numbers would be mixed in the same
memory. ... Von Neumann ... appreciated at once ... the potentialities implicit in the
stored program principle. That von Neumann should bring his great prestige and influence
to bear was important, since the new ideas were too revolutionary for some, and powerful
voices were being raised to say that ... to mix instructions and numbers in the same
memory was going against nature.136
132. Eckert, J. P. ‘The ENIAC’, in Metropolis, Howlett and Rota, A History of Computing in the Twentieth Century, p. 531.
133. Eckert, ‘The ENIAC’, p. 531.
134. Mauchly, J., commenting in Eckert, ‘The ENIAC’, pp. 531–532.
135. Letter from Burks to Copeland, 16 August 2003.
136. Wilkes, M. V. 1967 ACM Turing Lecture: ‘Computers Then and Now’, Journal of the Association for Computing Machinery, vol. 15 (1968), pp. 1–7 (p. 2).
[...] delay line store, with enough capacity to store program information as well as data. Von
Neumann created the first modern order code and worked out the logical design of an
electronic computer to execute it.137
Von Neumann, then, did not originate the stored-program concept, but contributed
significantly to its development, both by championing it in the face of
conservative criticism, and, even more importantly, by designing an appropriate
instruction code for stored programming. As Tom Kilburn said, ‘You can’t start
building until you have got an instruction code’.138
During the winter of 1944 and spring of 1945, von Neumann, Eckert and
Mauchly held a series of weekly meetings, working out the details of how to design
a stored-program electronic computer.139 Their proposed computer was called the
EDVAC. In effect they designed a universal Turing machine in hardware, with
instructions stored in the form of numbers, and common processes reading the data
and reading and executing the instructions. While there are no diary entries to prove
the point beyond any shadow of doubt, nor statements in surviving letters written by
von Neumann at this precise time, his appreciation of the great potentialities inherent
in the stored-program concept could hardly fail to have been influenced by his
knowledge of Turing’s ‘On Computable Numbers’ and by his intimate knowledge
of Kurt Gödel’s 1931 demonstration that logical and arithmetical sentences can be
expressed as numbers.140 Von Neumann went on to inform electronic engineers at
large about the stored-program concept.
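Gödel’s arithmetization, just mentioned, fits in a few lines: the symbols of an expression become the exponents of successive primes, so that one number encodes the whole expression and arithmetic can speak about syntax. The sketch follows Gödel’s 1931 coding only in outline; the particular symbol codes are our own illustrative choices.

```python
def primes():
    """Generate 2, 3, 5, ... by trial division (ample for a demo)."""
    n = 2
    while True:
        if all(n % p for p in range(2, int(n ** 0.5) + 1)):
            yield n
        n += 1

def godel_number(codes):
    """Encode a sequence of symbol codes as one number: the k-th code
    becomes the exponent of the k-th prime, as in Gödel 1931."""
    n = 1
    for p, c in zip(primes(), codes):
        n *= p ** c
    return n

print(godel_number([1, 2, 3]))  # 2**1 * 3**2 * 5**3 = 2250
```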
In his 1945 document titled ‘First Draft of a Report on the EDVAC’, von
Neumann set out, in rather general terms, the design of an electronic stored-program
computer.141 However, shortly after this appeared in mid 1945, his collaboration
with Eckert and Mauchly came to an abrupt end, with the result that the ill-fated
EDVAC was not completed until 1952.142 Trouble arose because von Neumann’s
colleague Herman Goldstine had circulated a draft of the report before Eckert’s and
Mauchly’s names were added to the title page.143
137. Burks, A. W. ‘From ENIAC to the Stored-Program Computer: Two Revolutions in Computers’, in Metropolis, Howlett, and Rota, A History of Computing in the Twentieth Century, p. 312.
138. Kilburn interviewed by Copeland, July 1997.
139. Von Neumann, J., Deposition before a public notary, New Jersey, 8 May 1947; Warren, S. R. ‘Notes on the Preparation of “First Draft of a Report on the EDVAC” by John von Neumann’, 2 April 1947. Copeland is grateful to Harry Huskey for supplying him with copies of these documents.
140. Gödel, K. ‘Über formal unentscheidbare Sätze der Principia Mathematica und verwandter Systeme I’ [On formally undecidable propositions of Principia Mathematica and related systems I], Monatshefte für Mathematik und Physik, vol. 38 (1931), pp. 173–198.
141. Von Neumann, J. ‘First Draft of a Report on the EDVAC’, Moore School of Electrical Engineering, University of Pennsylvania, 1945; reprinted in full in Stern, N. From ENIAC to UNIVAC: An Appraisal of the Eckert-Mauchly Computers (Bedford, Mass.: Digital Press, 1981).
142. Huskey, H. D. ‘The Development of Automatic Computing’, in Proceedings of the First USA-JAPAN Computer Conference, Tokyo, 1972, p. 702.
Bearing von Neumann’s name
alone, the report was soon widely read. Eckert and Mauchly were furious but von
Neumann was unrepentant.
‘My personal opinion’, von Neumann said defiantly in 1947, ‘was at all times,
and is now, that this [the distribution of the report] was perfectly proper and in the
best interests of the United States’.144 Widespread dissemination of the report had,
he said, furthered ‘the development of the art of building high speed computers’.
Perhaps he was hinting that Eckert and Mauchly would have opposed widespread
distribution of the report. It would be perfectly understandable if they had, since
the report’s entering the public domain effectively prevented them from patenting
their ideas. Eckert later wrote bitterly of ‘von Neumann’s way of taking credit for
the work of others’.145 Jean Jennings, one of ENIAC’s programmers and a member
of the ENIAC group from early 1945, noted that von Neumann ‘ever afterward
accepted credit—falsely—for the work of the Moore School group. . . . [He] never
made an effort to dispel the general acclaim in the years that followed’.146
After a dispute with the University of Pennsylvania about intellectual prop-
erty rights, Eckert and Mauchly formed their own Electronic Control Company,
and began work on their EDVAC-like BINAC. Meanwhile, von Neumann drew
together a group of engineers at the Institute for Advanced Study in Princeton.
He primed them by giving them Turing’s ‘On Computable Numbers’ to read.147
Julian Bigelow, von Neumann’s chief engineer, was well aware of the influence
that ‘On Computable Numbers’ had had on von Neumann. The reason that von
Neumann was the ‘person who really . . . pushed the whole field ahead’, Bigelow
explained, was because ‘he understood a good deal of the mathematical logic which
was implied by the [stored program] idea, due to the work of A. M. Turing . . .
in 1936’.148 ‘Turing’s machine does not sound much like a modern computer
today, but nevertheless it was’, Bigelow said—’It was the germinal idea’. The
physical embodiment of Turing’s universal computing machine that von Neumann’s
engineers built at Princeton began working in 1951. Known simply as the ‘Princeton
143
See e.g. Stern, N. ‘John von Neumann’s Influence on Electronic Digital Computing, 1944–
1946’, Annals of the History of Computing, vol. 2 (1980), pp. 349–362.
144
Von Neumann, Deposition, 8 May 1947.
145
Eckert, ‘The ENIAC’, p. 534.
146
Jennings Bartik, J. Pioneer Programmer: Jean Jennings Bartik and the Computer that Changed
the World (Kirksville, Missouri: Truman State University Press, 2013), pp. 16, 18.
147
Letter from Julian Bigelow to Copeland, 12 April 2002; see also Aspray, W. John von Neumann
and the Origins of Modern Computing (Cambridge, Mass.: MIT Press, 1990), p. 178.
148
Bigelow in a tape-recorded interview made in 1971 by the Smithsonian Institution and released
in 2002; Copeland is grateful to Bigelow for previously sending him a transcript of excerpts from
the interview.
computer’, it was not the first of the new stored-program electronic computers, but
it was the most influential.149
Although Turing is not mentioned explicitly in von Neumann’s papers devel-
oping the design for the Princeton computer, von Neumann’s collaborator Burks
told Copeland that the key 1946 design paper he had written with von Neumann and Goldstine did
contain a reference to Turing’s 1936 work.150 Von Neumann and his co-authors
emphasized that ‘formal-logical’ work—by which they meant in particular Turing’s
1936 investigation—had shown ‘in abstracto’ that stored programs can ‘control
and cause the execution’ of any sequence (no matter how complex) of mechanical
operations that is ‘conceivable by the problem planner’.151
Meanwhile, in 1945, Turing joined London’s National Physical Laboratory, to
design an electronic universal stored-program computer. John Womersley, head of
NPL’s newly formed Mathematics Division, was responsible for recruiting him.
Womersley had read ‘On Computable Numbers’ shortly after it was published, and
at the time had considered building a relay-based version of Turing’s universal
computing machine. As early as 1944 Womersley was advocating the potential
of electronic computing.152 He named NPL’s projected electronic computer the
Automatic Computing Engine, or ACE—a deliberate echo of Babbage.
Turing studied ‘First Draft of a Report on the EDVAC’, but favoured a radically
different type of design. He sacrificed everything to speed, launching a 1940s
version of what is today called RISC (Reduced Instruction Set Computing).153 In
order to maximise the speed of the machine, Turing opted for a decentralised archi-
tecture, whereas von Neumann described a centralised design that foreshadowed
the modern central processing unit (cpu).154 Turing associated different arithmetical
and logical functions with different delay lines in the ACE’s Eckert-type mercury
memory, rather than following von Neumann’s model of a single central unit in
149
The Princeton computer is described in Bigelow, J. ‘Computer Development at the Institute for
Advanced Study’, in Metropolis, Howlett, Rota, A History of Computing in the Twentieth Century.
150
Letter from Arthur Burks to Copeland, 22 April 1998.
151
Burks, A. W., Goldstine, H. H., von Neumann, J. ‘Preliminary Discussion of the Logical Design
of an Electronic Computing Instrument’, Institute for Advanced Study, 28 June 1946, in vol. 5 of
Taub, A. H. ed. Collected Works of John von Neumann (Oxford: Pergamon Press, 1961), section
3.1 (p. 37).
152
See Copeland, B. J. ‘The Origins and Development of the ACE Project’, in Copeland et al., Alan
Turing’s Electronic Brain.
153
Doran, R. W. ‘Computer Architecture and the ACE Computers’, in Copeland et al., Alan
Turing’s Electronic Brain.
154
The terms ‘decentralised’ and its opposite ‘centralised’ are due to Jack Good, who used them
in a letter to Newman about computer architecture on 8 August 1948; the letter is in Good, I. J.
‘Early Notes on Electronic Computers’ (unpublished, compiled in 1972 and 1976; a copy is in the
National Archive for the History of Computing, University of Manchester, MUC/Series 2/a4), pp.
63–4.
which all the arithmetical and logical operations take place.155 Turing was (as his
colleague James Wilkinson observed156) ‘obsessed’ with making the computations
run as fast as possible, and once a pilot version of the ACE was operational, it could
multiply at roughly 20 times the speed of its closest competitor.157
The pilot model of Turing’s Automatic Computing Engine, in the Mathematics Division of
London’s National Physical Laboratory. Credit: National Physical Laboratory © Crown copyright
155
For additional detail concerning the differences between Turing’s decentralized architecture
and the centralized architecture favoured by von Neumann, see Copeland et al., Alan Turing’s
Electronic Brain; and Copeland, ‘The Manchester Computer: A Revised History’.
156
Wilkinson interviewed by Christopher Evans in 1976 (‘The Pioneers of Computing: An Oral
History of Computing’, London: Science Museum).
157
See the table by Martin Campbell-Kelly on p. 161 of Copeland et al., Alan Turing’s Electronic
Brain.
158
Turing, A. M. ‘Proposed Electronic Calculator’, ch. 20 of Copeland et al., Alan Turing’s
Electronic Brain.
159
Letter from Huskey to Copeland, 4 February 2002.
gave detailed specifications of the various hardware units, and even included sample
programs in machine code.
Turing was content to borrow some of the elementary design ideas in von
Neumann’s report (and also the notation, due originally to McCulloch and Pitts, that
von Neumann used to represent logic gates—a notation that Turing considerably
extended in ‘Proposed Electronic Calculator’). One example of a borrowing is
Turing’s diagram of an adder, essentially the same as von Neumann’s diagram.160
This borrowing of relatively pedestrian details is probably what Turing was referring
to when he told a newspaper reporter in 1946 that he gave ‘credit for the donkey
work on the A.C.E. to Americans’.161 Yet, the similarities between Turing’s design
and the von Neumann-Eckert-Mauchly proposals are relatively minor in comparison
to the striking differences.
In their 1945 documents ‘Proposed Electronic Calculator’ and ‘First Draft of a
Report on the EDVAC’, Turing and von Neumann both considerably fleshed out
the stored-program concept, turning it from the bare-bones logical idea of Turing’s
1936 paper into a fully articulated, electronically implementable design concept.
The 1945 stored-program concept included:
• dividing stored information into ‘words’ (the term is used by Eckert and Mauchly
in a September 1945 progress report on the EDVAC162)
• using binary numbers as addresses of sources and destinations in memory
• building arbitrarily complex stored programs from a small stock of primitive
expressions (as in ‘On Computable Numbers’).
Each document set out the basis for a very different practical version of the
universal Turing machine (and the von Neumann design was indebted to extensive
input from Eckert and Mauchly). Each document also replaced Turing’s pioneering
programming code of 1936 with a form of code more appropriate for high-speed
computing. Again, each presented a very different species of code, von Neumann
favouring instructions composed of operation codes followed by addresses, while
Turing did not use operation codes: the operations to be performed were implied by
the source and destination addresses.
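To make the contrast concrete, here is a deliberately toy illustration in Python (our own invention, not the historical EDVAC or ACE order codes): the first fragment pairs an explicit operation code with an address in the von Neumann manner, while the second expresses a computation purely as source-to-destination transfers, the operation being implied by the special addresses involved.

    # Toy sketch only; these encodings are invented and are not the
    # historical EDVAC or ACE instruction formats.

    # Von Neumann style: an operation code followed by an address.
    vn_program = [
        ("LOAD", 12),    # fetch the number at address 12
        ("ADD", 13),     # add the number at address 13
        ("STORE", 14),   # store the result at address 14
    ]

    # ACE style: no operation codes. An instruction is just a transfer
    # from a source to a destination; sending numbers to the adder's
    # input addresses is what causes an addition to happen.
    ADDER_IN, ADDER_OUT = 30, 31     # hypothetical function addresses
    ace_program = [
        (12, ADDER_IN),    # transfer contents of address 12 to the adder
        (13, ADDER_IN),    # transfer contents of address 13 to the adder
        (ADDER_OUT, 14),   # transfer the adder's output to address 14
    ]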
Turing pursued the implications of the stored-program idea much further than
von Neumann did at that time. As has often been remarked, in the ‘First Draft’
von Neumann blocked the wholesale modification of instructions by prefixing them
160
Compare Fig. 10 of Turing’s ‘Proposed Electronic Calculator’ (on p. 431 of Copeland et al.,
Alan Turing’s Electronic Brain) with Fig. 3 of von Neumann’s report (on p. 198 of Stern, From
ENIAC to UNIVAC).
161
Evening News, 23 December 1946. The cutting is among a number kept by Sara Turing and now
in the Modern Archive Centre, King’s College, Cambridge (catalogue reference K 5).
162
Eckert, J. P., Mauchly, J. W. ‘Automatic High Speed Computing: A Progress Report on the
EDVAC’, Moore School of Electrical Engineering, Sept. 1945. http://archive.computerhistory.org/
resources/text/Knuth_Don_X4100/PDF_index/k-8-pdf/k-8-u2736-Report-EDVAC.pdf. Copeland
is grateful to Bob Doran for pointing out this early occurrence of the term ‘word’ (in correspon-
dence).
with a special tag. Only the address bits could be modified. Carpenter and Doran
pointed out in their classic 1977 paper that, because von Neumann ‘gave each word
a nonoverrideable tag, he could not manipulate instructions’, and they emphasized
that it was Turing, and not von Neumann, who introduced ‘what we now regard as
one of the fundamental characteristics of the von Neumann machine’.163
The manipulation of instructions as if they were numbers was fundamental to the
computer design that Turing put forward in ‘Proposed Electronic Calculator’. He
described program storage in editable memory as giving ‘the machine the possibility
of constructing its own orders’.164 His treatment of conditional branching involved
performing arithmetical operations on instructions considered as numbers (e.g.
multiplying an instruction by a given digit).165 Carpenter and Doran emphasized,
‘Von Neumann does not take this step’ (the step of manipulating instructions as if
they were numbers) in the ‘First Draft’.166 The idea that instructions and data are
common coin was taken for granted by ACE’s programmers at the NPL. Sometimes
instructions were even used as numerical constants, if an instruction considered as
a number happened to equate to the value required.167
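A toy sketch, with an invented encoding, may help to fix ideas. Treating an instruction as an ordinary number allows arithmetic to disable it (one crude way of realizing a conditional branch, in roughly the spirit Turing described) and allows its bit pattern to double as a numerical constant:

    # Invented encoding: instructions and data share one store of integers.
    mem = [0] * 16
    mem[0] = 50102          # read as opcode 5 ('add'), operands at 01 and 02

    # Arithmetic on an instruction: multiplying by a test outcome d (0 or 1)
    # either leaves the instruction intact or wipes it to a harmless 0,
    # giving a primitive form of conditional behaviour.
    d = 0                   # outcome of some earlier comparison
    mem[0] = mem[0] * d     # with d = 0 the instruction becomes a no-op

    # An instruction reused as a constant, as the ACE programmers sometimes
    # did when an instruction's pattern happened to equal a needed value.
    mem[0] = 50102
    constant = mem[0]       # here simply read as the number 50102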
Furthermore, Turing recognized in ‘Proposed Electronic Calculator’ that a
program could manipulate other programs.168 As Carpenter and Doran again say,
‘The notion of a program that manipulates another program was truly spectacular
in 1945’.169 Zuse had similar ideas in 1945, envisaging what he called a ‘Planferti-
gungsgerät’ [plan producing machine], a ‘special computer to make the program for
a numerical sequence controlled computer’.170 He added: ‘This device was intended
to do about the same as sophisticated compilers do today’.171 Zuse discussed
automated programming in his 1945 manuscript ‘Der Plankalkül’, describing this
process as ‘calculating calculating plans’.172 Turing even envisaged programs that
are able to rewrite their own instructions in response to experience. ‘One can
imagine’, he said in a lecture on the ACE, ‘that after the machine had been operating
for some time, the instructions would have altered out of all recognition’.173
163
Carpenter and Doran, ‘The Other Turing Machine’, p. 270; see also Carpenter and Doran,
‘Turing’s Zeitgeist’.
164
Turing, ‘Proposed Electronic Calculator’, p. 382.
165
Turing, ‘Proposed Electronic Calculator’, pp. 382–383.
166
Carpenter and Doran, ‘The Other Turing Machine’, p. 270.
167
Vickers, T. ‘Applications of the Pilot ACE and the DEUCE’, in Copeland et al., Alan Turing’s
Electronic Brain, p. 277.
168
Turing, ‘Proposed Electronic Calculator’, p. 386.
169
Carpenter and Doran, ‘Turing’s Zeitgeist’.
170
Zuse, ‘Some Remarks on the History of Computing in Germany’, pp. 616–617.
171
Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 617.
172
Zuse, ‘Der Plankalkül’ (manuscript), pp. 30–31.
173
Turing, A. M. ‘Lecture on the Automatic Computing Engine’, in The Essential Turing, p. 393.
The two 1945 documents by Turing and von Neumann embodied very different philosophies. Essentially, von Neumann’s ‘First Draft’ presented a novel form of
numerical calculator. Right at the beginning of ‘First Draft’, in the course of what
he called ‘some general explanatory remarks’, he offered this ‘definition’ of his
subject matter: ‘An automatic computing system is a (usually highly composite)
device, which can carry out instructions to perform calculations of a considerable
order of complexity’.174 The EDVAC, he explained, would be a ‘very high speed’
automatic digital calculator. Turing, on the other hand, was envisaging a different
kind of beast. For instance, he listed in ‘Proposed Electronic Calculator’ an
assortment of non-numerical problems suitable for the ACE. These included solving
a jig-saw, a problem that he described as ‘typical of a very large class of non-
numerical problems that can be treated’, adding: ‘Some of these have great military
importance, and others are of immense interest to mathematicians.’175 By this time
Turing already had significant experience with non-numerical computation: his
Bombe was designed to solve a specific type of non-numerical problem.176 Turing
also mentioned chess in ‘Proposed Electronic Calculator’, making his famous
remark that ‘There are indications . . . that it is possible to make the machine display
intelligence at the risk of its making occasional serious mistakes’.177 Despite its
modest title, ‘Proposed Electronic Calculator’ offered far more than a numerical
calculator.
In January 1947 Turing travelled to the United States, to attend the Harvard
Symposium on Large-Scale Digital Calculating Machinery. Organized by Aiken at
his Harvard Computation Laboratory, this was the world’s second sizable computing
conference, with more than 300 delegates attending (a smaller conference with
around 85 delegates was held at MIT in fall 1945).178 With the birth of the
stored-program electronic computer expected imminently, the time was ripe for
a collection of visionary lectures; but Aiken, the leading champion in the US of
electromechanical program-controlled computation, did not seize the moment. With
the exception of Mauchly, Goldstine, and Jay Forrester—who at MIT was planning
the Whirlwind I computer, one of the first stored-program machines to run (see
the timeline in Fig. 3)—none of the leading pioneers of the new stored-program
technology lectured at the symposium, not even the physically present Turing.
His contributions were confined to astute comments during the brief post-lecture
174
Von Neumann, ‘First Draft of a Report on the EDVAC’, p. 181 in Stern, From ENIAC to
UNIVAC.
175
Turing, ‘Proposed Electronic Calculator’, pp. 388–9.
176
Turing, A. M. ‘Bombe and Spider’, in The Essential Turing.
177
Turing, ‘Proposed Electronic Calculator’, p. 389.
178
‘Members of the Symposium’, Proceedings of a Symposium on Large-Scale Digital Calculating
Machinery. Jointly Sponsored by The Navy Department Bureau of Ordnance and Harvard
University at The Computation Laboratory 7–10 January 1947. Vol. 16 of The Annals of the
Computation Laboratory of Harvard University (Cambridge, MA: Harvard University Press,
1948), pp. xvii–xxix.
discussions, including the following succinct expression of what we now see as the
central feature of universal computers:
We [at the National Physical Laboratory] are trying to make greater use of the facilities
available in the machine to do all kinds of different things simply by programming . . . This
is an application of the general principle that any particular operation of physical apparatus
can be reproduced . . . simply by putting in more programming.179
179
‘Sheppard Discussion’, Proceedings of a Symposium on Large-Scale Digital Calculating
Machinery, p. 273.
180
Numerico, T. ‘From Turing Machine to “Electronic Brain”’, in Copeland et al., Alan Turing’s
Electronic Brain, p. 182.
181
Letter from von Neumann to Norbert Wiener, 29 November 1946; in the von Neumann Archive
at the Library of Congress, Washington, D.C. (quoted on p. 209 of The Essential Turing).
182
‘Rigorous Theories of Control and Information’, in von Neumann, J. Theory of Self-
Reproducing Automata (Urbana: University of Illinois Press, 1966; ed. Burks A. W.), p. 50.
183
Letter from Frankel to Brian Randell, 1972 (first published in Randell, B. ‘On Alan Turing
and the Origins of Digital Computers’, in Meltzer, B., Michie, D. (eds) Machine Intelligence 7
(Edinburgh: Edinburgh University Press, 1972)). Copeland is grateful to Randell for giving him a
copy of this letter.
184
Hurd, C., Comments on Eckert, ‘The ENIAC’, in Metropolis, Howlett, and Rota, A History of
Computing in the Twentieth Century, p. 536.
Returning to Moshe Vardi’s efforts to refute the claim that Turing originated
the stored-program concept, Vardi states—defending von Neumann’s corner—that
‘we should not confuse a mathematical idea with an engineering design’. So at
best Turing deserves the credit for an abstract mathematical idea? Not so fast.
Vardi is ignoring the fact that some inventions do belong equally to the realms of
mathematics and engineering. The universal Turing machine of 1936 was one such,
and this is part of its brilliance.
What Turing described in 1936 was not an abstract mathematical notion but a
solid three-dimensional machine (containing, as he said, wheels, levers, and paper
tape185); and the cardinal problem in electronic computing’s pioneering years, taken
on by both ‘Proposed Electronic Calculator’ and the ‘First Draft’, was just this: How
best to build a practical electronic form of the universal Turing machine?
The claim that in 1936 Turing came up merely with an abstract mathematical
idea, and moreover without perceiving any connection between it and potential
real computing machinery, is a persistent one. Notoriously, ‘Proposed Electronic
Calculator’ did not so much as mention the universal machine of 1936, leading some
commentators to wonder whether even in Turing’s mind there was any connection
between the ACE and his earlier abstract machine (a doubt forcefully expressed
by George Davis, a pioneer of computing from the pilot ACE era).186 Computer
historian Martin Campbell-Kelly also doubted that the universal Turing machine
was a ‘direct ancestor of the ACE’, pointing out that the memory arrangements of
the 1936 machine and of the ACE were very different, with the ACE’s ‘addressable
memory of fixed-length binary numbers’ having ‘no equivalent in the Turing
Machine’.187
However, some fragments of an early draft of ‘Proposed Electronic Calculator’
cast much new light on this issue.188 The fragments survive only because Turing
used the typed sheets as scrap paper, covering the reverse sides with rough notes on
circuit design; his assistant, Mike Woodger, happened to keep the rough notes. In
these fragments, Turing explicitly related the ACE to the universal Turing machine,
explaining why the memory arrangement described in his 1936 paper required
modification when creating a practical design for a computer. He wrote:
In ‘Computable numbers’ it was assumed that all the stored material was arranged linearly,
so that in effect the accessibility time was directly proportional to the amount of material
stored, being essentially the digit time multiplied by the number of digits stored. This was
185
Turing, A. M., draft précis (in French) of ‘On Computable Numbers’ (undated, 2 pp.; in the
Turing Papers, Modern Archive Centre, King’s College Library, Cambridge, catalogue reference
K 4).
186
George Davis, verbal comments at the ACE 2000 Conference, National Physical Laboratory,
Teddington, 2000; and also at a seminar on Turing organised by the British Computer Conservation
Society, Science Museum, London, 2005.
187
Campbell-Kelly, ‘The ACE and the Shaping of British Computing’, pp. 156–157.
188
These fragments were published for the first time as Turing, A. M. ‘Notes on Memory’, in
Copeland et al., Alan Turing’s Automatic Computing Engine (Oxford: Oxford University Press, 2005).
the essential reason why the arrangement in ‘Computable numbers’ could not be taken over
as it stood to give a practical form of machine. Actually we can make the digits much more
accessible than this, but there are two limiting considerations to the accessibility which is
possible, of which we describe one in this paragraph. If we have N digits stored then we shall
need about log₂N digits to describe the place in which a particular digit is stored. This will mean to say that the time required to put in a request for a particular digit will be essentially log₂N digit time. This may be reduced by using several wires for the transmission of a
request, but this might perhaps be considered as effectively decreasing the digit time.189
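The arithmetic of the passage is easily checked. A linear store of N digits costs on the order of N digit-times to traverse, whereas an addressed request needs only about log₂N digits; the following few lines (ours, purely illustrative) tabulate the comparison:

    import math

    # Compare worst-case linear access (N digit-times) with the length of
    # an addressed request (about log2(N) digits), per Turing's note.
    for N in (1024, 32768, 1048576):
        address_digits = math.ceil(math.log2(N))
        print(f"{N} stored digits: request of ~{address_digits} digits "
              f"versus a scan of up to {N} digit-times")

For example, 32,768 stored digits need only a 15-digit request, against a worst-case scan of all 32,768 digit-times under the linear arrangement of 1936.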
Arguments that the ACE cannot have been inspired by the universal machine
of 1936, since Turing did not mention his 1936 machine in ‘Proposed Electronic
Calculator’, are plainly non-starters. It must also be remembered that the NPL hired
Turing for the ACE project precisely because Womersley was familiar with, and had
been inspired by, ‘On Computable Numbers’.
Historian Thomas Haigh, who, like Vardi, is fighting in von Neumann’s corner,
weighs in on the side of the sceptics, attempting to raise doubt about whether
‘Turing was interested in building an actual computer in 1936’.190 He tries to
undermine Max Newman’s testimony on this point (almost as though he were
von Neumann’s lawyer), writing that the information ‘is sourced not to any diary
entry or letter from the 1930s but to the recollections of one of Turing’s former
lecturers made long after real computers had been built’.191 Haigh fails to inform his
readers that this former lecturer was none other than Newman, who (as explained
above) played a key role in the genesis of the universal Turing machine, and
for that matter also in the development of the first universal Turing machine in
electronic hardware (to the point of securing the transfer, from Bletchley Park to
the Manchester Computing Laboratory, of a truckload of electronic and mechanical
components from dismantled Colossi).192 Even in the midst of the attack on the
German codes, Newman was thinking about the universal Turing machine: when
Flowers was designing Colossus in 1943, Newman showed him Turing’s 1936
paper, with its key idea of storing symbolically-encoded instructions in memory.193
Donald Michie, a member of Newman’s wartime section at Bletchley Park, the
‘Newmanry’—home to nine Colossi by 1945—recollected that, in 1944–45, the
Newmanry’s mathematicians were ‘fully aware of the prospects for implementing
physical embodiments of the UTM [universal Turing machine] using vacuum-tube
technology’.194
In fact, Newman’s testimony about the foregoing point is rather detailed. He
explained in a tape-recorded interview that when he learned of Turing’s universal
189
Turing, ‘Notes on Memory’, p. 456.
190
Haigh, ‘Actually, Turing Did Not Invent the Computer’, p. 39.
191
Haigh, ‘Actually, Turing Did Not Invent the Computer’, p. 39.
192
Copeland et al., Colossus, p. 172; for a detailed account of Newman’s role in the Manchester
computer project, see Copeland, ‘The Manchester Computer: A Revised History’.
193
Flowers in interview with Copeland, July 1996.
194
Letter from Michie to Copeland, 14 July 1995.
195
Newman interviewed by Evans.
196
Newman, M. H. A. ‘Dr. A. M. Turing’, The Times, 16 June 1954, p. 10.
197
Davies interviewed by Evans.
198
Darwin, C. ‘Automatic Computing Engine (ACE)’, National Physical Laboratory, 17 April 1946
(National Archives, document reference DSIR 10/385); published in Copeland et al., Alan Turing’s
Automatic Computing Engine, pp. 53–57.
199
Turing, ‘Lecture on the Automatic Computing Engine’, pp. 378, 383.
200
Haigh, T. ‘“Stored Program Concept” Considered Harmful: History and Historiography’, in
Bonizzoni, P., Brattka, V., Löwe, B. (eds) The Nature of Computation. Logic, Algorithms, Applica-
tions (Berlin: Springer, 2013), p. 247. See also Haigh, T., Priestley, M., Rope, C. ‘Reconsidering
the Stored-Program Concept’, IEEE Annals of the History of Computing, vol. 36 (2014), pp. 4–17.
201
Haigh, ‘“Stored Program Concept” Considered Harmful’, pp. 243–244.
202
Haigh, ‘“Stored Program Concept” Considered Harmful’, p. 244.
203
Haigh, ‘“Stored Program Concept” Considered Harmful’, pp. 245, 247; Haigh, Priestley and
Rope, ‘Reconsidering the Stored-Program Concept’, p. 12.
204
Haigh, Priestley and Rope, ‘Reconsidering the Stored-Program Concept’, pp. 4, 14, 15.
205
Haigh, ‘“Stored Program Concept” Considered Harmful’, pp. 247, 249; Haigh, Priestley and
Rope, ‘Reconsidering the Stored-Program Concept’, p. 12.
206
Von Neumann, ‘First Draft of a Report on the EDVAC’, Sections 14–15 (pp. 236 ff in Stern,
From ENIAC to UNIVAC).
although on a much smaller scale), Turing explained that ‘the machine will
incorporate a large “Memory” for the storage of both data and instructions’.207 In
an early homage to the joys of stored programming, he remarked enthusiastically
that the ‘process of constructing instruction tables should be very fascinating’,
continuing: ‘There need be no real danger of it ever becoming a drudge, for any
processes that are quite mechanical may be turned over to the machine itself’.208
Others followed Turing’s usage. In their famous 1948 letter to Nature, announc-
ing the birth of the Manchester Baby, Williams and Kilburn explained that the
‘instruction table’ was held in the computer’s ‘store’.209 In the same letter they also
used the term ‘programme of instructions’, and emphasized that ‘the programme can
be changed without any mechanical or electro-mechanical circuit changes’. Their
selection of the terms ‘store’ and ‘programme’ proved to be a way of speaking that
many others would also find natural, and by 1953, usage was sufficiently settled for
Willis Ware (in a discussion of ENIAC and von Neumann’s subsequent Princeton
computer project) to be able to write simply: ‘what we now know as the “stored
program machine”’.210 As for the centrality of the stored-program concept, in their
writings from the period Williams and Kilburn repeatedly highlighted the concept’s
key position. For example, Kilburn said in 1949: ‘When a new instruction is required
from the table of instructions stored in the main storage tube, S, a “prepulse” initiates
the standard sequence of events’.211
Similar examples can be multiplied endlessly from books and articles published
on both sides of the Atlantic. That electronic computers could usefully edit their
own stored programs was basic common knowledge. The 1959 textbook Electronic
Digital Computers said:
[I]t is at once apparent that instructions may be operated upon by circuitry of the same
character as that used in processing numerical information. Thus, as the computation
progresses, the machine may be caused to modify certain instructions in the code that it
is following.212
We believe that the useful and well-known term ‘stored program’ is reasonably
clear and precise. Like many terms, however, it will certainly benefit from some
careful logical analysis, and this we now offer. Looking back over the early years
of the computer’s history, as outlined in Sect. 3, at least six different programming
207
‘The Turing-Wilkinson Lecture Series (1946–7)’, in Copeland et al., Alan Turing’s Electronic
Brain, p. 465.
208
Turing, ‘Proposed Electronic Calculator’, p. 392.
209
Williams, F. C., Kilburn, T. ‘Electronic Digital Computers’, Nature, vol. 162, no. 4117 (1948),
p. 487.
210
Ware, W. H. ‘The History and Development of the Electronic Computer Project at the Institute
for Advanced Study’, RAND Corporation report P-377, Santa Monica, 10 March 1953, p. 5.
211
Kilburn, T. ‘The Manchester University Digital Computing Machine’, in Williams, M. R.,
Campbell-Kelly, M. eds The Early British Computer Conferences (Los Angeles: Tomash, 1989),
p. 138.
212
Smith, C. V. L. Electronic Digital Computers (New York: McGraw Hill, 1959), p. 31.
paradigms can be distinguished. We denote these paradigms P1, P2, P3, P4, P5 and
P6. To borrow Turing’s onion-skin metaphor, the stored-program concept, like an
onion, consists of a number of layers or levels. P3, P4, P5 and P6 are four such
layers.
4.1 Paradigm P1
4.2 Paradigm P2
The main advantage of P2 over P1 is the ease of setting up for a new job and the
corresponding reduction in setup time. Instructions are expressed in the form of
numbers, or other symbols, and these numbers or symbols are stored in a memory
medium such as tape or punched cards. The processes used in writing and reading
the instructions are not of the same kind as those used in executing them (an echo
of Gandy’s statement, quoted above). Exemplars of this paradigm are Babbage’s
Analytical Engine, whose instructions were pre-punched into cards, and Aiken’s
ASCC and Zuse’s Z1, Z2, Z3 and Z4, whose instructions were pre-punched into
tape. This form of programming is read-only and the computer does not edit the
instructions as the program runs.
We shall call machines programmed in accordance with P1 or P2 ‘program-
controlled’ machines, in order to set off P1 and P2 from P3–P6.
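The following sketch (in Python, with invented instruction encodings) models a program-controlled machine in the sense of P2: the instructions arrive from an external, read-only tape, and the machinery that executes them has no means of writing to that tape, so iteration requires punching a block over again.

    # Toy P2 machine: instructions are read from an external tape by a
    # process that can execute them but never alter them.
    def run_program_controlled(tape, store):
        for op, a, b, dest in tape:          # one-way traffic from the tape
            if op == "add":
                store[dest] = store[a] + store[b]
            elif op == "mul":
                store[dest] = store[a] * store[b]
        return store

    # To repeat an operation the instruction is simply punched again:
    tape = [("add", 0, 1, 0), ("add", 0, 1, 0), ("mul", 0, 0, 2)]
    print(run_program_controlled(tape, {0: 1, 1: 2, 2: 0}))
    # -> {0: 5, 1: 2, 2: 25}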
213
Good, I. J., Michie, D., Timms, G. General Report on Tunny, Bletchley Park, 1945 (National
Archives/Public Record Office, Kew; document reference HW 25/4 (vol. 1), HW 25/5 (vol. 2)),
p. 331. A digital facsimile of General Report on Tunny is available in The Turing Archive for the
History of Computing <http://www.AlanTuring.net/tunny_report>.
214
Eckert, ‘The ENIAC’, p. 531.
4.3 Paradigm P3
Instructions are expressed in the form of numbers, or other symbols, and these
numbers or symbols are stored in a (relatively fast) read-only memory. As with
P2, the operations used in executing the instructions are not available to edit
the instructions, since the memory is read-only. Writing the instructions into the
memory may be done by hand—e.g. by means of a keyboard or hand-operated
setting switches. The main advantage of P3 over P2 is that with this form of memory,
unlike tape or cards, instructions or blocks of instructions can be read out repeatedly,
without any need to create multiple tokens of the instructions in the memory (so
avoiding copying blocks of cards or punching blocks of instructions over and again
on the tape).
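A minimal sketch of the advantage just described, again with an invented encoding: because the instructions sit in a read-only memory that can be read out by address, a single stored instruction can drive an iteration, with nothing ever written back into the instruction store.

    # Toy P3 machine: a read-only instruction store, addressable by a
    # program counter, so the same instruction can be read out repeatedly.
    rom = {
        0: ("add", 1),                  # x := x + 1
        1: ("jump_if_less", 10, 0),     # if x < 10, go back to line 0
        2: ("halt",),
    }

    def run_p3(rom):
        x, pc = 0, 0
        while rom[pc][0] != "halt":
            inst = rom[pc]              # read-out; rom is never written
            if inst[0] == "add":
                x, pc = x + inst[1], pc + 1
            elif inst[0] == "jump_if_less":
                pc = inst[2] if x < inst[1] else pc + 1
        return x

    print(run_p3(rom))    # -> 10, instruction 0 having been read ten times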
An exemplar of this paradigm is the modified form of ENIAC here called
‘ENIAC-1948’. In order to simplify the setup process, ENIAC was operated from
1948 with instructions stored in a ‘function table’, a large rack of switches mounted
on a trolley.215 (Eckert recorded that this hand-switched, read-only storage system,
essentially a resistance network, was based on ideas he learned from RCA’s Jan
Rajchman.216) The switch trolleys offered a slow but workable read-only memory
for coded instructions. Richard Clippinger from Aberdeen Ballistic Research
Laboratory (where ENIAC was transferred at the end of 1946) was responsible for
ENIAC’s transition to this new mode of operation. Clippinger explained:
I discovered a new way to program the ENIAC which would make it a lot more
convenient. : : : I became aware of the fact that one could get a pulse out of the function
table, and put it on the program trays, and use it to stimulate an action. This led me to invent
a way of storing instructions in the function table.217
It seems, though, that Clippinger had reinvented the wheel. Mauchly stated that
Eckert and he had previously worked out this idea.218 Referring to Clippinger’s
rediscovery of the idea, Mauchly said: ‘[P]eople have subsequently claimed that
the idea of such stored programs were [sic] quite novel to others who were at
Aberdeen’. Eckert emphasized the same point: ‘In Aberdeen, Dr. Clippinger later
“rediscovered” these uses of the function tables’.219 However, it can at least be said
that Clippinger reduced the idea to practice, with the assistance of Nick Metropolis,
215
Jennings said that ENIAC ran in this mode from April 1948, but Goldstine reported a later date:
‘on 16 September 1948 the new system ran on the ENIAC’. Jennings Bartik, Pioneer Programmer,
p. 120; Goldstine, The Computer from Pascal to von Neumann, p. 233.
216
Presper Eckert interviewed by Christopher Evans in 1975 (‘The Pioneers of Computing: an Oral
History of Computing’, Science Museum: London).
217
Richard Clippinger interviewed by Richard R. Mertz in 1970 (Computer Oral History
Collection, Archives Center, National Museum of American History, Smithsonian Institution,
Washington, D.C.), p. I-I-11.
218
John Mauchly interviewed by Christopher Evans in 1976 (‘The Pioneers of Computing: an Oral
History of Computing’, Science Museum: London).
219
Eckert, ‘The ENIAC’, p. 529.
Betty Jean Jennings, Adele Goldstine, and Klari von Neumann (von Neumann’s
wife).220
In the secondary literature, this idea of using the switch trolleys to store
instructions is usually credited to von Neumann himself (e.g. by Haigh in his
recent von Neumann-boosting work).221 But Clippinger related that ‘When Adele
Goldstine noted that I had evolved a new way to program the ENIAC, she quickly
passed the word along to von Neumann’.222 According to Clippinger and Jennings,
what von Neumann contributed, in the course of discussions with Clippinger and
others, was a more efficient format for the stored instructions, a one-address code
that allowed two instructions to be stored per line in the function table (replacing
Clippinger’s previous idea of using a three-address code).223
There is a strong tradition in the literature for calling this paradigm ‘stored
program’, and we follow that tradition here. Nick Metropolis and Jack Worlton
said that ENIAC-1948 was ‘the first computer to operate with a read-only stored
program’.224 Mauchly also referred to the switch trolley arrangement as involving
‘stored programs’ (in the above quotation). In passing, we note that we would
not object very strongly to the suggestions that P3 be termed ‘stored-program in the weak sense’ or ‘stored-program in the minimal sense’, or even described as transitional between P2 and genuine stored programming. However, the important point is that
the major difference between P3 and P4–P6 should be marked somehow; and so
long as the distinction is clearly enough drawn, it hardly matters in the end which
words are used. Later in this section, a systematic notation is developed that brings
out the minimal and somewhat anomalous status of P3.
4.4 Paradigm P4
220
Clippinger interviewed by Mertz, p. I-I-14; Metropolis, N., Worlton, J. ‘A Trilogy on Errors in
the History of Computing’, Annals of the History of Computing, vol. 2 (1980), pp. 49–59 (p. 54).
221
Haigh, ‘“Stored Program Concept” Considered Harmful’, p. 242. Also Goldstine, The Computer
from Pascal to von Neumann, p. 233.
222
Clippinger interviewed by Mertz, p. I-I-12. See also Metropolis and Worlton, ‘A Trilogy on
Errors in the History of Computing’, p. 54.
223
Clippinger interviewed by Mertz, pp. I-I-12, I-I-13; Jennings Bartik, Pioneer Programmer, pp.
11–12, 113.
224
Metropolis and Worlton, ‘A Trilogy on Errors in the History of Computing’, p. 54.
4.5 Paradigm P5
4.6 Paradigm P6
P6 is but a very short step away from P5. In P5, the editing of instructions is limited
to the insertion, manipulation and deletion of symbols that function as markers,
while in P6 the editing processes from time to time delete and insert symbols in
such a way as to produce a different instruction. P6 first appeared in the historical
record in 1945. ‘Proposed Electronic Calculator’ and ‘First Draft of a Report on
the EDVAC’ both describe this programming paradigm (although, as noted above,
von Neumann initially protected symbols other than address bits from editing, only
lifting this restriction in later publications).
In P6, the instructions that are edited may be those of the program that is actually
running, as with address modification on the fly, or the modifications may be made
to the instructions of some other program stored in memory. These different cases
are here referred to as ‘reflexive’ and ‘non-reflexive’ editing respectively. Section 3
225
Turing, ‘On Computable Numbers’, pp. 71–72 in The Essential Turing.
noted that in 1945 both Turing and Zuse farsightedly mentioned the idea of one
program editing another. The importance of non-reflexive editing of stored programs
was in fact widely recognized from computing’s earliest days (further disproof of
Haigh’s claim that the stored program concept had only a ‘fairly obscure’ existence
until the late 1970s). In one of computing’s earliest textbooks, published in 1953,
Alec Glennie wrote:
It has been found possible to use the Manchester machine to convert programmes written in
a notation very similar to simple algebraic notation into its own code. . . . [I]nstructions take the form of words which, when interpreted by the machine, cause the correct instructions in the machine’s code to be synthesized. . . .226
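A minimal sketch of ‘calculating calculating plans’, under encodings invented for illustration: one program synthesizes the instructions of another, which is precisely non-reflexive editing of a stored program.

    # Toy translator in the vein Glennie describes: algebraic notation in,
    # synthesized machine instructions out. All encodings are invented.
    def translate(statement):
        dest, rhs = [s.strip() for s in statement.split("=")]
        left, op, right = rhs.split()
        opcode = {"+": "add", "*": "mul"}[op]
        return [("load", left), (opcode, right), ("store", dest)]

    program_store = {}
    program_store["converted"] = translate("c = a + b")
    print(program_store["converted"])
    # -> [('load', 'a'), ('add', 'b'), ('store', 'c')]
    # Writing these instructions into the store for later execution is
    # non-reflexive editing: one program creates another program's orders.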
The key differences between P3, P4, P5 and P6 can be summarized as follows:
P3. This paradigm is: read-only stored instructions. In a notation designed to make
the differences between P3, P4, P5 and P6 transparent (‘S-notation’), P3 is S0, a
step beneath the unmarked case S.
P4. P4 is simply S: stored instructions potentially accessible to editing by the
processes used to execute the instructions. Actually making use of this potential
leads on to P5 and P6.
P5. This paradigm is: stored instructions/editing/no instruction change. P5 is S/E/= (‘=’ signifying no change of instruction). P5 includes editing markers in a way that involves applying numerical operations, such as addition and subtraction, to the markers. For example, having marked an instruction with the marker ‘1010’, the mechanism may repeatedly read out that instruction, each time subtracting 1 from the marker, stopping this process when the marker becomes 0000.
P6. This paradigm is: stored instructions/editing/instruction change. P6 is S/E/ΔI (‘ΔI’ signifying instruction change). Where it is helpful to distinguish between reflexive and non-reflexive editing, S/E/ΔI/SELF is written for the former and S/E/ΔI/OTHER for the latter. S/E is sometimes used to refer to any or all of S/E/=, S/E/ΔI/SELF and S/E/ΔI/OTHER. For ease of reference, the S-notation
just presented is summarized in Figure 1.228
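The difference between the P5 and P6 editing modes can be made concrete in a few lines of Python; the sketch below is our own, with invented encodings, and is not code from any of the historical machines.

    # Toy store: each cell holds an instruction plus an attached marker.
    store = {0: {"inst": ("add", 7), "marker": 0b1010}}    # marker = 10

    # S/E/= (paradigm P5): read the marked instruction out repeatedly,
    # subtracting 1 from the marker each time, stopping at 0000. The
    # instruction itself never changes.
    while store[0]["marker"] > 0:
        inst = store[0]["inst"]           # read out and (notionally) execute
        store[0]["marker"] -= 1

    # S/E/ΔI (paradigm P6): edit the instruction itself, e.g. step its
    # address, so that a genuinely different instruction results.
    op, addr = store[0]["inst"]
    store[0]["inst"] = (op, addr + 1)     # ('add', 7) has become ('add', 8)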
Important logical features of P2, P3, P4, P5, and P6, as well as an important
forensic principle, can be brought out by means of a short discussion of a memory
model we call ‘Eckert disk memory’. Eckert considered this form of internal
226
Glennie, A. E. ‘Programming for High-Speed Digital Calculating Machines’, ch. 5 of Bowden,
B. V. ed. Faster Than Thought (London: Sir Isaac Pitman & Sons, 1953), pp. 112–113.
227
Smith, Electronic Digital Computers, p. 31.
228
Copeland developed the S-notation in discussion with Diane Proudfoot.
Fig. 1 The S-notation:
S0: Read-only stored instructions.
S: Stored instructions potentially accessible to editing by the processes used to execute them.
S/E: Editing of the stored instructions actually occurs, in any of the following modes:
S/E/=: As in Turing’s 1936 paper, the stored instructions are edited, by the addition or removal of marker symbols, but the instructions are not changed (‘=’ signifying no change of instruction).
S/E/ΔI: The stored instructions are changed during editing (‘ΔI’ signifying instruction change). S/E/ΔI divides into two types:
S/E/ΔI/SELF: The program alters its own instructions.
S/E/ΔI/OTHER: The program alters the instructions of another stored program.
memory in 1944, but soon abandoned the idea in favour of his mercury delay
line, a technology he had had more experience with, and which, moreover, he
thought would offer faster access times.229 Eckert described the disk memory in
a typewritten note dated January 1944.230 He said ‘The concept of general internal
storage started in this memo’—but in fact that concept appeared in Zuse’s writings
in 1936, as Sect. 6 explains.231
Eckert disk memory consists of a single memory unit containing a number of
disks, mounted on a common electrically-driven rotating shaft (called the time
shaft). As Eckert described it, the memory unit formed part of a design for a
desk calculating machine, an improved and partly electronic form of ‘an ordinary
mechanical calculating machine’, he said.232 Some of the disks would have their
edges engraved with programming information expressed in a binary code. As the
disk rotated, the engravings would induce pulses in a coil mounted near the disk.
These pulses would initiate and control the operations required in the calculation
(addition, subtraction, multiplication, division). Eckert described this arrangement
as ‘similar to the tone generating mechanism used in some electric organs’.233
The engraved disks offered permanent storage. Other disks, made of magnetic
alloy, offered volatile storage. The edges of these disks were to be ‘capable of
being magnetized and demagnetized repeatedly and at high speed’.234 Pulses would
229
Eckert interviewed by Evans.
230
The note is included in Eckert, ‘The ENIAC’, pp. 537–539.
231
Eckert, ‘The ENIAC’, p. 531.
232
Eckert, ‘The ENIAC’, p. 537.
233
Eckert, ‘The ENIAC’, p. 537.
234
Eckert, ‘The ENIAC’, p. 537.
be written to the disk edges and subsequently read. (This idea of storing pulses
magnetically on disks originated with Perry Crawford.235) Eckert explained that
the magnetic disks were to be used to store not only numbers but also function
tables, such as sine tables and multiplication tables, and the numerical combinations
required for carrying out binary-decimal-binary conversion.
A 1945 report written by Eckert and Mauchly, titled ‘Automatic High Speed
Computing: A Progress Report on the EDVAC’, mentioned this 1944 disk memory.
Eckert and Mauchly said: ‘An important feature of this device was that operating
instructions and function tables would be stored in exactly the same sort of memory
device as that used for numbers’.236 This statement is true, but requires some careful
unpacking.
The storage of programming information on the engraved disks is an exem-
plification of paradigm P2. As with punched tape, the processes used in writing
the engraved instructions (etching) and reading them (induction) are manifestly
of a different kind from the processes used in executing the instructions. The
processes used in execution are incapable of editing the engraved instructions. It
is true that instructions are stored in the same memory device as numbers—i.e.
the disk memory unit itself—but this device has subcomponents and, as we shall
see, the subcomponents conform to different programming paradigms. Conceivably
Eckert’s intention was to control the rotation of the engraved disk in such a way that
instructions or blocks of instructions could be read out repeatedly, in which case the
engraved disk would conform to paradigm P3 rather than P2; however, Eckert did
not mention this possibility in the memo.
Eckert also considered using a magnetic alloy disk for instruction storage, saying
‘programming may be of the temporary type set up on alloy discs or of the
permanent type on etched discs’.237 Storing the programming information on an
alloy disk exemplifies paradigm P4, since the same read and write operations that
are used to fetch and store numbers during execution are also used to write the
binary instructions onto the disk and to read them for execution.
When the instructions are stored on one of these magnetic alloy disks, the
possibility arises of editing the instructions. For example, spaces could be left
between instructions where temporary markers could be written to indicate the
next instruction (P5); or, indeed, the instructions themselves might be edited, to
produce different instructions (P6). However, Eckert mentioned none of this; the
idea of editing instructions was simply absent from his memo. Of course, it is a
short step conceptually from simply storing the instructions on one of these alloy
disks to editing them. Nevertheless, it would be a gross mistake to say that Eckert’s
memo entertains either S/E/= or S/E/ΔI, since there is nothing in the document to
indicate that editing of instructions was envisaged. Eckert might have been aware
235
Eckert interviewed by Evans.
236
Eckert and Mauchly, ‘Automatic High Speed Computing: A Progress Report on the EDVAC’,
p. 2.
237
Eckert, ‘The ENIAC’, p. 538.
In 1948, in a lecture to the Royal Society of London, Max Newman defined general
purpose computers as ‘machines able without modification to carry out any of a
wide variety of computing jobs’.238 Zuse had undoubtedly conceived the idea of a
digital, binary, program-controlled general-purpose computer by 1936. In a patent
application dating from April of that year he wrote:
The present invention serves the purpose of automatically carrying out frequently recurring
calculations, of arbitrary length and arbitrary construction, by means of calculating
machinery, these calculations being composed of elementary calculating operations. . . . A prerequisite for each kind of calculation that is to be done is the preparation of a calculating plan . . . The calculating plan is recorded in a form that is suitable for the control of the individual devices, e.g. on a punched paper tape. The calculating plan is scanned by the machine, section by section, and provides the following details for each calculating operation: the identifying numbers of the storage cells containing the operands; the basic type of calculation; the identifying number of the cell storing the result. The calculating
plan’s fine detail [Angaben] automatically triggers the necessary operations.239
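The instruction shape this passage describes is, in modern terms, a three-address format: each line of the plan names the operand cells, the basic operation, and the result cell. The fragment below (our tuple encoding, purely illustrative) runs such a plan section by section:

    # Each plan line: (operand cell, operand cell, operation, result cell).
    plan = [
        (1, 2, "add", 3),    # cell 3 := cell 1 + cell 2
        (3, 3, "mul", 4),    # cell 4 := cell 3 * cell 3
    ]

    def run_plan(plan, cells):
        ops = {"add": lambda x, y: x + y, "mul": lambda x, y: x * y}
        for a, b, op, dest in plan:       # scanned section by section
            cells[dest] = ops[op](cells[a], cells[b])
        return cells

    print(run_plan(plan, {1: 2.5, 2: 4.0, 3: 0.0, 4: 0.0}))
    # -> {1: 2.5, 2: 4.0, 3: 6.5, 4: 42.25}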
238
Newman, M. H. A. ‘A Discussion on Computing Machines’, Proceedings of the Royal Society
of London, Series A, vol. 195 (1948), pp. 265–287 (p. 265).
239
Zuse, Patent Application Z23139, April 1936, pp. 1–2.
240
Zuse, Patent Application Z23139, April 1936, p. 8.
Is there any evidence that Zuse went further than this in his thinking, to reach the
point of formulating the concept of a universal computer, independently of Turing?
A patent application dating from 1941 contained a passage that might be taken
to suggest an affirmative answer. Our view, though, is that the correct answer is
negative. In the 1941 application, Zuse first explained that
New [to the art] is the combination of elements in such a way that orders are given to the
whole system from a scanner . . . The calculating unit A is connected with the storage
unit C so that the calculating unit’s results can be transferred to any arbitrary cell of the
storage unit, and also stored numbers can be transferred to the individual organs [Organe]
of the calculating unit.241 P is the plan unit together with the scanner. It is from here that
the calculating unit’s operating keys are controlled, as well as the selection unit, Pb, which
connects up the requisite storage cells with the calculating unit.242
This statement might be thought to parallel the Church-Turing thesis that every
effective calculation can be done by the universal machine, and so to embody an
independent notion of universality.244
We argue that this is not so. Zuse’s 1936 and 1941 patent applications described
an automatic numerical calculator—a very advanced form of relay-based desk cal-
culator, programmable and capable of floating-point calculations, and yet compact
enough and cheap enough to be stationed permanently beside the user, probably an
engineer. Zuse’s machine was, in a sense, a personal computer. Rojas said: ‘As early
as 1935 or so, Zuse started thinking about programmable mechanical calculators
specially designed for engineers. His vision at the time, and in the ensuing years,
was not of a large and bulky supercomputer but of a desktop calculating machine.’245
Zuse’s desktop calculator added, subtracted, multiplied, and divided. There is no
trace in Zuse’s 1936 and 1941 descriptions of his machine of Turing’s grand vision
of a universal machine able not only to calculate, but also to solve non-numerical
problems, to learn, and to reproduce the behaviour of a wide range of different forms
of physical apparatus. Not until his work on the Plankalkül did Zuse consider ‘steps
in the direction of symbolic calculations, general programs for relations, or graphs,
as we call it today, chess playing, and so on’.246
241
Von Neumann also spoke of ‘specialized organs’ for addition, multiplication, and so on; von
Neumann, ‘First Draft of a Report on the EDVAC’, p. 182 in Stern, From ENIAC to UNIVAC.
242
Zuse, Patent Application Z391, 1941, p. 4.
243
Zuse, Patent Application Z391, 1941, p. 4.
244
Copeland, B. J. ‘The Church-Turing Thesis’, in Zalta, E. (ed.) The Stanford Encyclopedia of
Philosophy, http://plato.stanford.edu/entries/church-turing/.
245
Rojas, R., Darius, F., Göktekin, C., Heyne, G. ‘The Reconstruction of Konrad Zuse’s Z3’, IEEE
Annals of the History of Computing, vol. 27 (2005), pp. 23–32 (p. 23).
246
Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 625.
247
Zuse, Patent Application Z391, 1941, p. 9.
248
Zuse, Patent Application Z23624, December 1936.
249
Zuse, Patent Application Z23139, April 1936, p. 12. Square rooting is not mentioned in Z23139
but is dealt with in Z391.
250
Zuse interviewed by Merzbach; Zuse, ‘Some Remarks on the History of Computing in
Germany’, p. 614; and Zuse makes the same claim in his interview with Evans.
251
Zuse interviewed by Merzbach.
252
Rojas, ‘How to Make Zuse’s Z3 a Universal Computer’, p. 53.
253
Rojas, ‘How to Make Zuse’s Z3 a Universal Computer’, p. 51.
254
Rojas, ‘How to Make Zuse’s Z3 a Universal Computer’, p. 53.
255
Wells, B. ‘A Universal Turing Machine Can Run on a Cluster of Colossi’, American Mathemat-
ical Society Abstracts, vol. 25 (2004), p. 441.
256
Wells, B. ‘Unwinding Performance and Power on Colossus, an Unconventional Computer’,
Natural Computing, vol. 10 (2011), pp. 1383–1405 (p. 1402); Hewitt, A. ‘Universal Computation
With Only 6 Rules’, http://forum.wolframscience.com/showthread.php?threadid=1432.
257
Harcke, L. J. ‘Number Cards and the Analytical Engine’, manuscript (Copeland is grateful to
Wells for sending him a copy of this unpublished proof). Wells found a lacuna in Harcke’s proof
but he believes this to be harmless; Wells says ‘A memory management limitation can be overcome
by seamlessly incorporating virtual memory, as Harcke agrees’.
258
Turing, A. M. ‘Computing Machinery and Intelligence’, p. 455 in The Essential Turing.
259
Minsky, M. L. Computation: Finite and Infinite Machines (Englewood Cliffs: Prentice-Hall,
1967), p. 258.
the light of Minsky’s theorem, it would have been rather curious had Z3 turned out
not to be universal.
It is an undeniable feature of all these universality proofs for early machines
that the proofs tell us nothing at all about these ancient computers as they actually
existed and were actually used. Despite Wells’ proof that the Colossi were universal,
Flowers’ actual machines were very narrowly task-specific. Jack Good related
that Colossus could not even be coaxed to carry out long multiplication. This
extreme narrowness was no defect of Colossus: long multiplication was simply not
needed for the cryptanalytical processing that Colossus was designed to do. Wells’
result, then, teaches us a general lesson: even the seemingly most unlikely devices
can sometimes be proved to be universal, notwithstanding the actual historical
limitations of the devices.
Similar remarks apply to Rojas’s result about Zuse’s machine. His proof tells us
nothing at all about the machine as it was used, and viewed, within its own historical
context, and nothing at all about its scope and limits as a practical computer.
Nevertheless, these results are certainly not without interest. As Wells put it:
Colossus was the first functioning electronic universal machine. The Analytical Engine,
Colossus, and Z3 were all universal. This has nothing to do with the intentions or writings
of Babbage, Flowers, or Zuse—it is an objective property of their creations.
260
Zuse, Patent Application Z23139, April 1936, p. 4.
261
Zuse, Patent Application Z391, 1941, p. 5.
262
Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 618.
263
Zuse, Patent Application Z391, 1941, p. 40.
264
Zuse, Patent Application Z23139, April 1936, p. 5.
265
Bromley, A. ‘Charles Babbage’s Analytical Engine, 1838’, Annals of the History of Computing,
vol. 4 (1982), pp. 196–217.
266
Zuse, Patent Application Z391, 1941, p. 3.
The calculating plan can be stored too, whereby the orders are fed to the control devices in
synchrony with the calculation.
Correspondingly, calculating plans can be stored in a fixed form if the machine is to carry
out the same calculation often.267
Beyond those two brief sentences, noting the possibility of storing the calculating
plan, Zuse said nothing more about the matter in his patent application. In particular,
he did not say where or how the plan would be stored. If storage was to be in
some unit logically equivalent to Eckert’s engraved disks, then the device that
Zuse was describing still conforms to paradigm P2. If, however, he meant that
the calculating plan, expressed in binary code, was to be placed in the addressable
relay store, together with the initial values, and whatever numbers were transferred
to the store from the calculating unit as the calculation progressed—and it seems
reasonable enough to interpret him in this way, since he mentions no other kind
of storage apart from the addressable store and the punched tape—then Zuse can
reasonably be taken to be suggesting S0 programming, albeit very briefly. The
1938 documents examined later in this section tend to support this interpretation.
If Zuse was indeed thinking of S0 programming, however, there is little in the 1936
document to indicate that his thinking went beyond S0 at this time (we discuss his
‘two connections between the storage unit and calculating unit’ below). In particular
there is no evidence to suggest that the further steps involved in S/E were in his mind.
Schmidhuber’s claim that in 1936, Zuse described ‘a “von Neumann architec-
ture” . . . with program and data in modifiable storage’ is immensely misleading.
What Zuse described in 1936, in great detail, was a P2 architecture. In two brief
sentences he mentioned in passing the possibility of storing the program, but gave
no architectural detail whatsoever. Moreover, far from offering further development
in his 1941 patent application of this idea of storing the program, it is not even
mentioned there. ‘The calculating plan has the form of punched paper tape’, Zuse
stated in 1941.268
It is not so surprising that in his 1941 design Zuse did not pursue his idea of
placing binary coded instructions in the relay store, nor implement the idea in Z3
or Z4. Any speed differential between the relay-based calculating unit and the tape
mechanism was not so great as to create an instruction bottleneck, and the internal
storage of instructions would use up precious cells of the relay store.
Nevertheless, Zuse did return to the program storage idea: half a dozen hand-
written pages in a 1938 workbook extended his cryptic suggestion of 1936. The
entries are dated 4 and 6 June and are a mixture of labeled diagrams and notes in
the Stolze-Schrey shorthand system. Zuse’s shorthand was transcribed into ordinary
German by the Gesellschaft für Mathematik und Datenverarbeitung (Society for
Mathematics and Data Processing) during 1977–1979.269 In these pages, Zuse
267. Zuse, Patent Application Z23139, April 1936, pp. 6–7.
268. Zuse, Patent Application Z391, 1941, p. 40.
269. Both workbook and transcription are in the Deutsches Museum Archiv, NL 207/01949. We are grateful to Matthias Röschner of the Deutsches Museum for information.
introduced a special ‘plan storage unit’ [Planspeicherwerk]. This was of the same
type as the main relay store. The plan storage unit was coupled with a ‘read-out unit’
[Abfühlwerk] and these two units are shown in his diagrams as replacing the punched
tape for supplying instructions to the calculating unit. As the notes progressed, Zuse
dropped the distinction between the plan storage unit and the main relay store, using
the main store for plan storage.
Zuse’s principal focus in these notes was to develop a ‘simpler way’ of dealing
with ‘plans with iterating or periodical parts’ (as mentioned previously, if a block of
instructions punched on tape needs to be iterated, the instructions are punched over
and over again, a clumsy arrangement). In addition to the plan storage unit, Zuse
introduced a number of supplementary units to assist with program management
and control. He named programs by a ‘plan identification number’ [Plan-Nummer],
and to the plan storage unit he added a ‘setting unit’ [Einstellwerk] and a ‘plan
selection unit’ [Planwählwerk]. These three units together with the read-out unit
made up the ‘plan unit’ [Planwerk]. When a plan identification number was written
into the setting unit, the plan selection unit would progressively select the lines of
the store containing the identified plan, allowing the read-out unit to deliver the
plan’s instructions one by one.
Next Zuse introduced the idea of a numbered ‘subplan’ and of a ‘self-controlling’
calculating unit. He distinguished between what he called ‘outer orders’ [äussere
Befehle] and ‘inner orders’ [innere Befehle]. He wrote: ‘Outer orders control
the work unit [Arbeitswerk], inner orders control the order unit [Befehlswerk]’.
The work unit consists of the selection unit, the storage unit, and the operation
unit (calculating unit). Unfortunately Zuse did not explain the term ‘order unit’
[Befehlswerk], which occurs only once in Zuse’s notes and is presumably different
from the plan unit [Planwerk].
In the default case, the outer orders are delivered sequentially from the plan storage unit. Inner orders, however, can trigger the operation of a ‘counting unit’ [Zählwerk]. There are two of these counting units, E₀ for the main plan and E₁ for the subplans. Zuse’s diagram shows the plan storage unit supplying inner orders to the counting units. These inner orders appear to be essentially parameters. The arrival of a parameter from the plan storage unit at one of the counting units toggles the unit into action. Zuse wrote: ‘Inner orders trigger changes in the relevant counting unit, whereby E₁ (subplan unit) carries on counting from the desired Nr and E₀ carries on counting from where it stopped last.’ While E₀ is toggled on, the plan selection unit causes the plan storage unit to stream outer orders from the main plan to the work unit; and while E₁ is toggled on, the plan storage unit streams outer orders from the subplan. The subplan is selected by writing the subplan identification number into the setting unit.
Finally, Zuse described an arrangement whereby the plan storage unit was able
to place the return address (the number of the instruction that will be read out of the
plan storage unit once a subplan has been completed) into a special storage unit for
instruction numbers. Zuse also explained that, by using additional counting units,
all of them controlled by parameters from the plan storage unit, multiple nesting of
subplans [Mehrfachverschachtelung] could be achieved.
Fig. 2 Transferring information [Angaben] from the ‘work unit’ to the ‘plan unit’. E is the setting
unit, Pl.W. is the plan selection unit, Pl Sp is the plan storage unit, W is the selection unit, Sp is the
storage unit, and Op is the calculating unit. Credit: Deutsches Museum (Nachlass Konrad Zuse,
Bestellnr. NL 207/0717)
270. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 616.
from the store to the work unit, and a second arrowed line leading back from the
work unit to the store.
How was the potential implicit in S—implicit in the bi-directional connection—
to be used? About this Zuse wrote tantalizingly little. All he said, in his June 6 entry
in the workbook, headed ‘Dependent and independent feedback’, was this:
Independent feedback = independent from the initial details (note: initial details [Ausgangsangaben] = input values [Eingabewerte]) serves only to enable the plan to be represented in a more compact form, and for this to unfold as it runs. Plans with independent feedback are still said to be rigid.
Dependent feedback = actual living plans. Effect [Einfluss] of the details [Angaben] calculated [errechneten], thus also of the initial details [Ausgangsangaben] on the sequence of events [Ablauf] in the computation [Rechnung].
These comments are altogether too cryptic for an interpreter to be certain what
Zuse meant. We offer the following speculative example of a ‘living plan’, which
causes the machine to select a subroutine for itself, on the basis of calculations
from input values, and also to calculate from the input how many times to run the
subroutine. We select this particular example in order to make the idea of a living
plan vivid. The example derives from Turing’s work on note-playing subroutines,
as described in his Programmers’ Handbook for Manchester Electronic Computer
Mark II.271 The example conforms well to Zuse’s 1976 description of his early
ideas: ‘instructions stored independently and special units for the handling of
addresses and subroutines’.272
Consider a simple piece of software for playing musical notes. The program takes as input values (a) a number n, and (b) the name of a musical note. In the case we will describe, the note is C₄, middle C (the subscript indicating the octave). These input values cause the computer to play the note of C₄, the number n functioning to tell the computer how long to hold the note. The computer plays the note by outputting a stream of suitable pulses (a stream of 1s separated by appropriate numbers of 0s) to an attached amplifier and loudspeaker. The actual details of how the sound is produced are not relevant to the example. The key point is that the note is generated by a subroutine consisting of a loop of instructions; running the loop continuously sends the correct stream of pulses to the loudspeaker. Each note-playing subroutine has a subplan identification number. n is the timing number, causing the computer to hold the specified note for n seconds.
Inputting the note name C₄ (in binary, of course) causes the calculating unit to calculate a subplan identification number from the binary input, by means of a fixed formula. The calculating unit then feeds this identification number back to the plan storage unit, and thence the number is transferred to the plan selection unit (by some
271. Turing, A. M. Programmers’ Handbook for Manchester Electronic Computer Mark II, Computing Machine Laboratory, University of Manchester, no date, circa 1950; a digital facsimile is in The Turing Archive for the History of Computing at www.AlanTuring.net/programmers_handbook. See also Copeland, B. J., Long, J. ‘Electronic Archaeology: Turing and the History of Computer Music’, in Floyd, J., Bokulich, A. (eds) Philosophical Explorations of the Legacy of Alan Turing (New York: Springer, 2016).
272. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 616.
mechanism that Zuse left unspecified). Next, the calculating unit divides n by the time taken to obey a single outer instruction (assumed to be a constant), and feeds this parameter m back to E₁, the counting unit for subplans. The parameter’s arrival has the effect of toggling E₁ on, with the result that the subplan whose identification number is in the plan selection unit starts to run. After m steps, control passes back to the main program, at which point the loudspeaker has played C₄ for n seconds.
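To make the control flow of this speculative example concrete before we assess it, here is a minimal Python simulation. Every name in it (the binary note encoding, the ‘fixed formula’, the timing constant, the subplan table) is our own illustrative invention, not Zuse’s; the sketch merely mirrors the sequence of feedbacks from the calculating unit to the plan unit described above.

```python
# Hypothetical simulation of the 'living plan' example above. The note
# encoding, the fixed formula and the timing constant are illustrative
# assumptions, not taken from Zuse's notes.

TIME_PER_INSTRUCTION = 0.001        # seconds per outer order (assumed constant)
NOTE_CODES = {"C4": 0b101000}       # binary-coded input value for middle C (assumed)

# Plan storage: subplan identification number -> loop body of outer orders.
SUBPLANS = {
    20: ["pulse", "silence", "silence"],   # crude pulse pattern standing in for C4
}

def fixed_formula(note_code):
    """Calculating unit: derive a subplan identification number from the input."""
    return note_code >> 1              # some fixed arithmetic on the binary input

def emit(order):
    pass                               # stand-in for sending a pulse (or gap) out

def run_living_plan(note_name, n_seconds):
    # 1. The calculating unit computes the subplan number from the input value
    #    and feeds it back to the plan unit (Zuse's 'dependent feedback').
    subplan_id = fixed_formula(NOTE_CODES[note_name])
    # 2. It divides n by the time per outer order, giving the parameter m,
    #    whose arrival at the counting unit E1 toggles E1 on.
    m = int(n_seconds / TIME_PER_INSTRUCTION)
    # 3. While E1 is on, the plan storage unit streams the subplan's outer
    #    orders to the work unit, m steps in all.
    body = SUBPLANS[subplan_id]
    for step in range(m):
        emit(body[step % len(body)])
    # 4. After m steps, control passes back to the main plan (not modelled).

run_living_plan("C4", n_seconds=2)
```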
This example of P4 programming, which involves no editing of instructions,
appears to us to illustrate the principles described by Zuse in his 1938 notes.
While it cannot be ruled out that Zuse might have had in mind editing an inner
instruction in the course of the unspecified steps leading to the delivery of the
subplan identification number and the parameter m to their respective units,
he certainly did not say so. Zuse in fact gave only the vaguest description of the inner order to E₁, saying merely that these orders were of the form: ‘“continue E₁” Nr . . . ’. More importantly, we find no trace of evidence in these notes that Zuse was thinking of S/E (either S/E/4I or S/E/D) in the case of outer orders.
7 Concluding Remarks
a. Wilkes, M. V. Memoirs of a Computer Pioneer (Cambridge, Mass.: MIT, 1985), p. 142; Lukoff, H. From Dits to Bits: A Personal History of the Electronic Computer (Portland, Oregon: Robotics Press, 1979), p. 84; Woodger, M. ‘Stored-program Electronic Digital Computers’ (handwritten note in the Woodger Papers, Science Museum, Kensington, London); McCann, D., Thorne, P. The Last of the First. CSIRAC: Australia’s First Computer (Melbourne University Press, 2000), p. 2.
Zuse then turned his back on the stored-program idea. His 1941 Z3 and 1945
Z4 used punched tape to control the computation and are exemplars of the P2
programming paradigm. Much later, reflecting on his early work, Zuse said he had
felt that implementing the Rück-Koppelung of his 1938 notes ‘could mean making
a contract with the devil’.273 ‘When you make feedback from the calculating unit
to the program, nobody is able to foresee what will happen’, he explained.274 Zuse
emphasized that Z1–Z4 had ‘no feedback in the program unit’.275
When Zuse began developing his Plankalkül language, however, he did return
to the stored-program concept. In 1945, the annus mirabilis of the stored-program
concept, Zuse wrote his five-chapter manuscript ‘Der Plankalkül’. Unpublished and
little-known, this document is the shadowy third member of a momentous trilogy of
reports from 1945, each developing the stored-program concept in its own way—the
others being, of course, von Neumann’s ‘First Draft of a Report on the EDVAC’ and
Turing’s ‘Proposed Electronic Calculator’.
In a section of ‘Der Plankalkül’ titled ‘Repetition plans’, Zuse said:
[T]he orders contained in the repetition part [of the plan] are subject to continuous changes
that result from the repetitions themselves.276
273. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 616.
274. Zuse interviewed by Evans.
275. Zuse, ‘Some Remarks on the History of Computing in Germany’, p. 616.
276. Zuse, ‘Der Plankalkül’ (manuscript), p. 32.
277. Zuse, ‘Der Plankalkül’ (manuscript), p. 32.
278. Zuse, ‘Der Plankalkül’ (manuscript), pp. 23, 24, 25.
Free calculating plans, seemingly the same as or at any rate similar to the
‘living plans’ of Zuse’s 1938 workbook, are those in which ‘the actual variables
[eigentlichen Variablen] have an effect on the course of the calculation’.279 In ‘Der
Plankalkül’, Zuse described the automatic calculation of free calculating plans,
noting that this process might determine ‘only parts’ of the plan, leaving other
details to be filled in by the machine as the plan actually runs.280 The Plankalkül
and Zuse’s early ideas about compilation are topics for a more detailed discussion
in a further article.
Turing also described S/E/4I/OTHER in 1945 and, of course, Turing and von
Neumann both described S/E/4I/SELF in that year (in their respective reports
on the ACE and the EDVAC). Von Neumann restricted the scope of editing to
the instruction’s address bits, whereas Turing described unrestricted editing of
instructions. It was not until his later papers, from 1946 onwards, that unrestricted
S/E/4I/SELF appeared in von Neumann’s work.
Jumping back a year in order to mention developments at the Moore School:
in 1944 Eckert described a disk memory device involving both S0 and S, but
there is no evidence that he was considering S/E/4I or even S/E/D in connection
with this device. Around the same time, he and Mauchly conceived the idea of
storing instructions in the ENIAC’s function tables, an idea later reduced to practice
at the Ballistic Research Laboratory, by Clippinger and others, in 1948. Storing
instructions in ENIAC-1948’s read-only function tables was an example of the P3
programming paradigm, achieving S0 but not S.
S and S/E/4I/SELF were first implemented at Manchester, in Max Newman’s
Computing Machine Laboratory. S was achieved in June 1948, but initially the Baby
was used without instruction editing, as a surviving laboratory notebook shows.281
During the summer, though, Kilburn added a kind of halfway house, a relative
control transfer, achieved by adding a number from the store to the address of the
next instruction (or subtracting the number from the address), this address being
stored in a control tube.282 Soon the editing of instructions was considered useful
enough for Williams and Kilburn to add special hardware for it. The new hardware
was known simply as the B-tube, the accumulator already being named the A-tube
and the control tube the C-tube. In modern terms, the B-tube was an index register.
Kilburn explained:
Instructions can, of course, be modified by the normal processes . . . in the same way as numbers, but this is often inconvenient and wasteful of time and storage space. Therefore each instruction . . . is preceded by a single digit called the b digit. If b = 0, the content of
279. Zuse, ‘Der Plankalkül’ (manuscript), p. 25.
280. Zuse, ‘Der Plankalkül’ (manuscript), p. 31.
281. Tootill, ‘Digital Computer—Notes on Design & Operation’.
282. Kilburn interviewed by Copeland, July 1997.
line B0 of B (normally zero) is added into the present instruction . . . before this instruction is used. If b = 1, the content of line B1 of B is used in the same manner.283
The Baby. Tom Kilburn is on the left, Freddie Williams on the right. Credit: University of Manchester School of Computer Science
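As a gloss on Kilburn’s description above, the following Python sketch models the b-digit mechanism: the content of the selected B-line is added into an instruction just before it is obeyed, so a single stored instruction can step through successive store addresses without itself being rewritten. The three-field instruction format and the toy program are our own assumptions, not the machine’s actual order code.

```python
# Illustrative model of the B-tube mechanism: if b = 0, line B0 (normally
# zero) is added into the instruction before it is used; if b = 1, line B1
# is added. The (opcode, address, b) format is an assumption made for the
# sketch only.

store = [5, 7, 11, 0]      # three numbers to sum, plus a cell for the result
B = [0, 0]                 # B0 is kept at zero; B1 acts as the index
accumulator = 0

program = [
    ("ADD", 0, 1),         # add store[0 + B1] into the accumulator (b = 1)
]

def obey(instr):
    global accumulator
    op, address, b = instr
    address += B[b]        # the selected B-line is added in before the
                           # instruction is used; B0 leaves it unchanged
    if op == "ADD":
        accumulator += store[address]

# Obey the single ADD instruction three times, stepping B1 each pass,
# instead of storing three separately edited ADD instructions:
for i in range(3):
    B[1] = i
    obey(program[0])

store[3] = accumulator
print(store[3])            # 23 = 5 + 7 + 11
```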
283. Kilburn, T. ‘The University of Manchester Universal High-Speed Digital Computing Machine’, Nature, vol. 164, no. 4173 (1949), pp. 684–7 (p. 687).
284. Tootill, ‘Digital Computer—Notes on Design & Operation’, list of the machine’s instructions dated 13/10/48. There are also two undated references to instructions involving the B-tube a few pages earlier. Kilburn included the same B-tube instructions in his ‘Code for Charge Storage Computer’, a list of the machine’s instructions that is dated 30 November 1948. (Copeland is grateful to Simon Lavington for sending him a copy of Kilburn’s document, seemingly a lecture handout. Lavington included a retype of the document in his ‘Computer Development at Manchester University’, in Metropolis, Howlett and Rota, A History of Computing in the Twentieth Century, p. 439.)
285. Kilburn interviewed by Copeland, July 1997.
that were not B-modified unaffected’.286 A young assistant, Dai Edwards, was given
the job of developing the new piece of hardware.287
The B-tube was probably being used to run engineering test programs by
March 1949 and was ready for routine use in April 1949.288 It was part of an
extensive upgrade, completed in April, that transformed the computer from a small
demonstration model into a usable machine: the upgrade also included a magnetic
drum memory, improved CRT storage, a hardware multiplier, and an increase in
word length to 40 bits.289 Manchester was well ahead of the field, with a number
of other pioneering computer projects in the UK, US and Australia succeeding in
running stored programs later in 1949 (see the timeline in Fig. 3).
Was the less convenient method of instruction modification that Kilburn
mentioned—not involving the B-tube—ever actually implemented during the
period June 1948–April 1949? This is a tantalizing and important question, since
the first implementation of instruction editing was a key moment in the history of
computing. The method was important enough for Turing to discuss it in detail
in his Programmers’ Handbook: he described ‘the formation of an instruction
in the accumulator by addition, the copying of this instruction into the list of
instructions, and the subsequent obeying of it’ (adding, ‘This is a rather clumsy
process’).290 Turing gave several examples of small programs using this method
of instruction editing, in sections of the Handbook describing what he called the
‘reduced machine’. The ‘reduced machine’ is by and large the Manchester computer
as it existed before April 1949.
Thus it is certainly possible that Manchester’s first use of instruction edit-
ing preceded the arrival of the B-tube. But if so, no record appears to have
survived. Therefore the question of precisely when instruction editing was first
implemented—one of computer science’s most historic dates—remains open.
Acknowledgments Copeland is grateful to the following institutions for supporting this research:
University of Canterbury, New Zealand; University of Queensland, Australia; Federal Insti-
tute of Technology (ETH), Zurich, Switzerland; and Det Informationsvidenskabelige Akademi,
286. Williams, F. C., Kilburn, T. ‘The University of Manchester Computing Machine’, in Bowden, Faster Than Thought, p. 122.
287. Lavington, S. H. A History of Manchester Computers (Manchester: NCC Publications 1975), p. 12.
288. Tootill, ‘Digital Computer—Notes on Design & Operation’, note dated 27/3/49 inserted within an older note dated 15/7/48; Dai Edwards in interview with Simon Lavington, 6 May 2015. (Copeland is grateful to Lavington for making the transcript of this interview available.)
289. Williams and Kilburn, ‘The University of Manchester Computing Machine’, pp. 121–122; Edwards in interview with Lavington, 6 May 2015; Kilburn, T., Tootill, G. C., Edwards, D. B. G., Pollard, B. W. ‘Digital Computers at Manchester University’, Proceedings of the Institution of Electrical Engineers, vol. 77 (1953), pp. 487–500.
290. Turing, Programmers’ Handbook for Manchester Electronic Computer Mark II, pp. 19–20. A typo has been corrected: in Turing’s original, the second occurrence of ‘instruction’ was typed ‘instructions’.
Copenhagen University, Denmark. Copeland and Sommaruga are grateful to Herbert Bruderer,
Brian Carpenter, Martin Davis, Bob Doran, Simon Lavington, Teresa Numerico, Diane Proudfoot,
and Benjamin Wells for helpful comments on a draft of this chapter, and to the Deutsches Museum
for providing digital facsimiles of various original documents by Zuse.
Part II
Generalizing Turing Computability Theory
Theses for Computation and Recursion
on Concrete and Abstract Structures
Solomon Feferman
Abstract The main aim of this article is to examine proposed theses for compu-
tation and recursion on concrete and abstract structures. What is generally referred
to as Church’s Thesis or the Church-Turing Thesis (abbreviated CT here) must be
restricted to concrete structures whose objects are finite symbolic configurations
of one sort or another. Informal and principled arguments for CT on concrete
structures are reviewed. Next, it is argued that proposed generalizations of notions
of computation to abstract structures must be considered instead under the general
notion of algorithm. However, there is no clear general thesis in sight for that
comparable to CT, though there are certain wide classes of algorithms for which
plausible theses can be stated. The article concludes with a proposed thesis RT for
recursion on abstract structures.
1 Introduction
The concepts of recursion and computation were closely intertwined from the begin-
ning of the efforts early in the 1930s to obtain a conceptual analysis of the informal
notion of effective calculability. I provide a review of those efforts in Sect. 2 as
background to the remainder of this article, but I have nothing new to add here
to the extensive historical and analytical literature.1 It is generally agreed that the
conceptual analysis of effective calculability was first provided most convincingly
by Turing [3]. Not long before that Church [4] had proposed identifying effective
calculability with the Herbrand-Gödel notion of general recursiveness, soon enough
proved equivalent to Turing computability among other suggested explications.
1. Gandy [1] is an excellent introductory source to those developments; cf. also [2].
S. Feferman, Department of Mathematics, Stanford University, Stanford, CA, USA
e-mail: feferman@stanford.edu
2. Soare in his articles [2, 5] has justifiably made considerable efforts to reconfigure the terminology of the subject so as to emphasize its roots in the notion of computation rather than recursion, for example to write ‘c.e.’ for ‘computably enumerable’ in place of ‘r.e.’ for ‘recursively enumerable’, but they do not seem to have overcome the weight of tradition.
3. Kleene’s wording derives from Turing [3, p. 249]: “No attempt has yet been made to show that the ‘computable’ numbers include all numbers which would naturally be regarded as computable.” [Italics mine]
4. Cf. Copeland [7] for an introductory article on the Church-Turing thesis.
analysis (aka explication of informal concepts) both settled and unsettled. Finally,
it may be hoped that satisfactory progress on these questions would help lead to
the same on concepts of feasibility in these various areas, concepts that are wholly
untouched here.
5. For convenience I shall only refer to Sieg [18] in the following, though there is considerable overlap with Dawson [17].
Among examples, Herbrand lists the recursion equations for addition and mul-
tiplication, Gödel’s schemata for primitive recursion, Ackermann’s non-primitive
recursive function, and functions obtained by diagonalization. Gödel’s reply, though friendly, was critical on a number of points, among them Herbrand’s proposal that finitism could be encompassed in a single formal system.6
In the notes for his 1934 lectures on the incompleteness theorems at the IAS, Gödel returned to Herbrand’s proposal in the last section under the heading “general recursive functions.” He there recast Herbrand’s suggestion to define a new function φ in terms of “known” functions ψ₁, . . . , ψₖ and possibly φ itself. The main requirement is now taken to be that for each set of natural numbers k₁, . . . , kₗ there is one and only one m such that φ(k₁, . . . , kₗ) = m is a derived equation, where the rules of inference for the notion of derivation involved are simply taken to be those of the equation calculus, i.e. substitution of numerals for variables and substitution of equals for equals (cf. [16], p. 26).
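For illustration (the derivation is ours, though recursion equations for addition are among Herbrand’s own examples above), a Herbrand-Gödel style system E for addition, with a derivation of 1 + 1 = 2 in the equation calculus, looks as follows:

```latex
% Recursion equations E for addition, with phi the function letter being
% defined and S the successor symbol:
\begin{align*}
\varphi(x, 0)    &= x,\\
\varphi(x, S(y)) &= S(\varphi(x, y)).
\end{align*}
% Derivation of phi(S(0), S(0)) = S(S(0)), i.e. 1 + 1 = 2, by the two rules:
%  1. phi(S(0), S(0)) = S(phi(S(0), 0))  [numerals substituted for x, y in eq. 2]
%  2. phi(S(0), 0) = S(0)                [numeral substituted for x in eq. 1]
%  3. phi(S(0), S(0)) = S(S(0))          [equals substituted for equals in 1, by 2]
```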
Before turning to this bridge to the conceptual analysis of effective calculability, note that there are two features of Gödel’s definition of general recursive function that make it ineffective in itself, namely (given “known” effectively calculable functions ψ₁, . . . , ψₖ) one can’t decide whether or not a given system of equations E has (i) the uniqueness property, namely that for each sequence of arguments k₁, . . . , kₗ there is at most one m such that φ(k₁, . . . , kₗ) = m is derivable from E, nor that E has (ii) the existence property, namely that there is at least one such m for each such sequence of arguments. In Kleene’s treatments of general recursion elaborating the Herbrand-Gödel idea, beginning in 1936 and ending in Kleene [6] Ch. XI, one may use any system of equations E to determine a partial recursive function, simply by taking φ(k₁, . . . , kₗ)—when defined—to be the value m given by the least derivation (in a suitable primitive recursive coding) ending in some value for the given arguments. It is of course still undecidable whether the resulting function is total. It finally remained for Kleene to give the most satisfactory general formulation of effective recursion on the natural numbers via his Recursion Theorem (op. cit., p. 348). But this requires the notion of partial recursive functional to which we shall return in Sect. 5 below.
Now, not only did Gödel in 1934 misremember the details of Herbrand’s 1931
formulation, he made a crucial conceptual shift there from the question of char-
acterizing the totality of finitistically acceptable functions to that of characterizing
the totality of functions given by a “finite computation” procedure, despite his clear
6. Gödel was not long after to change his mind about that; cf. Gödel [70].
reservations both about the possibility of such a characterization and about general recursion in particular as a prime candidate. Church was bolder:
We now define [sic!] the notion . . . of an effectively calculable function of positive integers by identifying it with the notion of recursive function of positive integers (or of a λ-definable function of positive integers). [4, p. 356]
7. Church [4, pp. 100–102] had another “step-by-step” argument for the Thesis, but there is a semi-circularity involved; cf. Shagrir [23, p. 224].
to be the primary one. Only later, in his June 1964 Postscript to Gödel [16], does he
address the concept on its own terms as follows:
Turing’s work gives an analysis of the concept of “mechanical procedure” (alias “algo-
rithm” or “computation procedure” or “finite combinatorial procedure”). This concept is
shown to be equivalent with that of a “Turing machine”. (cf. [28], p. 370)
As described in Sect. 1 above, it was Kleene [6, pp. 300 et seq.] who led one to talk of Church’s Thesis, Turing’s Thesis, and then, ambiguously, of the Church-Turing Thesis for the characterization through these equivalences of the effectively calculable functions. In another influential text, Rogers [29, pp. 18 ff.] took the argument by confluence as one of the basic pins for CT and used that to justify informal proofs “by Church’s Thesis”.
Beginning with work of Kolmogorov in 1953 (cf. [30]), efforts were made to move
beyond the argument by confluence, among others, to more principled arguments
for CT. The first significant step in that direction was made by Gandy [8] which,
together with its successors to be described below, may be construed as following a
more axiomatic approach, in the sense that one (cl)aims to isolate basic properties
of the informal notion of effective calculability or computation and proves that
any function computed according to those conditions is computable by a Turing
machine. As one sees by closer inspection, all the axioms used by the work in
question take the notion of finiteness for granted, hence may be construed formally
as carried out within weak second-order logic, but otherwise there is considerable
difference as to how they are formulated.
To begin with, Gandy asserts (as he did again in Gandy [1]) that Turing outlined
a proof of the following in his famous paper [3]:
Thesis T What can be calculated by an abstract human being working in a routine
way is computable [by a Turing machine].
Actually, such a statement, being non-mathematical, can’t be proved, nor did
Turing claim to have done so. Rather, what he did do in Sect. 9 of the paper was
to present three informal arguments as to why his analysis catches everything that
“would naturally be regarded as computable;” it is the first of these arguments that
leads most directly to the concept of a Turing machine. The argument in question
sets out five informal restrictive conditions on the idealized work space and possible
actions within it of a human computer. As recast by Sieg [31, 32], in order to
proceed to a theorem there are two steps involved: first, Turing’s conditions are
reformulated more generally in terms of boundedness, locality and determinacy
conditions (still at the informal level), and, secondly, those conditions are given
a precise mathematical expression for which it can be shown that any function
satisfying them is computable by a Turing machine. (Sieg calls the second part a
representation theorem.)
By contrast to Thesis T, the aim of Gandy’s 1980 article was to argue for the
following:
Thesis M What can be calculated by a machine is computable [by a Turing
machine].
Again, there is no proof of Thesis M, but rather a two part procedure that
eventuates in a definite theorem. The first part consists of Gandy’s informal
restrictions on what constitute “mechanical devices” [8, pp. 125–126], namely, that
he excludes from consideration devices that are “essentially analogue machines”
and that “[t]he only physical presuppositions made about mechanical devices . . .
are that there is a lower bound on the linear dimensions of every atomic part of
the device and that there is an upper bound (the velocity of light) on the speed of
propagation of changes.” Furthermore, Gandy assumes that the calculations by a
mechanical device are describable in discrete terms and that the behavior of the
device is deterministic, though calculations may be carried out in parallel. These
restrictions then lead to the formulation of four mathematically precise principles
I-IV that express the informal conditions on what constitutes a machine in his
sense. The main result of Gandy (1980) is a theorem to the effect that any function
calculated by a mechanical device satisfying principles I-IV is computable on a
Turing machine.
By the way, in the informal part of his argument for Thesis M, Gandy enlarges
on the discreteness aspect in a way that is particularly useful for our purposes below.
Our use of the term “discrete” presupposes that each state of the machine can be adequately
described in finite terms. . . . [W]e want this description to reflect the actual, concrete,
structure of the device in a given state. On the other hand, we want the form of the
description to be sufficiently abstract to apply uniformly to mechanical, electrical or merely
notional devices. We have chosen to use hereditarily finite sets; other forms of description
might be equally acceptable. We suppose that the labels are chosen for the various parts
of the machine—e.g., for the teeth of cog wheels, for a transistor and its electrodes, for
the beads and wires of an abacus. Labels may also be used for positions in space (e.g., for
squares of the tape of a Turing machine) and for physical attributes (e.g., the color of a bead,
the state of a transistor, the symbol on a square). ([8] p. 127, italics mine)
In other words, just as with Thesis T, one is working throughout with finite
symbolic configurations.
Gandy’s case for Thesis M was substantially recast by Sieg [31, 32] much as
he had done for Thesis T. The first part of that work is again that in carrying out
effective calculations, the machine is limited by general boundedness, locality and
determinacy conditions, but those are now widened to allow acting on given finite
configurations in parallel and then reassembling the results into the next configura-
tion. That led Sieg to a statement of new simpler precise principles on mechanisms
as certain kinds of discrete dynamical systems for which a representation theorem is
proved, i.e. for which it is shown that whatever satisfies those principles “computes”
only Turing computable functions.
8. Dershowitz and Gurevich [33, p. 305] state that the aim of their work is “to provide a small number of convincing postulates in favor of Church’s Thesis”; in that same article, pp. 339–342, they provide a comprehensive survey of the literature sharing that aim, going back to work of Kolmogorov in [30].
9. Cf. the Postscriptum to Sieg [35] for a detailed critique of this work.
10. For a systematic treatment of computability on concrete structures see Tucker and Zucker [36].
11. The literature on generalized recursion theory is very extensive and could use an up-to-date survey. Lacking that, some initial sources can be found in the bibliographies in the works of Barwise [37], Fenstad [38], Sacks [39], and Tucker and Zucker [11].
such; and, finally, (4) test one of the Rᵢ on specified registers and go to designated other instructions depending on the value of the test (“if . . . then . . . else”). The computation terminates only if the instructions of the form (3) and (4) are defined at each stage where they are called and one eventually lands in the terminal instruction. In that case the content of register r₀ is the value of f(x₁, . . . , xₙ). An n-ary relation R is decidable by a fap if its characteristic function is computable by such a procedure. The class of fap computable partial functions on A is denoted by FAP(A). Friedman [9] also gives an extensionally equivalent formulation of computability on A in terms of generalized Turing machines, as well as one in terms of what he calls effective definitional schemata given by an effective infinite enumeration of definition by cases.
For the structure N = (N; 0, Sc, Pd, =), where N is the set of natural numbers and Sc and Pd are respectively the successor and predecessor operations (taking Pd(0) = 0), FAP(N) is equal to the class of partial recursive functions. For general structures A, Friedman [9] also introduced the notion of finite algorithmic procedure with counting, in which certain registers are reserved for natural numbers and one can perform the operations and tests on the contents of those registers that go with the structure N. Then FAPC(A) is used to denote the partial functions on A determined in this way.
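A minimal sketch may help fix ideas. The Python fragment below interprets faps over an arbitrary structure presented as dictionaries of basic functions and relations; the instruction encoding and the names are our own assumptions, chosen only to mirror the apply and test-and-branch repertoire just described, and the example program computes addition over the structure N.

```python
# Sketch of a fap interpreter over an arbitrary structure A. Only the
# repertoire (apply a basic function to registers; test a basic relation
# and branch) follows the text; the encoding is an illustrative assumption.

def run_fap(program, functions, relations, inputs, n_registers):
    regs = [None] * n_registers
    regs[1:len(inputs) + 1] = inputs           # x1, ..., xn go into r1, ..., rn
    pc = 0
    while pc < len(program):                   # running off the end = terminal
        instr = program[pc]
        if instr[0] == "apply":                # r[target] := f(r[i], ...)
            _, f, args, target = instr
            vals = [regs[i] for i in args]
            if any(v is None for v in vals):
                return None                    # undefined where called: diverge
            regs[target] = functions[f](*vals)
            pc += 1
        elif instr[0] == "test":               # if R(r[i], ...) goto t else e
            _, R, args, t, e = instr
            pc = t if relations[R](*(regs[i] for i in args)) else e
    return regs[0]                             # the value of f(x1, ..., xn)

# Example over N = (N; 0, Sc, Pd, =): compute x1 + x2 by decrementing r2
# while incrementing r1.
functions = {"Sc": lambda x: x + 1, "Pd": lambda x: max(x - 1, 0),
             "id": lambda x: x}
relations = {"=0": lambda x: x == 0}
program = [
    ("test", "=0", [2], 4, 1),     # 0: if r2 = 0 go to the exit code at 4
    ("apply", "Sc", [1], 1),       # 1: r1 := Sc(r1)
    ("apply", "Pd", [2], 2),       # 2: r2 := Pd(r2)
    ("test", "=0", [2], 0, 0),     # 3: jump back to 0 either way
    ("apply", "id", [1], 0),       # 4: r0 := r1
]
print(run_fap(program, functions, relations, (2, 3), 3))   # prints 5
```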
The notion of finite algorithmic procedure is directly generalized to many-sorted structures A = (A₁, . . . , Aₙ, c₁, . . . , cⱼ, f₁, . . . , fₖ, R₁, . . . , Rₘ); each register comes with a sort index limiting which elements can be admitted as its contents. In particular, FAPC(A) can be identified with FAP(A, N) where (A, N) denotes the structure A augmented by that for N. A further extension of Friedman’s notions was made by Moldestad et al. [43, 44], using stack registers that may contain finite sequences of elements of any one of the basic domains Aᵢ, including the empty sequence. The basic operations for such a register are to remove the top element of a stack (pop) and to add to the contents of one of the registers of type Aᵢ (push). This leads to the notion of what is computable by finite algorithmic procedures with stacks, FAPS(A), where we take the structure A to contain with each domain Aᵢ the domain Aᵢ* of all finite sequences of elements of Aᵢ, and with operations corresponding to pop and push.
pop and push. If we want to be able to calculate the length n of a stack and the jth
element of a stack, we need also to have the structure N included. This leads to the
notion of finite algorithmic procedure with stacks and counting, whose computable
partial functions are denoted by FAPCS(A). In the case of the structure N, by any
one of the usual primitive recursive codings of finite sequences of natural numbers,
we have

FAP(N) = FAPC(N) = FAPS(N) = FAPCS(N),

while for an arbitrary structure A there is in general only the chain of inclusions FAP(A) ⊆ FAPC(A) ⊆ FAPCS(A) and FAP(A) ⊆ FAPS(A) ⊆ FAPCS(A).
It is proved in Moldestad et al. [43] that for each of these inclusions there is a
structure A which makes that inclusion strict.
An alternative approach to computability over arbitrary algebraic structures is provided in a usefully detailed expository piece, Tucker and Zucker [11], that goes back to their joint work [10]; this uses definition by schemata rather than (so-called) machines. A standard structure A is one that includes the structure B with domain {t, f} and the basic Boolean functions as its operations. The Tucker-Zucker notion of computability for standard algebras is given by procedure statements S: these include explicit definition, and are closed under composition, and under statements of the form if b then S₁ else S₂, and while b do S, where ‘b’ is a Boolean term. The set of partial functions computable on A by means of these schemata is denoted by While(A). Then to deal with computability with counting, Tucker and Zucker simply expand the algebra A to the algebra (A, N). To incorporate finite sequences for each domain Aᵢ, they make a further expansion of that to suitable A*. The notions of computability Whileᴺ(A) and While*(A) over A are given simply by While(A, N) and While(A*), respectively. The following result is stated in Tucker and Zucker [11, p. 487] for any standard algebra A:
confluence of notions. Then the authors say that “[c]ompelling motivation clearly
would be required to justify yet a new model of computation” (op. cit., p. 22). And
that is claimed to come from a need to give foundations to the subject of numerical
analysis:
A major obstacle to reconciling scientific computation and computer science is the present
view of the machine, that is, the digital computer. As long as the computer is seen as a finite
or discrete object, it will be difficult to systematize numerical analysis. We believe that
the Turing machine as a foundation for real number algorithms can only obscure concepts.
Toward resolving the problem we have posed, we are led to expanding the theoretical model
of the machine to allow real numbers as inputs. (op. cit., p. 23)
following.12 The authors begin with the statement of a “naïve” generalized CT for
abstract algebras, namely that “[t]he functions that are ‘effectively computable’ on
a many-sorted algebra A are precisely the functions that are While* computable on
A.” This is immediately qualified as follows:
[T]he idea of effective calculability is complicated, as it is made up from many philosophical
and mathematical ideas about the nature of finite computation with finite or concrete
elements. For example, its analysis raises questions about the mechanical representation and
manipulation of finite symbols; about the equivalence of data representations; and about the
formalization of constituent concepts such as algorithm; deterministic procedure; mechan-
ical procedure; computer program; programming language; formal system; machine; and
the functions definable by these entities. . . . However, only some of these constituent
concepts can be reinterpreted or generalized to work in an abstract setting; and hence the
general concept, and term, of ‘effective computability’ does not belong in a generalization
of the Church-Turing thesis. In addition, since finite computation on finite data is truly a
fundamental phenomenon, it is appropriate to preserve the term with its established special
meaning. (Tucker and Zucker [11, p. 494], italics in the original.)
In other words, these authors and I are in complete agreement with the view
asserted at the end of the preceding section. Nevertheless, they go on to formulate
three versions of a generalized CT not using the notion of effective calculability,
corresponding to the three perspectives of algebra, programming languages, and
specification on data types; only the first of these is relevant to the discussion here.
Namely:
Tucker-Zucker thesis for algebraic computability. The functions computable by finite
deterministic algebraic algorithms on a many-sorted [first-order] algebra A are precisely
the functions While* computable on A. (op. cit., p. 495)
This goes back to the work of Tucker [47] on computing in algebraic structures;
cf. also Stoltenberg-Hansen and Tucker [48]. Hermann’s algorithm for the ideal
membership problem in K[x₁, . . . , xₙ] for arbitrary fields K is given as a paradigmatic
example, but there is no principled argument for this thesis analogous to the work
of Gandy, Sieg, Dershowitz and Gurevich described in the preceding section. One
may ask, for example, why the natural number structure and arrays are assumed
in the Tucker-Zucker Thesis, and why these suffice beyond the structure A itself.
Moreover, nothing is said about assuming that the equality relation for A is to be
included in it, even though that is common in algebraic algorithms. Finally, one
would like to see a justification of this thesis or possible variants comparable to the
ones described for classical CT, both informal and of a more formal axiomatic kind.
In any case, the Tucker-Zucker Thesis and supporting examples suggest that
all the notions of computability on abstract first order structures considered in
this section should be regarded as falling under a general notion of algorithm.
What distinguishes algorithms from computations is that they are independent
of the representation of the data to which they apply but only require how data
12. In addition, the reference Tucker and Zucker [10] is not as widely available as their year 2000 survey.
is packaged structurally, i.e. they only need consider the data up to structural
isomorphism. Friedman was already sensitive to this issue and that is the reason he
gave for baptizing his notion using generalized register machines, finite algorithmic
procedures:
The difference between [symbolic] configuration computations and algorithmic procedures
is twofold. Firstly, in configuration computations the objects are symbols, whereas in
algorithmic procedures the objects operated on are unrestricted (or unspecified). Secondly,
in configurational computations at each stage one has a finite configuration whose size is
not restricted before computation. On the other hand in algorithmic procedures one fixes
beforehand a finite number of registers to hold the objects. Thus for some n, at each stage
one has at most n objects. (Friedman [9], p. 362).
On the positive side he says that even though it is premature to try to propose
a general answer to the question, “What is an algorithm?,” convincing answers
have been given for large classes of such, among which sequential algorithms,
synchronous parallel algorithms and interactive sequential algorithms (cf. ibid for
references). In particular, the Tucker-Zucker Thesis or something close to it is a
plausible candidate for what one might call the Algebraic Algorithmic Procedures
Thesis. And more generally, it may be possible to distinguish algorithms used in
pure mathematics from those arising in applied mathematics and computer science,
where such algorithms as “interactive, distributed, real-time, analog, hybrid, quan-
tum, etc.” would fall. If there is a sensible separation between the two, Moschovakis’
13. See also Blass and Gurevich [49].
Let us return to the claim of Blum et al. [41] that the BSS model of computation
on the reals (and complex numbers) is requisite for the foundations of the subject
of scientific computation. That was strongly disputed by Braverman and Cook
[50], where the authors argued that the requisite foundation is provided by a
quite different “bit computation” model that is prima facie incompatible with the
BSS model. It goes back to ideas due to Banach and Mazur in the latter part
of the 1930s, but the first publication was not made until Mazur [51]. In the
meantime, the bit computation model was refined and improved by Grzegorczyk
[52] and independently by Daniel Lacombe [53] in terms of a theory of recursively
computable functionals. Terminologically, something like “effective approximation
computability” is preferable to “bit computability” as a name for this approach in
its applications to analysis.
This competing approach was explained in Feferman [40] in rough terms as
follows. To show that a real valued function f on a real interval into the reals is
computable by effective approximation, given any x in the interval as argument
to f, one works not with x but rather with an arbitrary sequential representation
of x, i.e. with a Cauchy sequence of rationals ⟨qₙ⟩ (n ∈ N) which approaches x as its limit, in order to effectively determine another such sequence ⟨rₘ⟩ (m ∈ N) which approaches f(x) as limit. The sequences in question are functions from N to Q, and so what is required is that the passage from ⟨qₙ⟩ to ⟨rₘ⟩ is given by an effective type-2 functional on such functions. Write T for the class of all total functions from N to N, and P for the class of all partial functions from N to N. By the effective enumeration of the rational numbers, this reduces the notion of effective approximation computability of functions f on the reals to that of effective functionals F from T to T, and those in turn are restrictions to T of the partial recursive functionals F′ (from P to P) whose values on total functions are always
total.14 It may be shown that by the continuity in the recursion theoretic sense of
partial recursive functionals we may infer continuity in the topological sense of
the functions f on the reals that are effective approximation computable. Thus step
functions that are computable in the BSS model are not computable in this sense.
14. Note that a partial recursive functional F need not have total values when restricted to total arguments.
On the other hand, the exponential function is an example of one that is computable
in the effective approximation model that is not computable in the BSS model.15
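To make the type-2 character of this notion concrete, here is a minimal Python sketch in which a real x is presented as an oracle giving, for each n, a rational within 2^-n of x, and a functional for addition (chosen instead of the exponential only for brevity) produces such an oracle for x + y. The encoding is a standard one, not specifically Grzegorczyk’s or Lacombe’s formulation.

```python
# Effective approximation computability, minimal sketch: a real x is given
# by an oracle x_of(n) returning a rational q with |q - x| <= 2**-n. A real
# function is computable in this sense if an effective type-2 functional
# turns any such oracle for x into one for f(x).
from fractions import Fraction

def real_from_fraction(q):
    """The constant Cauchy representation of a rational number."""
    return lambda n: Fraction(q)

def add(x_of, y_of):
    """Type-2 functional for addition: (x + y)(n) = x(n+1) + y(n+1).
    Error bound: 2**-(n+1) + 2**-(n+1) = 2**-n, as required."""
    return lambda n: x_of(n + 1) + y_of(n + 1)

# The functional consults its arguments only at finitely many precisions;
# this finite use is the source of the continuity property mentioned in
# the text (a step function would have to decide x < c exactly, which no
# finite amount of approximate information about x can do).
x = real_from_fraction(Fraction(1, 3))
y = real_from_fraction(Fraction(1, 6))
s = add(x, y)
print(s(10))    # a rational within 2**-10 of 1/2
```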
The reader must be referred to Blum et al. [41] and Braverman and Cook [50]
for arguments as to which, if either of these, is the appropriate foundation for
scientific computation.16 I take no position on that here, but simply point out that
we have been led in a natural way from computation on the reals in the effective
approximation sense back to the partial recursive functionals F on partial functions
of natural numbers. Now Kleene’s principal theorem for such functionals is the “first” Recursion Theorem, according to which each such F has a least fixed point (LFP) f, i.e. one that is least among all partial functions g such that g = F(g) [6, p. 348]. This is fundamental in the following sense: the partial recursive functions and functionals are just those generated by closing under explicit definition and LFP recursion over the structure N. For, first of all, one immediately obtains closure under the primitive recursive schemata. Then, given primitive recursive g(x, y), one obtains the function f(x) ≃ (μy)[g(x, y) = 0] by taking f(x) ≃ h(x, 0) where h(x, z) ≃ z if (∀y < z)[g(x, y) > 0] and g(x, z) = 0, else h(x, z) ≃ h(x, z′). It follows that all partial recursive functions (and thence all partial recursive functionals) are obtained by Kleene’s Normal Form Theorem.
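How a least fixed point arises as the union of the iterates of a monotone functional can be seen in the following Python sketch, in which partial functions are modelled as finite dicts and the functional (the factorial functional, our illustrative choice) is applied repeatedly starting from the empty partial function:

```python
# Least-fixed-point recursion, illustrated with partial functions modelled
# as finite dicts. F is monotone: it only ever extends its argument, so the
# iterates F(empty), F(F(empty)), ... increase toward the least fixed point.

def F(f):
    """One application of the factorial functional: extend the partial
    function f wherever the defining equation can already be evaluated."""
    g = dict(f)
    for n in range(100):               # a finite window of arguments
        if n == 0:
            g[0] = 1
        elif n - 1 in f:               # n * f(n-1) is defined only if f(n-1) is
            g[n] = n * f[n - 1]
    return g

f = {}                                 # the empty partial function (bottom)
while True:
    g = F(f)
    if g == f:                         # the least fixed point, restricted to
        break                          # the finite window, has been reached
    f = g

print(f[5])                            # 120: the LFP agrees with factorial
```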
This now leads one to consider generation of partial functions and functionals by
explicit definition and LFP recursion over arbitrary abstract many-sorted structures
A. The development of that idea originates with Platek [63], a PhD thesis at Stanford
that, though never published, came to be widely known by workers in the field.
15. There is a considerable literature on computation on the real numbers under various approaches related to the effective approximation one via Cauchy representations. A more comprehensive one is that given by Kreitz and Weihrauch [54, 55] and Weihrauch [56]; that features surjective representations from a subset of N^N to R. Bauer [57] introduced a still more general theory of representations via a notion of realizability, that allows one to consider classical structures and effective structures of various kinds (including those provided by domain theory) under a single framework; cf. also Bauer and Blanck [58]. The work of Pour-El surveyed in her article [59] contains interesting applications of the effective approximation approach to questions of computability in physical theory.
16. Cf. also Blum [60], to which Braverman and Cook [50] responds more directly. Actually, the treatment of a number of examples from numerical analysis in terms of the BSS model that takes up Part II of Blum et al. [41] via the concept of the “condition number” of a procedure in a way brings it in closer contact with the effective approximation model. As succinctly explained to me in a personal communication from Lenore Blum, “[r]oughly, ‘condition’ connects the BSS/BCSS theory with the discrete theory of computation/complexity in the following way: The ‘condition’ of a problem instance measures how outputs will vary under perturbations of the input (think of the condition as a normed derivative).” The informative article, Blum [61], traces the idea of the condition number back to a paper by Turing [62] on rounding-off errors in matrix computations from where it became a basic common concept in various guises in numerical analysis. (An expanded version of Blum [61] is forthcoming.) It may be that the puzzle of how the algebraic BSS model serves to provide a foundation for the mathematics of the continuous, at least as it appears in numerical analysis, is resolved by noting that the verification of the algorithms it employs requires in each case specific use of properties of the reals and complex numbers telling which such are “well-conditioned.”
17. Platek [63] also used the LFP approach to subsume recursion theory on the ordinals under the theory of recursion in the Sup functional.
Theorem show that the partial functions and functionals generated by the abstract
computation procedures are just those that are partial recursive. This shows that the
effective approximation approach to computation on the reals is accounted for at the
second-order level under ACP(N), while the Xu-Zucker result shows that the BSS
model is subsumed at the first-order level under ACP(R).
Clearly it is apt to use the word ‘abstract’ in referring to the procedures in
question since they are preserved under isomorphism. But given the arguments
I have made in the preceding sections, it was a real mistake on my part to use
‘computation’ as part of their designation, and I very much regret doing so. A better
choice would have been simply to call them Abstract Recursion Procedures, and
I have decided to take this occasion to use ‘ARP’ as an abbreviation for these, in
place of ‘ACP’, thus ARP(A) in place of ACP(A). The main point now is to bring
matters to a conclusion by using these to propose the following thesis on definition
by recursion that in no way invokes the concepts of computation or algorithm.
Recursion Thesis (RT) Any function defined by recursion over a first-order
structure A (with Booleans) belongs to ARP(A).
This presumes an informal notion of being a function f defined by recursion over a first-order structure A that is assumed to include the Boolean constants and basic operations. Roughly speaking, the idea for such a definition is that f is determined by an equation f(x) ≃ E(f, x), where E is an expression that may contain a symbol for f and symbols for the initial functions and constants of A as well as for functions previously defined by recursion over A. Now here is the way such a justification for RT might be argued. At any given x = x₀, f(x) may not be defined by E, for example if E(f, x) = [if x = x₀ and f(x) = 0 then 1, else 0]. But if f(x) is defined at all by E, it is by use made of values f(y) that are previously defined. Write y ≺ x if f(y) is previously defined and its value is used in the evaluation of x; then let fₓ be f restricted to {y : y ≺ x}. Thus the evaluation of f(x) is determined by fₓ when it is defined, i.e. f(x) = E(fₓ, x) for each such x. It may be that {y : y ≺ x} is empty if f(x) is defined outright in terms of previous functions; in that case x is minimal in the relation ≺. In the case it is not empty, we may make a similar argument for f(y) for each y ≺ x, and so on. In order for this to terminate, the relation ≺ must be well-founded. Next, take F to be the functional given by F(f, x) = E(fₓ, x); F is monotonic increasing, because if f ⊆ g then fₓ = gₓ. So F has a LFP g. But F defines our function f by transfinite recursion on ≺, so f is a fixed point of F and hence g ⊆ f. To conclude that f ⊆ g, we argue by transfinite induction on ≺: for a given x, if f(y) = g(y) for all y ≺ x then f(x) = F(f, x) = E(fₓ, x) = E(gₓ, x) = F(g, x) = g(x). Thus f is given by LFP recursion in terms of previously obtained functions in ARP(A) and hence itself belongs to ARP(A).
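In compact form (the notation is ours, summarizing the argument just given), the justification of RT runs:

```latex
% Compact restatement of the justification of RT sketched above:
\begin{align*}
  & f(x) \simeq E(f_x, x), \qquad
    f_x := f \upharpoonright \{\, y : y \prec x \,\}, \quad
    \prec \text{ well-founded};\\
  & F(f, x) := E(f_x, x) \text{ is monotone, since }
    f \subseteq g \implies f_x = g_x;\\
  & \text{so } F \text{ has a least fixed point } g,
    \text{ and transfinite induction on } \prec \text{ gives }
    f = g \in \mathrm{ARP}(A).
\end{align*}
```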
The reason for restricting to first-order structures A in the formulation of RT is
so as not to presume the property of monotonicity as an essential part of the idea
of definition by recursion. I should think that all this can be elaborated, perhaps in
an axiomatic form, but if there is to be any thesis at all for definition by recursion
over an arbitrary first-order structure (with Booleans), I cannot see that it would
differ in any essential way from RT. If there is a principled argument for assuming
Acknowledgements I wish to thank Lenore Blum, Andrej Bauer, John W. Dawson, Jr., Nachum
Dershowitz, Yuri Gurevich, Grigori Mints, Dana Scott, Wilfried Sieg, Robert Soare, John V.
Tucker, and Jeffery Zucker for their helpful comments on an early draft of this article. Also
helpful were their pointers to relevant literature, though not all of that extensive material could
be accounted for here.
References
1. R.O. Gandy, The confluence of ideas in 1936, in The Universal Turing Machine. A Half-
Century Survey, ed. by R. Herken (Oxford University Press, Oxford, 1988), pp. 55–111
2. R.I. Soare, The history and concept of computability, in Handbook of Computability Theory,
ed. by E. Griffor (Elsevier, Amsterdam, 1999), pp. 3–36
3. A. Turing, On computable numbers, with an application to the Entscheidungsproblem, in
Proceedings of the London Mathematical Society. Series 2, vol. 42 (1936–1937), pp. 230–265;
a correction, ibid. 43, 544–546
4. A. Church, An unsolvable problem of elementary number theory. Am. J. Math. 58, 345–363
(1936)
5. R.I. Soare, Computability and recursion. Bull. Symb. Log. 2, 284–321 (1996)
6. S.C. Kleene, Introduction to Metamathematics (North-Holland, Amsterdam, 1952)
7. B.J. Copeland, The Church-Turing Thesis (Stanford Encyclopedia of Philosophy 2002), http://plato.stanford.edu/entries/church-turing/
8. R.O. Gandy, Church’s thesis and principles for mechanisms, in The Kleene Symposium, ed. by
J. Barwise, H.J. Keisler, K. Kunen (North-Holland, Amsterdam, 1980), pp. 123–145
9. H. Friedman, Algorithmic procedures, generalized Turing algorithms, and elementary recur-
sion theory, in Logic Colloquium ’69, ed. by R.O. Gandy, C.M.E. Yates (North-Holland,
Amsterdam, 1971), pp. 361–389
10. J.V. Tucker, J.I. Zucker, Program Correctness over Abstract Data Types (CWI Monograph,
North-Holland, Amsterdam, 1988)
11. J.V. Tucker, J.I. Zucker, Computable functions and semicomputable sets on many-sorted
algebras, in Handbook of Logic in Computer Science, vol. 5, ed. by S. Abramsky, et al. (Oxford
University Press, Oxford, 2000), pp. 317–523
12. Y.N. Moschovakis, The formal language of recursion. J. Symb. Log. 54, 1216–1252 (1989)
13. Y.N. Moschovakis, What is an algorithm? in Mathematics Unlimited 2001 and Beyond, ed. by
B. Engquist, W. Schmid (Springer, Berlin, 2001), pp. 919–936
14. Y. Gurevich, What is an algorithm? in SOFSEM 2012: Theory and Practice of Computer
Science, ed. by M. Bielikova et al. (Springer, Berlin, 2012), LNCS, vol. 7147, pp. 31–42
15. D. Hilbert, Über das Unendliche. Math. Ann. 95, 161–190 (1926); English translation in van
Heijenoort (ed.), From Frege to Gödel. A Source Book in Mathematical Logic, 1879–1931
(Harvard University Press, Cambridge, 1967), pp. 367–392
16. K. Gödel, On undecidable propositions of formal mathematical systems (mimeographed lecture notes by S.C. Kleene and J.B. Rosser, 1934); reprinted with revisions in M. Davis (ed.), The Undecidable. Basic Papers on Undecidable Propositions, Unsolvable Problems, and Computable Functions (Raven Press, Hewlett, 1965), pp. 39–74, and in Gödel (1986), pp. 346–371
17. J.W. Dawson Jr., Prelude to recursion theory: the Gödel-Herbrand correspondence, in First
International Symposium on Gödel’s Theorems, ed. by Z.W. Wolkowski (World Scientific,
Singapore, 1993), pp. 1–13
45. L. Blum, M. Shub, S. Smale, On a theory of computation and complexity over the real numbers:
NP-completeness, recursive functions and universal machines. Bull. Am. Math. Soc. 21, 1–46
(1989)
46. H. Friedman, R. Mansfield, Algorithmic procedures. Trans. Am. Math. Soc. 332, 297–312
(1992)
47. J.V. Tucker, Computing in algebraic systems, in Recursion Theory, Its Generalizations and
Applications, ed. by F.R. Drake, S.S. Wainer (Cambridge, University Press, Cambridge, 1980)
48. V. Stoltenberg-Hansen, J.V. Tucker, Computable rings and fields, in Handbook of Computabil-
ity Theory, ed. by E. Griffor (Elsevier, Amsterdam, 1999), pp. 363–447
49. A. Blass, Y. Gurevich, Algorithms: a quest for absolute definitions. Bull. EATCS 81, 195–225
(2003)
50. M. Braverman, S. Cook, Computing over the reals: foundations for scientific computing. Not.
Am. Math. Soc. 51, 318–329 (2006)
51. S. Mazur, Computable analysis. Rozprawy Matematyczne 33, 1–111 (1963)
52. A. Grzegorczyk, Computable functionals. Fundamenta Mathematicae 42, 168–202 (1955)
53. D. Lacombe, Extension de la notion de fonction récursive aux fonctions d’une ou plusieurs
variables réelles, I, II, III, Comptes Rendus de l’Académie des Science Paris, 240, 2470–2480
(1955) (241, 13–14, 241, 151–155)
54. C. Kreitz, K. Weihrauch, A unified approach to constructive and recursive analysis, in
Computation and Proof Theory, ed. by M. Richter, et al. Lecture notes in mathematics, vol.
1104 (1984), pp. 259–278
55. C. Kreitz, K. Weihrauch, Theory of representations. Theor. Comput. Sci. 38, 35–53 (1985)
56. K. Weihrauch, Computable Analysis (Springer, New York, 2000)
57. A. Bauer, The Realizability Approach to Computable Analysis and Topology, PhD Thesis,
Carnegie Mellon University, 2000; Technical Report CMU-CS-00-164
58. A. Bauer, J. Blanck, Canonical effective subalgebras of classical algebras as constructive metric
completions. J. Universal Comput. Sci. 16(18), 2496–2522 (2010)
59. M.B. Pour-El, The structure of computability in analysis and physical theory, in Handbook of
Computability Theory, ed. by E. Griffor (Elsevier, Amsterdam, 1999), pp. 449–471
60. L. Blum, Computability over the reals: where Turing meets Newton. Not. Am. Math. Soc. 51,
1024–1034 (2004)
61. L. Blum, Alan Turing and the other theory of computation, in Alan Turing: His Work and
Impact, ed. by Cooper and van Leeuwen (Elsevier Pub. Co., Amsterdam, 2013), pp. 377–384
62. A. Turing, Rounding-off errors in matrix processes. Q. J. Mech. Appl. Math. 1, 287–308
(1948); reprinted in Cooper and van Leeuwen (2013), pp. 385–402
63. R.A. Platek, Foundations of Recursion Theory, PhD Dissertation, Stanford University, 1966
64. S.C. Kleene, Recursive functionals and quantifiers of finite type I. Trans. Am. Math. Soc. 91,
1–52 (1959)
65. A. Kechris, Y. Moschovakis, Recursion in higher types, in Handbook of Mathematical Logic,
ed. by J. Barwise (North-Holland Pub. Co., Amsterdam, 1977), pp. 681–737
66. Y.N. Moschovakis, Abstract recursion as a foundation for the theory of recursive algorithms, in
Computation and Proof Theory ed. by M.M. Richter, et al. Lecture notes in computer science,
vol. 1104 (1984), pp. 289–364
67. S. Feferman, A new approach to abstract data types, I: informal development. Math. Struct.
Comput. Sci. 2, 193–229 (1992)
68. S. Feferman, A new approach to abstract data types, II: computability on ADTs as ordinary
computation, in Computer Science Logic, ed. by E. Börger, et al. Lecture notes in computer
science, vol. 626 (1992), pp. 79–95
69. J. Xu, J. Zucker, First and second order recursion on abstract data types. Fundamenta
Informaticae 67, 377–419 (2005)
70. K. Gödel, The present situation in the foundations of mathematics (unpublished lecture), in
Gödel (1995), pp. 45–53
Generalizing Computability Theory to Abstract Algebras

J.V. Tucker
Department of Computer Science, Swansea University, Singleton Park, Swansea SA2 8PP, Wales, UK
e-mail: J.V.Tucker@swansea.ac.uk

J.I. Zucker
Department of Computing and Software, McMaster University, Hamilton, ON, Canada L8S 4K1
e-mail: zucker@mcmaster.ca

Abstract We present a survey of our work over the last four decades on generalizations of computability theory to many-sorted algebras. The following topics are discussed, among others: (1) abstract vs concrete models of computation for such algebras; (2) computability and continuity, and the use of many-sorted topological partial algebras, containing the reals; (3) comparisons between various equivalent and distinct models of computability; (4) generalized Church-Turing theses.

1 Introduction

¹ At one time it was possible for us to investigate most of the mathematical models, and compare and classify them [29, 31]!
continuous operations, such as the real numbers. At this point the theory deepens.
New questions arise and there are a number of changes to the imperative model,
not least the use of algebras that have partial operations and tests and the need for
nondeterministic constructs. Finally (Sect. 8) we take a quick look at other models and
propose some generalizations of the Church-Turing thesis to abstract many-sorted
algebras.
By a computability theory we mean a theory of functions and sets that are definable
using a model of computation. By a model of computation we mean a theoretical
model of some general method of calculating the value of a function or of deciding,
or enumerating, the elements of a set. We allow the functions and sets to be
constructed from any kind of data. Thus, classical computability theory on the set N
of natural numbers is made up of many computability theories (based upon Turing
machines, recursive definitions, register machines, etc.).
We divide computability theories into two types:
In an abstract computability theory the computations are independent of all the
representations of the data. Computations are uniform over all representations and
are necessarily isomorphism invariant. Typical of abstract models of computation
are models based on abstract ideas of program, equation, recursion scheme, or
logical formula.
In a concrete computability theory the computations are dependent on some data
representation. Computations are not uniform, and different representations can
yield different results. Computations are not automatically isomorphism invariant.
Typical of concrete models of computation are those based on concrete ideas of
coding, numbering, or data representations using numbers or functions.
Now in computer science, it is obvious that a computation is fundamentally
dependent on its data. By a data type we mean
(1) data,
together with
(2) some primitive operations and tests on these data.
Often we also have in mind the ways these data are
(3) axiomatically specified, and
(4) represented or implemented.
To choose a computation model, we must think carefully about what forms of
data the user may need, how we might model the data in designing a system—
where some high level but formal understanding is important—and how we might
implement the data in some favoured programming language.
² For example: for all n there exists a computable algebra with precisely n inequivalent computable numberings.
As an example for the l.h.s. here, let us take the data set D = ℝ, the set of reals; the structure A = R^N_p, the partial algebra R_p of reals defined below in Sect. 7, with the naturals adjoined; and AbstComp_A(D) = While*(R^N_p), the set of functions on ℝ definable by the While* programming language (While with arrays) over R^N_p defined in Sect. 5. For the r.h.s., take a standard enumeration α : ℕ → ℚ of the rationals, which generates a representation α of the computable reals (as described in Sect. 7.1.1 below), and let ConcComp_α(ℝ) be the corresponding "α-tracking" model. Then our abstract model is sound, but not adequate, for this concrete model:

  While*(R^N_p) ⫋ ConcComp_α(ℝ).

On the other hand, if we take for our abstract model over R_p the non-deterministic WhileCC* (While + "countable choice" + arrays) language, and further replace "computability" by "approximable computability" [33, 34], then we obtain completeness (see Sect. 7.3 below):

  WhileCC*-approximability over R^N_p = ConcComp_α(ℝ).   (1)
For each Σ-function symbol

  F : s₁ × ⋯ × s_m → s   (m ≥ 0),

its interpretation in a Σ-algebra A is a function

  F^A : A^u → A_s

where u = s₁ × ⋯ × s_m, and A^u = A_{s₁} × ⋯ × A_{s_m}.
We write Σ(A) for the signature of an algebra A.
A Σ-algebra is called total if all the basic functions are total; it is called partial in the absence of such an assumption. Sections 3–6 will be devoted to total algebras. In Sect. 7 we will turn to a more general theory, with partial algebras.
Example 3.1.1 (Booleans) The signature Σ(B) of the booleans is

  signature   Σ(B)
  sorts       bool
  functions   true, false : → bool,
              not : bool → bool,
              or, and : bool² → bool

The algebra B of booleans contains the carrier 𝔹 = {tt, ff} of sort bool, and the standard interpretations of the constant and function symbols of Σ(B).
Note that for a structure A to be useful for computational purposes, it should be
susceptible to testing, which means it should contain the carrier B of booleans and
the standard boolean operations; in other words, it should contain the algebra B as a
retract. Such an algebra A is called standard. All the examples of algebras discussed
below will be standard.
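To fix ideas, here is a minimal Python sketch of a signature and an algebra interpreting it, using the Booleans of Example 3.1.1. The dictionary encoding (sort names to carriers, function names to operations) is our own illustrative choice, not notation from this chapter.

    # Illustrative encoding (not the chapter's formalism): a signature lists
    # sorts and typed function symbols; an algebra gives a carrier per sort
    # and an operation per function symbol.
    SIGMA_B = {
        "sorts": ["bool"],
        "functions": {
            "true":  ([], "bool"),               # true, false : -> bool
            "false": ([], "bool"),
            "not":   (["bool"], "bool"),
            "or":    (["bool", "bool"], "bool"),
            "and":   (["bool", "bool"], "bool"),
        },
    }

    ALGEBRA_B = {
        "carriers": {"bool": {True, False}},     # the carrier B = {tt, ff}
        "operations": {
            "true":  lambda: True,
            "false": lambda: False,
            "not":   lambda b: not b,
            "or":    lambda a, b: a or b,
            "and":   lambda a, b: a and b,
        },
    }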
Example 3.1.2 (Naturals) The signature Σ(N) of the naturals is

  signature   Σ(N)
  import      Σ(B)
  sorts       nat
  functions   0 : → nat,
              suc : nat → nat,
              eq_N, less_N : nat² → bool
Example 3.1.3 (Reals) The signature Σ(R_t) of the reals is

  signature   Σ(R_t)
  import      Σ(B)
  sorts       real
  functions   0, 1 : → real,
              +, × : real² → real,
              − : real → real,
              eq_R, less_R : real² → bool

(We will study a partial algebra of reals in Sect. 7.) The algebra R_t of reals has the carrier ℝ of sort real, as well as the imported carrier 𝔹 of sort bool with the boolean operations, the real constants and operations (0, 1, +, −, ×), and the (total) boolean-valued functions eq_R : ℝ² → 𝔹 and less_R : ℝ² → 𝔹. Again, we will use the infix notations "=" and "<" for these.
Definition 3.1.4 (Minimal Carriers; Minimal Algebra) Let A be a Σ-algebra, and s a Σ-sort.
(a) A is minimal at s (or the carrier A_s is minimal in A) if A_s is generated by the closed Σ-terms of sort s.
(b) A is minimal if it is minimal at every Σ-sort.
To take two examples:
• Every N-standard algebra (see Sect. 3.3) is minimal at sorts bool and nat.
• The algebra R_t of reals (Example 3.1.3) is not minimal at sort real.
The meaning of a Σ-term t of sort s is a function

  ⟦t⟧^A : State(A) → A_s

defined by structural induction on t:

  ⟦x⟧^A σ = σ(x)
  ⟦F(t₁, …, t_m)⟧^A σ = F^A(⟦t₁⟧^A σ, …, ⟦t_m⟧^A σ)   (2)
A signature Σ is N-standard if (1) it is standard (see Sect. 3.1), and (2) it contains the standard signature of naturals (Example 3.1.2), i.e., Σ(N) ⊆ Σ.
Given an N-standard signature Σ, a Σ-algebra A is N-standard if it is an expansion of N, i.e., it contains the carrier ℕ with the standard arithmetic operations.
N-standardness is clearly very useful in computation, with the presence of counters, and the facility for enumerations and numerical coding.
Any standard signature Σ can be "N-standardised" to a signature Σ^N by adjoining the sort nat and the operations 0, suc, eq_N and less_N. Correspondingly, any standard Σ-algebra A can be N-standardised to an algebra A^N by adjoining the carrier ℕ together with the corresponding arithmetic and boolean functions on ℕ.
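Continuing the illustrative Python encoding above (our own, purely for concreteness), N-standardisation is the operation of adjoining the nat carrier and its standard functions to a given standard algebra:

    def n_standardise(algebra):
        # Return A^N: the algebra A extended by the sort nat with 0, successor,
        # equality and order. Since N is infinite, the carrier is represented
        # here by a membership test rather than an explicit set.
        ext = {"carriers": dict(algebra["carriers"]),
               "operations": dict(algebra["operations"])}
        ext["carriers"]["nat"] = lambda v: isinstance(v, int) and v >= 0
        ext["operations"].update({
            "0":      lambda: 0,
            "suc":    lambda n: n + 1,
            "eq_N":   lambda m, n: m == n,
            "less_N": lambda m, n: m < n,
        })
        return ext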
Note that we will use "≡" to denote syntactic identity between two expressions.
We define Stmt(Σ) to be the class of While(Σ)-statements S, … generated by:

  S ::= skip | x := t | S₁; S₂ | if b then S₁ else S₂ fi | while b do S od

A While procedure has the form

  P ≡ proc in a out b aux c begin S end   (3)

where S is the body, and a, b and c are tuples of (distinct) input, output and auxiliary variables respectively.
If a : u and b : v, then P is said to have type u → v, written P : u → v.
We turn to the semantics of statements and procedures.
The meaning ⟦S⟧^A of a statement S is a partial state transformer⁴ on an algebra A:

  ⟦S⟧^A : State(A) * State(A)
³ I.e., we can effectively find a Σ-term t₀ such that ⟦t₀⟧^A σ = ⟦t⟧^A σ for all Σ-algebras A and states σ over A (or A^N).
⁴ "*" denotes a partial function.
Its definition is standard [30, 31] and lengthy, and so we omit it. Briefly, it is based on defining the computation sequence of states from S starting in a state σ, or rather the n-th component of this sequence, by a primary induction on n, and a secondary induction on the size of S.
Next, given a procedure (3) of type u → v, its meaning is a partial function P^A : A^u * A^v defined as follows. For a ∈ A^u, let σ be any state on A such that σ[a] = a, and σ[b] and σ[c] are given preassigned default values. Then

  P^A(a) ≃ σ′[b]   if ⟦S⟧^A σ ↓ σ′ (say)
  P^A(a) ↑         if ⟦S⟧^A σ ↑.

Here "≃" means that the two sides either both converge to the same value, or both diverge ("Kleene equality" [15, Sect. 63]).
We are also using the notation ⟦S⟧^A σ ↓ to mean that evaluation of ⟦S⟧^A at σ halts or converges; ⟦S⟧^A σ ↓ σ′ that it converges to σ′; and ⟦S⟧^A σ ↑ that it diverges.
It is worth noting that the semantics of While(Σ) procedures is invariant under Σ-isomorphism.
Modifications in these semantic definitions (Eq. (2) in Sect. 3.2, and (3) in Sect. 4) required for partial algebras will be indicated in Sect. 7 (Remark 7.3.1).
(c) While computability will be the basis for a generalized Church-Turing Thesis, as we will see later (Sect. 8.2). On the other hand, While^N computability is useful for representing the syntax of While programming within the formalism, by means of coding (Sect. 5).
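As a concrete (and unofficial) illustration of the semantics of Sects. 3.2 and 4, here is a small Python interpreter for While statements over an algebra encoded as above. Terms and statements are nested tuples of our own devising; the partiality of ⟦S⟧^A shows up as possible non-termination of the Python call.

    # Terms:      ("var", x) | ("app", F, t1, ..., tm)
    # Statements: ("skip",) | ("assign", x, t) | ("seq", S1, S2)
    #             | ("if", b, S1, S2) | ("while", b, S)

    def eval_term(A, t, state):
        # Eq. (2): [[x]] sigma = sigma(x); [[F(t1,...,tm)]] sigma = F^A(...)
        if t[0] == "var":
            return state[t[1]]
        F, args = t[1], t[2:]
        return A["operations"][F](*(eval_term(A, s, state) for s in args))

    def exec_stmt(A, S, state):
        # The partial state transformer [[S]]^A; a non-terminating while loop
        # here corresponds to divergence.
        if S[0] == "skip":
            return state
        if S[0] == "assign":
            new = dict(state)
            new[S[1]] = eval_term(A, S[2], state)
            return new
        if S[0] == "seq":
            return exec_stmt(A, S[2], exec_stmt(A, S[1], state))
        if S[0] == "if":
            return exec_stmt(A, S[2] if eval_term(A, S[1], state) else S[3], state)
        if S[0] == "while":
            while eval_term(A, S[1], state):
                state = exec_stmt(A, S[2], state)
            return state
        raise ValueError("unknown statement")

    # Example over the Booleans: while x do x := and(x, false) od
    # exec_stmt(ALGEBRA_B, ("while", ("var", "x"),
    #     ("assign", "x", ("app", "and", ("var", "x"), ("app", "false")))),
    #     {"x": True})  ->  {"x": False}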
For fixed x and s we have the state-representing function

  Rep^A_x : State(A) → A^u

defined by Rep^A_x(σ) = σ[x], and the term evaluation representing function

  te^A_{x,s} : ⌜Tm_{x,s}⌝ × A^u → A_s

defined by

  te^A_{x,s}(⌜t⌝, a) = ⟦t⟧^A σ

where σ is any state on A such that σ[x] = a, in the sense that the diagram in Fig. 1 commutes. [Fig. 1: commutative square relating the term evaluation function TE^A_{x,s} on Tm_{x,s} × State(A), the coding map ⌜·⌝, Rep^A_x, and te^A_{x,s}.]
We will be interested in the computability of this term evaluation representing function.
Definition 5.3.1 (Term Evaluation) The algebra A has the term evaluation property (TEP) if for all x and s, the term evaluation representing function te^A_{x,s} is While computable on A^N.
Many well-known varieties (i.e., equationally axiomatisable classes of algebras)
have (uniform versions of) the TEP. Examples are: semigroups, groups, and associa-
tive rings with or without unity. This follows from the effective normalisability of the
terms of these varieties. In the case of rings, this means an effective transformation
of arbitrary terms to polynomials.
Thus, for example, the algebra R_t of reals has the TEP.
Theorem 2 The term evaluation representing function on A is While computable on A*.
Corollary 5.3.2 The term evaluation representing function on A is While* computable on A^N.
Recall the Definition 3.1.4 of minimal carriers.
Corollary 5.3.3
(a) If A is minimal at s, then there is a While computable enumeration (or listing) of the carrier A_s, i.e., a surjective total mapping

  enum_{A_s} : ℕ → A_s;
For each type u → v there is a procedure

  Univ_{u→v} : nat × u → v

which is universal for Proc_{u→v} on A, in the sense that for all P ∈ Proc_{u→v} and a ∈ A^u,

  Univ^A_{u→v}(⌜P⌝, a) ≃ P^A(a).
We conclude that there are universal While^N procedures for While computation on R_t.
6 Concepts of Semicomputability
In the case of union, if we assume that (1) A is N-standard, and (2) A has the
TEP, then the construction of the “merge” of the two characteristic procedures, i.e.,
interleaving their steps to form the new procedure, simply follows the classical proof
for computation on N. Failing this, the construction of the merge procedure (by
structural induction on the pair of statements) is quite challenging. (The tricky case
is where both are “while” statements.)
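For intuition, the interleaving can be sketched in Python by modelling each semi-decision procedure as a generator that is advanced one step at a time and yields True exactly when it accepts (our own modelling, not the While-level construction):

    def union_semidecide(semi1, semi2, x):
        # "Merge" two semi-decision procedures: alternate single steps of
        # each, accepting as soon as either accepts. If x is in neither set,
        # the loop runs forever, just as the merged procedure diverges.
        g1, g2 = semi1(x), semi2(x)
        while True:
            if next(g1, None) is True:
                return True
            if next(g2, None) is True:
                return True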
If R is a relation on A of type u, we write the complement of R as R^c = A^u \ R.
Theorem 5 (Post's Theorem for While Semicomputability) For any relation R on A: R is While computable on A if and only if both R and its complement R^c are While semicomputable on A.
Note that the proofs of the above two theorems depend strongly on the totality of
A. (See Remark 7.3.2.)
Another useful closure result, applicable to N-standard structures, is:
Theorem 6 (Closure of While Semicomputability Under N-Projections) Suppose A is N-standard. If R ⊆ A^{u×nat} is While semicomputable on A, then so is its N-projection {x ∈ A^u | ∃n ∈ ℕ R(x, n)}.
To outline the proof: From a procedure P which halts on R, we can effectively construct another procedure which halts on the required projection. Briefly, for input x, we search by "dovetailing" for a number n such that P halts on (x, n).
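A Python sketch of this dovetailed search (with the procedure P abstracted, hypothetically, as a predicate halts_within(x, n, k) meaning "P halts on (x, n) within k steps"):

    def projection_semidecide(halts_within, x):
        # Round k tries candidates n = 0, ..., k-1, giving each k steps; if
        # some n with P halting on (x, n) exists it is eventually found,
        # otherwise the search diverges (semicomputability, not computability).
        k = 1
        while True:
            for n in range(k):
                if halts_within(x, n, k):
                    return True
            k += 1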
We can generalize this theorem to the case of an A_s-projection for any minimal carrier A_s (recall Definition 3.1.4), provided A has the TEP.
Corollary 6.1.1 (Closure of While Semicomputability Under Projections off Minimal Carriers) Suppose A is N-standard and has the TEP. Let A_s be a minimal carrier of A. If R ⊆ A^{u×s} is While semicomputable on A, then so is its projection off A_s.
Note that Corollary 6.1.1 is a many-sorted version of (part of) Theorem 2.4 of
[9], cited in [23]. The minimality condition (a version of Friedman’s Condition III)
means that search in As is computable (or, more strictly, semicomputable) provided
A has the TEP. Thus in minimal algebras, many of the results of classical recursion
theory carry over, e.g.,
• the domains of semicomputable sets are closed under projection (as above)
• a semicomputable relation has a computable selection function
• a function with semicomputable graph is computable [9, Theorem 2.4].
If, in addition, there is computable equality at the appropriate sorts, other results of
classical recursion theory carry over, e.g.,
• the range of a computable function is semicomputable [9, Theorem 2.6].
⁵ We are suppressing sort superscripts here.
Note again that the TEP does not have to be assumed here (cf. Theorem 7). Also we are using the fact that if A is minimal then so is A*.
Example 6.4.1 In N, the various concepts we have listed: While, While^N and While* semicomputability, as well as projective While, While^N and While* semicomputability, all reduce to recursive enumerability over ℕ.
In general, however, projective While* semicomputability is strictly broader than projective While^N semicomputability. In other words, projecting along starred sorts is stronger than projecting along simple sorts or nat. (Intuitively, this corresponds to existentially quantifying over a finite, but unbounded, sequence of elements.) An example to show this will be given below.
We do, however, have the following equivalence:
For any While statement S over Σ, we can define a (possibly infinite) computation tree for S. The construction is by structural induction on S. Details are given in [31]. Using this and the Σ*/Σ conservativity theorem (Theorem 1), we can prove the following. Let R be a relation on A.
Theorem 8 (Engeler's Lemma for While Semicomputability) R is While semicomputable over A iff R can be expressed as an effective countable disjunction of booleans over Σ.
The following theorem uses Engeler’s Lemma, and the While computability of
term evaluation.
Theorem 10 (Projective Equivalence Theorem) The following are equivalent:
(i) R is projectively While semicomputable on A;
(ii) R is projectively While computable on A.
We can strengthen the theorem with a third equivalent clause, if we add an assumption about computability of equality in A.
First we must define certain syntactic classes of formulae over Σ*.
Let Lang* = Lang(Σ*) be the first-order language with equality over Σ*. We are interested in special classes of formulae of Lang*.
Formulae of Lang* are formed from the atomic formulae by means of the propositional connectives and universal and existential quantification over variables of any Σ*-sort.
Definition 6.6.1 (Classes of Formulae of Lang(Σ*))
(a) An atomic formula is an equality between a pair of terms of the same Σ*-sort.
(b) A bounded quantifier has the form '∀k < t' or '∃k < t', where t : nat.
(c) An elementary formula is one with only bounded quantifiers.
(d) A Σ₁ formula is formed by prefixing an elementary formula with existential quantifiers only.
(e) An extended Σ₁ formula is formed by prefixing an elementary formula with a string of existential quantifiers and bounded universal quantifiers (in any order).
We can show that an extended Σ₁ formula is equivalent to a Σ₁ formula over Σ*. Hence we will use the term "Σ₁" to denote (possibly) extended Σ₁ formulae.
We can now re-state the projective equivalence theorem in the presence of equality.
Theorem 10⁼ (Projective Equivalence Theorem for Σ with Equality) Suppose Σ has an equality operator at all sorts. Then the following are equivalent:
(i) R is projectively While semicomputable on A;
(ii) R is projectively While computable on A;
(iii) R is Σ₁ definable.
We apply some of the above ideas and results to the algebra R_t. Details can be found in [31, Sects. 6.2, 6.3].⁶
We begin again with a restatement of the semicomputability equivalence theorem (Theorem 9), for the particular case of R_t.
Theorem 11 (Semicomputability for R_t) Suppose R ⊆ ℝⁿ (n = 1, 2, …). Then the following are equivalent:
(i) R is While^N semicomputable on R_t;
(ii) R is While semicomputable on R_t;
(iii) R can be expressed as an effective countable disjunction

  x ∈ R ⟺ ⋁ᵢ bᵢ(x)
⁶ The notation in [31] is unfortunately not completely consistent with the present notation: R and R^< in [31] correspond (resp.) to R_t^o and R_t here.
The proof of equivalence between (iii) and the other two concepts follows from
the fact that semialgebraic sets are closed under projection, which in turn follows
from Tarski’s quantifier-elimination theorem for real closed fields [16, Chap. 4].
Interestingly, in the algebra R_t^o (i.e., R_t without the order relation "<"), where Tarski's theorem fails, one can find an example of a relation (namely, "<" itself!) which is projectively While semicomputable, but not While (or While*) semicomputable.
On the other hand (returning to R_t) the three equivalent concepts of semicomputability given in Theorem 12 differ from a fourth:
(iv) projective While* semicomputability,
as we now show.
Example 6.7.1 (A set which is projectively While* semicomputable, but not projectively While^N semicomputable) In order to prepare for this example, we must first enrich the structure R_t. Let E = {e₀, e₁, e₂, …} be a sequence of reals such that

  algebra    R_t^E
  import     R_t
  carriers   E
  functions  j : E → ℝ
When one considers the relation between abstract and concrete models, a number
of intriguing problems appear. We will explain them by considering a series of
examples based on the data type of real numbers. Then we formulate our strategy
for solving these problems. The picture for topological algebras in general will be
clear from our examples with the reals.
7.1.2 Continuity
Computations with real numbers involve infinite data. Computations are finite processes that approximate in some way infinite processes. The topology of ℝ defines a process of approximation for infinite data; the functions on the data that are continuous in the topology are exactly the functions that can be approximated to any desired degree of precision. This suggests a continuity principle:

  Computability ⟹ continuity.
7.1.3 Partiality
Consider the partial "pivot" function

  piv : ℝⁿ * {1, …, n}

defined by

  piv(x₁, …, xₙ) = some i with xᵢ ≠ 0, if such an i exists;
                 = ↑, otherwise.   (6)

Its concrete counterpart works on Cauchy sequence representations:

  piv : CSⁿ * {1, …, n}

[Fig. 2: piv on the representation space CSⁿ, tracking piv on ℝⁿ.]

The corresponding many-valued function piv_m is defined, for k = 1, …, n, by

  k ∈ piv_m(x₁, …, xₙ) ⟺ x_k ≠ 0.
⁷ Figures 2 and 3 are taken by kind permission from [33], © 2004 ACM Inc.
[Fig. 3: the graph of y = f_a(x), and the many-valued root function g(a), for parameter values a near ±1.]
This function has either 1 or 3 roots, depending on the size of a. For a < −1, f_a has a single (large positive) root; for a > 1, f_a has a single (large negative) root; and for −1 < a < 1, f_a has three roots, two of which become equal when a = ±1.
Let g be the (many-valued) function such that g(a) gives all the non-repeated roots of f_a (Fig. 3). Again we have the situation of the previous examples:
(a) We cannot choose a (single) root of f_a continuously as a function of a.
(b) However, one can easily choose and compute a root of f_a continuously as a function of a Cauchy sequence representation of a, i.e., non-extensionally in a.
(c) Finally, g(a), as a many-valued function of a, is continuous. (Note that in order to have continuity, we must exclude the repeated roots of f_a, at a = ±1.)
Other examples of a similar nature abound, and can be handled similarly; for
example, the problem of finding, for a given real number x, an integer n > x.
At the level of concrete models of computation, there is no real problem with the
issues raised by these examples, since concrete models work only by computations
on representations of the reals (say by Cauchy sequences).
The real problem arises with the construction of abstract models of computation
on the reals which should model the phenomena illustrated by these examples, and
also correspond, in some sense, to the concrete models.
An immediate problem in this regard is that the total boolean-valued functions eq_R and less_R are not continuous, and hence also (by the continuity principle, Sect. 7.1.2) not (concretely) computable.
We therefore define an N-standard partial algebra R_p on the reals, formed from the total algebra R_t (Example 3.1.3) by replacing the total boolean-valued functions eq_R and less_R (Sect. 7.1.3, (5)) by the partial functions
  eq_{R,p}(x, y) ≃ ↑    if x = y
                   ff   otherwise;

  less_{R,p}(x, y) ≃ tt   if x < y
                     ff   if x > y
                     ↑    if x = y.
These partial functions (unlike the total versions) are continuous, and hence R_p (unlike R_t) is a topological partial algebra. Moreover, these partial functions are concretely computable (by e.g. the tracking model, cf. Sect. 7.1.1).
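To see concretely why such partial comparisons are tracking-computable, here is a Python sketch under one standard concrete representation (the encoding is our choice): a real x is given by a function x(n) returning a rational within 2⁻ⁿ of x. Then less_{R,p} is semidecided by refining approximations, diverging exactly when x = y.

    from fractions import Fraction

    def less_p(x, y):
        # Partial less_{R,p}: True if x < y, False if x > y, and an infinite
        # loop (local divergence) when x = y, since no finite amount of
        # approximation can separate equal reals.
        n = 0
        while True:
            eps = Fraction(1, 2 ** n)
            xn, yn = x(n), y(n)
            if xn + 2 * eps < yn:      # approximation intervals separate: x < y
                return True
            if yn + 2 * eps < xn:      # intervals separate the other way: x > y
                return False
            n += 1                     # still overlapping: refine further

    # e.g. less_p(lambda n: Fraction(1, 3), lambda n: Fraction(1, 2)) -> True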
Then we have the question:
Can such continuous many-valued functions be computed on the abstract
data type A containing R using new abstract models of computation?
If so, are the concrete and abstract models equivalent?
The solution presented in [33] was to take A = R^N_p, the N-standard extension of R_p, and then extend the While programming language over A [31] with a nondeterministic "countable choice" programming construct, so that in the rules of program term formation,

  choose z : b

is a new term of type nat, where z is a variable of type nat and b a term of type bool. In addition (calling the resulting language WhileCC, for While computability with countable choice), WhileCC computability is replaced by WhileCC approximability [33, 34]. We then obtain a completeness theorem for abstract/concrete computation, i.e. the equivalence (1) shown at the end of Sect. 2.
Actually (1) was proved in [33] for N-standard metric algebras satisfying some general conditions.
The above considerations lead us to propose the topological partial algebra R_p as a better basis for abstract models of computation on ℝ than the (total) algebra R_t: better in the sense of being more faithful to the intuition of computing on the reals.⁸
Remark 7.3.1 (Semantics of Partial Algebras) We briefly indicate the semantics for terms and statements over partial algebras, or rather indicate how the semantics for total algebras given in Sects. 3.2 and 4 can be adapted to partial algebras.
First, the semantics of terms is as given by Eq. (2) in Sect. 3.2 (with the second "=" replaced by "≃"), using strict evaluation for partial functions (i.e., divergence of any subterm entailing divergence of the term).⁹
Secondly, the semantics of statements is as given in Sect. 4; i.e., the value ⟦S⟧^A σ of a statement S at a state σ is the last state in a computation sequence (i.e. a sequence of states) generated by S at σ, provided that the sequence is (well defined and) finite. Otherwise (with an infinite computation sequence) the value diverges. The case of partial algebras is similar, except that there are now two cases where the value of ⟦S⟧^A σ diverges: (1) (as before, global divergence) where the computation sequence is infinite, and (2) (a new case, local divergence) where the computation sequence is finite, but the last item diverges (instead of converging to a state) because of a divergent term on the right of an assignment statement or a divergent boolean test.
Remark 7.3.2 (Comparison of Formal Results for R_p and R_t) It would be interesting to see to what extent the results concerning abstract computing on the reals with R_t detailed in Sects. 3–6 (for example, the merging and closure theorems (Sect. 6.1) and comparisons of various notions of semicomputability and projective semicomputability in R_t (Sect. 6.7)) hold, or fail to hold, in R_p.¹⁰
It should be noted, in this regard, that the merging procedure used in our proofs of the closure theorems (Theorems 3–5) depends heavily on the totality of the algebra A.
⁸ For another perspective on computing with total algebras on the reals, see [2].
⁹ As a general rule. For a case where boolean operators with non-strict semantics are appropriate, see [40, Sect. 3].
¹⁰ Some striking results in this connection have been obtained by Mark Armstrong [1], e.g. the failure of closure of While-semicomputable relations under union in R_p (cf. Theorem 4 in Sect. 6.1).
putation was created [29] with the needs of equational and logical definability in
mind.
Axiomatic Methods. In an axiomatic method one defines the concept of a computation theory as a set Θ(A) of partial functions on an algebra A having some of the essential properties of the set of partial recursive functions on ℕ. To take an example, Θ(A) can be required to contain the basic algebraic operators and tests of A; be closed under operations such as composition; and, in particular, possess an enumeration for which appropriate universality and s-m-n properties are true. Thus in Sect. 5 we saw that While*(A) is a computation theory in this sense.
The definition of a computation theory used here is due to Fenstad [7, 8] who takes up ideas from Moschovakis [21]. Computation theory definitions typically require a code set (such as ℕ) to be part of the underlying structure A for the indexing of functions.
The following fact is easily derived from [20] (where register machines are used); see also Fenstad [8, Chap. 0].
The following fact is easily derived from [20] (where register machines are used);
see also Fenstad [8, Chap. 0].
Theorem 14 (Minimal Computation Theory) The set While*(A) of While* computable functions on an N-standard algebra A is the smallest set of partial functions on A to satisfy the axioms of a computation theory; in consequence, While*(A) is a subset of every computation theory Θ(A) on A.
We have sketched the elements of our work over four decades on generalizing
computability theory to abstract structures. A thorough exposition is to be found
¹¹ In Feferman's memorable slogan: "No calculation without representation."
in our survey paper [31]. In [32] we have had the opportunity to recall the diverse
origins of, and influences on, our research programme.
Since [31], our research has emphasized computation on many-sorted topological
partial algebras (the focus of Sect. 7 here) and its diverse applications:
• computable analysis, especially on the reals [11, 30, 34],
• classical analog systems [14, 35–37],
• analog networks of discrete and continuous processors [25, 35],
• generalized stream processing in discrete and continuous time [36, 37].
These applications bring us close to an investigation of the physical foundation
of computability. In this regard, considerations of continuity are central (cf. the
discussion in Sect. 7.1.2). This is related to the issue of stability of analog systems,
and more broadly, to Hadamard’s principle [12] which, as (re-)formulated by
Courant and Hilbert [4, 13], states that for a scientific problem to be well posed,
the solution must exist, be unique and depend continuously on the data. To this we
might add: it must also be computable.
Acknowledgements We are grateful to Mark Armstrong, Sol Feferman, Diogo Poças and an
anonymous referee for very helpful comments on earlier drafts of this chapter. We also thank
the editors, Giovanni Sommaruga and Thomas Strahm, for the opportunity to participate in this
volume, and for their helpfulness during the preparation of this chapter.
This research was supported by a grant from the Natural Sciences and Engineering Research
Council (Canada).
References
10. A. Fröhlich, J. Shepherdson, Effective procedures in field theory. Philos. Trans. R. Soc. Lond.
(A) 248, 407–432 (1956)
11. M.Q. Fu, J.I. Zucker, Models of computability for partial functions on the reals. J. Log.
Algebraic Methods Program. 84, 218–237 (2015)
12. J. Hadamard, in Lectures on Cauchy’s Problem in Linear Partial Differential Equations (Dover,
New York, 1952). Translated from the French edition [1922]
13. J. Hadamard, La Théorie des Équations aux Dérivées Partielles (Éditions Scientifiques,
Warsaw, 1964)
14. N.D. James, J.I. Zucker, A class of contracting stream operators. Comput. J. 56, 15–33 (2013)
15. S.C. Kleene, Introduction to Metamathematics (North Holland, Amsterdam, 1952)
16. G. Kreisel, J.L. Krivine, Elements of Mathematical Logic (North Holland, Amsterdam, 1971)
17. G. Kreisel, D. Lacombe, J. Shoenfield, Partial recursive functions and effective operations,
in Constructivity in Mathematics: Proceedings of the Colloquium in Amsterdam, 1957, ed. by
A. Heyting (North Holland, Amsterdam, 1959), pp. 290–297
18. A.I. Mal’cev, Constructive algebras I, in The Metamathematics of Algebraic Systems. A.I.
Mal’cev, Collected papers: 1936–1967 (North Holland, Amsterdam, 1971), pp. 148–212
19. W. Magnus, A. Karass, D. Solitar, Combinatorial Group Theory (Dover, New York, 1976)
20. J. Moldestad, V. Stoltenberg-Hansen, J.V. Tucker, Finite algorithmic procedures and computa-
tion theories. Math. Scand. 46, 77–94 (1980)
21. Y.N. Moschovakis, Axioms for computation theories—first draft, in Logic Colloquium ‘69 ed.
by R.O. Gandy, C.E.M. Yates (North Holland, Amsterdam, 1971), pp. 199–255
22. M. Rabin, Computable algebra, general theory and the theory of computable fields. Trans. Am.
Math. Soc. 95, 341–360 (1960)
23. J.C. Shepherdson, Algebraic procedures, generalized Turing algorithms, and elementary
recursion theory, in Harvey Friedman’s Research on the Foundations of Mathematics, ed. by
L.A. Harrington, M.D. Morley, A. Ščedrov, S.G. Simpson (North Holland, Amsterdam, 1985),
pp. 285–308
24. V. Stoltenberg-Hansen, J.V. Tucker, Computable rings and fields, in Handbook of Computabil-
ity Theory, ed. by E. Griffor (Elsevier, Amsterdam, 1999)
25. B.C. Thompson, J.V. Tucker, J.I. Zucker, Unifying computers and dynamical systems using the
theory of synchronous concurrent algorithms. Appl. Math. Comput. 215, 1386–1403 (2009)
26. G.S. Tseitin, Algebraic operators in constructive complete separable metric spaces. Dokl.
Akad. Nauk SSSR 128, 49–52 (1959). In Russian
27. G.S. Tseitin, Algebraic operators in constructive metric spaces. Tr. Mat. Inst. Steklov 67, 295–
361 (1962); In Russian. Translated in AMS Translations (2) 64:1–80. MR 27#2406
28. J.V. Tucker, Computing in algebraic systems, in Recursion Theory, Its Generalisations and
Applications. London Mathematical Society Lecture Note Series, vol. 45, ed. by F.R. Drake,
S.S. Wainer (Cambridge University Press, Cambridge, 1980), pp. 215–235
29. J.V. Tucker, J.I. Zucker, Program Correctness Over Abstract Data Types, with Error-State
Semantics. CWI Monographs, vol. 6 (North Holland, Amsterdam, 1988)
30. J.V. Tucker, J.I. Zucker, Computation by ‘while’ programs on topological partial algebras.
Theor. Comput. Sci. 219, 379–420 (1999)
31. J.V. Tucker, J.I. Zucker, Computable functions and semicomputable sets on many-sorted
algebras, in Handbook of Logic in Computer Science, vol. 5, ed. by S. Abramsky, D. Gabbay,
T. Maibaum (Oxford University Press, Oxford, 2000), pp. 317–523
32. J.V. Tucker, J.I. Zucker, Origins of our theory of computation on abstract data types at the
Mathematical Centre, Amsterdam, 1979–1980, in Liber Amicorum: Jaco de Bakker, ed. by
F. de Boer, M. van der Heijden, P. Klint, J.J.M.M. Rutten (Centrum Wiskunde & Informatica,
Amsterdam, 2002), pp. 197–221
33. J.V. Tucker, J.I. Zucker, Abstract versus concrete computation on metric partial algebras. ACM
Trans. Comput. Log. 5, 611–668 (2004)
34. J.V. Tucker, J.I. Zucker, Computable total functions, algebraic specifications and dynamical
systems. J. Log. Algebraic Program. 62, 71–108 (2005)
35. J.V. Tucker, J.I. Zucker, Computability of analog networks. Theor. Comput. Sci. 371, 115–146
(2007)
36. J.V. Tucker, J.I. Zucker, Continuity of operators on continuous and discrete time streams.
Theor. Comput. Sci. 412, 3378–3403 (2011)
37. J.V. Tucker, J.I. Zucker, Computability of operators on continuous and discrete time streams.
Computability 3, 9–44 (2014)
38. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. 42, 230–265 (1936). With correction, ibid., 43, 544–546 (1937). Reprinted
in The Undecidable, ed. by M. Davis (Raven Press, New York, 1965)
39. K. Weihrauch, Computable Analysis: An Introduction (Springer, Berlin, 2000)
40. B. Xie, M.Q. Fu, J. Zucker, Characterizations of semicomputable sets of real numbers. J. Log.
Algebraic Methods Program. 84, 124–154 (2015)
Discrete Transfinite Computation
P.D. Welch
Abstract We describe various computational models based initially, but not exclusively, on that of the Turing machine, that are generalized to allow for transfinitely many computational steps. Variants of such machines are considered that have longer tapes than the standard model, or that work on ordinals rather than numbers. We outline the connections between such models and the older theories of recursion in higher types, generalized recursion theory, and recursion on ordinals such as α-recursion. We conclude that, in particular, polynomial time computation on ω-strings is well modelled by several convergent conceptions.
1 Introduction
We shall see that the various models link into several areas of modern logic:
besides recursion theory, set theory and the study of subsystems of second order
analysis play a role. Questions arise concerning the strengths of models that operate
at the level one type above that of the integers. This may be one of ordinal types: how
long a well ordered sequence of steps must a machine undertake in order to deliver
its output? Or it may be of possible output: if a machine produces real numbers,
which ordinals can be coded as output reals? And so on and so forth.
Running such a procedure on a Turing machine allows us to print out a Δ⁰₂ set A's characteristic function on the output tape. In order to do this we are forced to allow the machine to change its mind about n ∈ A and so repeatedly substitute a 0 for a 1 or vice versa in the n'th cell of the output tape. However, and this is the point, at most finitely many changes are to be made to that particular cell's value. It is this feature of not knowing at any given finite time whether further alterations are to be made, that makes this a transition from a computable set to a non-computable one.
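Concretely, by Shoenfield's limit lemma a set A is Δ⁰₂ iff A(n) = lim_s g(n, s) for some computable g; the machine's n-th output cell simply replays g(n, 0), g(n, 1), …, changing its mind finitely often. A toy Python sketch (with the approximating function g supplied by the caller):

    def replay_cell(g, n, stages):
        # Replay the writes to output cell n for the given number of stages;
        # the pair returned is the cell's current value and how many "mind
        # changes" have occurred so far. For a Delta^0_2 set the number of
        # changes stays finite as stages grows, and the value stabilises
        # at A(n).
        cell, changes = 0, 0
        for s in range(stages):
            v = g(n, s)
            if v != cell:
                cell, changes = v, changes + 1
        return cell, changes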
By a recursive division of the working area up into infinitely many infinite pieces, one can arrange for the correct computation of all the queries "m ∈ A?" to be done on the one machine, and the correct values placed on the output tape.
However this is as far as one can go if one imposes the (very obvious, practical) rule that a cell's value can only be altered finitely often. In order to get a Σ⁰₂ set's characteristic function written to the output tape, then in general one cannot guarantee that a cell's value is changed finitely often. Then immediately one is in the hazardous arena of supertasks.
Nevertheless let us play the mathematicians' game of generalizing for generalization's sake: let us by fiat declare a cell's value that has switched infinitely often 0 → 1 → 0 → ⋯ to be 0 at "time ω". With this lim inf declaration one has, mathematically at least, written down the Σ⁰₂-set on the output tape, again at time ω.
At limit stages λ the cell values C_i are set by a lim inf rule:

  C_i(λ) = k  if ∃α < λ ∀β < λ (α < β → C_i(β) = k), for k ∈ {0, 1};
         = 0  otherwise.
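A finite stand-in for this rule in Python (the genuinely transfinite rule samples the cell's values cofinally below λ; the list encoding is ours):

    def liminf_value(history):
        # history: 0/1 values the cell took at sample stages below the limit.
        # Mirrors the displayed rule: the limit value is 1 iff there is a
        # point from which on the cell is constantly 1; otherwise (the cell
        # is 0 cofinally often) the liminf is 0.
        for alpha in range(len(history)):
            if all(v == 1 for v in history[alpha:]):
                return 1
        return 0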
The R/W head we place according to the above, also using a modified Liminf
rule:
This is not exactly the arrangement that Hamkins and Lewis specified in [11] but
it is inessentially different from it (HL specified a special limit state q which the
machine entered into automatically at limit stages, and the head was always set back
to the start of the tape. They specified (which we shall keep here) that the machine
be a three tape machine).
Input then can consist of a set of integers, suitably coded as an element of 2^ω on ⟨C_{3i}⟩_i, and output likewise is such an element on ⟨C_{3i+2}⟩_i. Thus there is little difference between a machine with an oracle Z ⊆ ω and one acting on input Z coded onto the input tape. However we immediately see the possibility of higher type computation: we may have some Z ⊆ 2^ω and then we add a query state which asks if, say, the scratch tape ⟨C_{3i+1}⟩_i's contents is or is not an element of Z.
We have thus completely specified the ITTM’s behaviour. The scene is thus set
to ask what such machines are capable of. We defer discussion of this until Sect. 2,
whilst we outline the rest of this chapter here.
In one sense we have here a logician’s plaything: the Turing model has been
taken over and redesigned with a heavy-handed liminf rule of behaviour. This liminf
operation at limit stages is almost tantamount to an infinitary logical rule, and most
of the behaviour the machine exhibits is traceable to this rule. But then of course
it has to be, what else is there? Nevertheless this model and those that have been
studied subsequently have a number of connections or aspects with other areas of
logic. Firstly, with weak subsystems of analysis: it is immediately clear that the
behaviour of such machines is dependent on what ordinals are available. A machine
may now halt at some transfinite stage, or may enter an infinitely repeating loop;
but any theory that seeks to describe such machines fully is a theory which implies
the existence of sufficiently long wellorderings along which such a machine can run
(or be simulated as running). We may thus ask “What (sub-) system of analysis is
needed in which to discuss such a machine”? We shall see that machine variants
may require longer or shorter wellorderings, thus their theory can be discussed
within different subsystems.
Secondly, we can ask how the computable functions/sets of such a model fit in
with the earlier theories of generalized recursion theory of the 1960s and 1970s.
For example there is naturally associated with ITTM’s a so-called Spector Class of
sets. Such classes arise canonically in the generalized recursion theories of that era
through notions of definability.
Once one model has been defined it is very tempting to define variants. One
such is the Infinite Time Register Machine (ITRM’s—due to Koepke [23]) which
essentially does for Shepherdson-Sturgis machines what HL does for Turing
machines. Whilst at the finite level these two models are equal in power, their
infinitary versions differ considerably, the ITTM’s being much stronger. The ITRM
model is discussed in Sect. 3.
Just as for ordinary recursion on ω the TM model with a putative tape of order type ω length is used, so when considering notions of α-recursion theory for admissible ordinals α, it is possible to think of tapes also unfettered by having finite initial segments: we may consider machines with tapes of order type α and think of computing along α with such machines. What is the relation of this kind of computation to α-recursion theory?
One can contemplate even machines with an On-length tape. It turns out (Koepke [22]) that this delivers a rather nice presentation of Gödel's constructible hierarchy. Finally discussed here is the notion of a Blum-Shub-Smale machine ([3]) acting transfinitely. With some continuity requirement imposed on register contents for limit times, we see that functions such as exponentiation e^x which are not BSS computable, become naturally IBBS computable. Moreover there is a nice equivalence between their decidable reals, and those produced by the Safe Set Recursion ("SSR") of Beckmann, Buss, and S. Friedman, which can be thought of as generalizing to transfinite sets notions of polynomial time computable functions on integers. Briefly put, a polynomial time algorithm using ω as an input string should be halting by some time ω^n for some finite n. The IBBS computable reals are then identical to the SSR-computable reals. The background second order theory needed to run IBBS machines lies intermediate between WKL₀ and ATR₀.
The relation of ITTM’s to Kleene recursion is discussed in Sect. 2.
Hamkins and Lewis in [11] explore at length the properties of ITTM's: they demonstrate the natural notion of a universal such machine, and hence an S-m-n theorem and the Recursion Theorems. A number of questions immediately spring to mind:
Q. What is the halting set H = {e ∈ ω | P_e(0)↓}?
Here ⟨P_e⟩_e enumerates the usual Turing machine programs/transition tables (and we use P_e(x)↓y to denote that the e'th program on input x ∈ ω or in 2^ω halts with output y; if we are unconcerned about the y we omit reference to it). An ITTM computation such as this can now halt in ω or more steps. But how long should we wait to see if P_e(0)↓ or not? This is behind the following definitions.
Definition 2
(i) We write "P_e(n)↓_α y" if P_e(n)↓y in exactly α steps. We call α clockable if ∃e ∃n ∈ ω ∃y P_e(n)↓_α y.
(ii) A real y ∈ 2^ω is writable if there are n, e ∈ ω with P_e(n)↓y; an ordinal β is called writable if β has a writable code y.
We may consider a triple s(α) = ⟨l(α), q(α), ⟨C_i(α)⟩_i⟩ as a snapshot of a machine at time α, which contains all the relevant information at that moment. A computation is then given by a wellordered sequence of snapshots. There are two possible outcomes: there is some time α at which the computation halts, or else there must be some stage α₀ at which the computation enters the beginning of a loop, and from then on throughout the ordinals it must iterate through this loop. It is easy either by elementary arguments or simply by Löwenheim-Skolem, to see that such an α₀ must be a countable ordinal, and moreover that the periodicity of the cycling loop is likewise countable.
The property of being a "well-ordered sequence of snapshots in the computation P_e(x)" is Π¹₁ as a relation of e and x. Hence "P_e(x)↓y" is Δ¹₂:

∃w (w codes a halting computation of P_e(x), with y written on the output tape at the final stage) ⟺ ∀w (w codes a computation of P_e(x) that is either halting or performs a repeating infinite loop → w codes a halting computation with y on the output tape).

Likewise "P_e(x)↑" is also Δ¹₂. By the above discussion then it is immediate that the clockable and writable ordinals are all countable. Let λ =_df sup{α | α is writable}; let γ =_df sup{α | α is clockable}. Hamkins-Lewis showed that λ ≤ γ.
Q2 Is λ = γ?
Definition 3
(i) x^▽ = {e | P_e(x)↓} (the halting set on integers).
(ii) X^H = {(e, y) | P^X_e(y)↓} (the halting set on reals relativised to X ⊆ 2^ω).
This yields the halting sets, both for computations on integers and secondly on reals where by the latter we include the instruction for the ITTM to query whether the current scratch tape's contents considered as a real, is in X.
Definition 4
(i) R(x) is an ITTM-semi-decidable predicate if there is an index e so that:

  ∀x (R(x) ↔ P_e(x)↓1)
If we unpack the contents here, answers to our questions are given by (iii) and (iv). Let us take x = ∅ so that we may consider the unrelativised case. Our machine-theoretic structure and operations are highly absolute and it is clear that running the machine inside the constructible hierarchy of L_α's yields the same snapshot sequence as considering running the machine in V. If P_e(n)↓ then this is a Σ₁-statement (in the language of set theory). As halting is merely a very special case of stabilization, then we have that

  P_e(n)↓ ⟹ P_e(n)↓ before λ

(the latter because L_λ ≺_{Σ₁} L_ζ). Hence the computation must halt before λ. Hence the answer to Q2 is affirmative: every halting time (of an integer computation) is a writable ordinal. One quickly sees that a set of integers is ITTM-decidable if and only if it is an element of L_λ. It is ITTM-semi-decidable if and only if it is Σ₁(L_λ).
Since the limit rules for ITTM's are intrinsically of a Σ₂-nature, with hindsight it is perhaps not surprising that this would feature in the (ζ, Σ) pair arising as they do: after all the snapshot of the universal ITTM at time ζ is going to be coded into the Σ₂-theory of this L_ζ. The universality of the machine is then apparent in the fact that by stage ζ it will have "constructed" all the constructible sets in L_ζ.
[Display comparing the analogues of Kleene's 𝒪 with the levels L_{ω₁^{ck}}, L_λ, L_ζ.]

Kleene considered recursion in the type-2 functional

  E(x) = 0 if ∃n x(n) = 0;
       = 1 otherwise.
The reason for this was, although for any oracle I the class of relations semi-decidable in I was closed under ∀ quantification, when semi-decidable additionally in E it becomes closed under ∃ quantification. The Kleene semi-decidable sets then would include the arithmetic sets in 2^ω (or further products thereof). (Ensuring computations be relative to E also guarantees that we have the Ordinal Comparison Theorem.)
The decidable relations turn out to be the hyperarithmetic ones, and the semi-decidable are those Kleene-reducible to WO, the latter being a complete Π¹₁ set of reals. Thus:
Theorem 3 (Kleene) The hyperarithmetic relations R(n⃗, x⃗) ⊆ ω^k × (ω^ω)^l, for any k, l ∈ ω, are precisely those computable in E.
The Π¹₁ relations are precisely those semi-computable in E.
Then a reducibility ordering comes from:
Then a reducibility ordering comes from:
Definition 6 (Kleene Reducibility) Let A; B ; we say that A is Kleene-semi-
computable in B iff there is an index e and y 2 so that
ordinal not (ordinary) Turing recursive in x) satisfies the Spector Criterion ./ above.
For sets of reals B we may extend this notation and let !1B;x ck be the ordinal height
˛ of the least model of KP set theory (so the least admissible set) of the form
L˛ Œx; B ˆ KP.
With this we may express A K B as follows:
Lemma 1 A K B iff there are †1 -formulae in L2;XP '1 .X;
P v0 ; v1 /; '2 .X;
P v0 ; v1 /,
and there is y 2 , so that
Back to ITTM-semidecidability:
The notion of semi-decidability comes in two forms.
Definition 7
(i) A set of integers x is semi-decidable in a set y if and only if:

  ∃e ∀n (P^y_e(n)↓1 ↔ n ∈ x)

(ii) A set of integers x is decidable in a set y if and only if both x and its complement is semi-decidable in y. We write x ≤_∞ y for the reducibility ordering.
(iii) A set of integers x is eventually-(semi)-decidable in a set y if and only if the above holds with ↑ replacing ↓. For this reducibility ordering we write x ≤ᵉ_∞ y.
We then get the analogue of the Spector criterion using x^▽ as the jump operator:
Lemma 2
(i) The assignment x ↦ x^▽ satisfies the Spector Criterion:

  x ≤_∞ y → (x^▽ ≤_∞ y ↔ λ^x < λ^y).

(ii) Similarly for the eventual reducibility:

  x ≤ᵉ_∞ y → (x^▽ ≤ᵉ_∞ y ↔ ζ^x < ζ^y).
One can treat the above as confirmation that the ITTM degrees and jump operation are more akin to hyperarithmetic degrees and the hyperjump, than to the (standard) Turing degrees and Turing jump. Indeed they are intermediate between hyperdegrees and Δ¹₂-degrees.
To see this, we define a notion of degree using definability and Turing-invariant functions on reals (by the latter we mean a function f : 2^ω → ω₁ such that x ≡_T y → f(x) = f(y)). Now assume that f is Σ₁-definable over (HC, ∈) without parameters, by a formula in ℒ_∈̇.
Definition 8 Let f be as described; let Φ be a class of formulae of ℒ_∈̇. Then Γ_{f,Φ} is the pointclass of sets of reals A so that A ∈ Γ_{f,Φ} if and only if there is φ ∈ Φ with:

  x ∈ A ⟺ L_{f(x)}[x] ⊨ φ[x].

With the function f(x) = ω₁^{x,ck} and Φ as the class of Σ₁-formulae we have that Γ_{f,Φ} coincides with the Π¹₁-sets of reals (by the Spector-Gandy Theorem). Replacing f with the function g(x) = λ^x then yields the (lightface) ITTM-semi-decidable sets.
Lemma 1 is then the relativisation of Kleene recursion which yields the relation A ≤_K B.
We now make the obvious definition:
Definition 9
(i) A set of reals A is semi-decidable in a set of reals B if and only if:

  ∃e ∀x ∈ 2^ω (P^B_e(x)↓1 ↔ x ∈ A)

(ii) A set of reals A is decidable in a set of reals B if and only if both A and its complement is semi-decidable in B.
(iii) If in the above we replace ↓ everywhere by ↑ then we obtain in (i) the notion of A is eventually semi-decidable in B and in (ii) of A is eventually decidable in B.
Then the following reducibility generalizes that of Kleene recursion.
Definition 10
(i) A ≤_∞ B iff for some e ∈ ω, for some y ∈ 2^ω: A is decidable in (y, B).
(ii) A ≤ᵉ_∞ B iff for some e ∈ ω, for some y ∈ 2^ω: A is eventually decidable in (y, B).
Again a real parameter has been included here in order to have degrees
closed under continuous pre-images. We should expect that these reducibilities are
dependent on the ambient set theory, just as they are for Kleene degrees: under
V = L there are many incomparable degrees below that of the complete semi-
decidable degree, and under sufficient determinacy there will be no intermediate
degrees between the latter and 0, and overall the degrees will be wellordered. Now
we get the promised analogy lifting Lemma 1, again generalizing in two ways
depending on the reducibility.
Lemma 3
(i) A ≤_∞ B iff there are Σ₁-formulae in ℒ_{∈,Ẋ}: φ₁(Ẋ, v₀, v₁), φ₂(Ẋ, v₀, v₁), and y ∈ 2^ω, so that
We have not formally defined all the terms here: λ^{B,y,x} is the supremum of the ordinals written by Turing programs acting transfinitely with oracles for B, y. The ordinal ζ^{B,y,x} is the least that is not ITTM-(B, x, y)-eventually-semi-decidable. There is a corresponding λ-ζ-Σ-theorem and thus we have also that this ζ^{B,y,x} is least such that L_{ζ^{B,y,x}}[B, y, x] has a proper Σ₂-elementary end-extension in the L[B, y, x] hierarchy.
The degree analogy here should be pursued with hyperdegrees rather than Turing degrees. It is possible to iterate the jump hierarchy through the ≤_∞-degrees, and one finds that, inside L, the first iterations form a linearly ordered hierarchy with least upper bounds at limit stages. We emphasise this as being inside L since one can show that there is no least upper bound to {0^{▽n} | n < ω}, but rather continuum many minimal upper bounds (see [37]). We don't itemize these results here but refer the reader instead to [38].
A more general but basic open question is:
Q If D = {d_n : n < ω} is a countable set of ≤_∞-degrees, does D have a minimal upper bound?
The background to this question is varied: for hyperdegrees this is also an open question. Under Projective Determinacy a positive answer is known for Δ¹₂ₙ-degrees, but for Δ¹₂ₙ₊₁-degrees this is open, even under PD. Minimal infinite time (≤_∞-) degrees can be shown to exist by similar methods, using perfect set forcing, to those of Sacks for minimal hyperdegrees (again see [37]).
One can also ask at this point about the nature of Post's problem for semi-decidable sets of integers. By the hyperdegree analogy one does not expect there to be incomparable such sets below 0^▽ and indeed this turns out to be the case [12].
to workers in the latter area around 2000, until Benedikt Löwe pointed out [27]
the similarity between the Herzberger revision sequence formalism and that of
the machines. It can be easily seen that any Herzberger sequence with starting
distribution of truth values x say, can be mimicked on an ITTM with input x.
Thus this is one way of seeing that Herzberger sequences must have a stability pair lexicographically no later than (ζ, Σ). Burgess had shown that H-sequences then loop at no earlier pair of points. More recently Field [7] has used a revision theoretic definition with a Π¹₁-quasi-inductive operator to define a variant theory of truth. For all three formalisms, Field's, Burgess's AQI, and ITTM's, although differing considerably in theory, the operators are all essentially equivalent as is shown in [40], since they produce recursively isomorphic stable sets. The moral to be drawn from this is that in essence the strength of the liminf rule is at play here, and seems to swamp all else.
Several questions readily occur once one has formulated the ITTM model. Were any features chosen crucial to the resulting class of computable functions? Do variant machines produce different classes? Is it necessary to have three tapes in the machine? The answer to the latter question is both yes and no. First the affirmative part: it was shown in [14] that the class of functions f : 2^ℕ → ℕ remains the same if 3 tapes are replaced by 1, but not the class of functions f : 2^ℕ → 2^ℕ. The difficulty is somewhat arcane: one may simulate a 3-tape machine on a 1-tape machine, but to finally produce the output on the single tape and halt, some device is needed to tell the machine when to finish compacting the result down onto the single tape, and they show that this cannot be coded on a 1-tape machine. On the other hand [39] shows that if one adopts an alphabet of three symbols this can be done, and the class of functions f : 2^ℕ → 2^ℕ is then the same. One may also consider a B for "Blank" as the third symbol, and change the liminf rule so that if cell Cᵢ has varied cofinally in a limit ordinal λ, then Cᵢ(λ) is set to be blank (thus nodding towards the ambiguity of the cell value). With this alphabet and liminf rule a 1-tape machine computes the same classes as a 3-tape machine, and these are both the same as those computed by the original ITTM.
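Written out, the blank-symbol limit rule just described takes the following form (our rendering):

    Cᵢ(λ) = 0,  if ∃α < λ ∀β ∈ (α, λ) Cᵢ(β) = 0;
    Cᵢ(λ) = 1,  if ∃α < λ ∀β ∈ (α, λ) Cᵢ(β) = 1;
    Cᵢ(λ) = B,  otherwise, i.e. if the value of Cᵢ varied cofinally in λ.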
What of the liminf rule itself? We have just mentioned a variant in the last paragraph. Our original liminf rule is essentially of a Σ₂ nature: a value of 1 is in a cell Cᵢ(λ) at limit time λ if there is an α < λ such that for all β ∈ (α, λ) we have Cᵢ(β) = 1. Running a machine inside L one sees that the snapshot s(λ) is a predicate that is Σ₂-definable over L_λ. It was observed in [38] that the liminf rule is complete for all other rules Σ₂-definable over limit levels L_λ, in that for any other such rule the stability set obtained for the universal machine (on 0 input) with such a rule is (1–1) Σ₂-definable over L_ζ and thus is (1–1) reducible to the Σ₂-truth set for L_ζ. However the latter is recursively isomorphic to the stability set for the universal ITTM by Corollary 1, and hence the standard stability set subsumes that of another machine
with a different Σ₂-rule. Given the Σ₂ nature of the limit rule, with hindsight one sees that it is obvious that, with (ζ, Σ) defined to be the lexicographically least pair with L_ζ ≺_{Σ₂} L_Σ, we must have that the universal ITTM enters a loop at ζ. That it cannot enter one earlier is of course the λ-ζ-Σ-Theorem, but a vivid way to see that this is the case is afforded by the construction in [8], which demonstrated that there is a non-halting ITTM program continually producing on its output tape sets of integers coding levels L_α of the constructible hierarchy for ever larger α below Σ; at stage Σ it would perforce produce the code for L_ζ and then forever cycle round this loop, producing codes for levels α ∈ [ζ, Σ).
More complex rules lead to more complex machines. These were dubbed "hypermachines" in [9], where a machine was defined with a Σ₃-limit rule, and this was shown to be able to compute codes for L_α for α < Σ(3), where now ζ(3) < Σ(3) is the lexicographically least pair with L_{ζ(3)} ≺_{Σ₃} L_{Σ(3)}. The stability set was now that arising from the snapshot at stage ζ(3), and was (1–1) equivalent to the Σ₃-truth set for this level of L. Inductively then one defines Σ₄, Σ₅, …, Σₙ, … limit rules with the analogous properties. I think it has to be said though that the definitions become increasingly complex and, even for n = 3, mirror more the structure of L in these regions, with its own "stable ordinals", than anything machine-inspired. With these constructions one can then "compute" any real that is in L_γ where γ = supₙ ζ(n).
Generalizations of the ITTM machine are possible in different directions. One can consider machines with tapes not of cells of order type ω but of longer types. Some modifications are needed: what do we do if the program asks the R/W head to move one step leftwards when hovering over a cell C_λ for λ a limit ordinal? There are some inessentially different choices to be made, which we do not catalogue here, but assume some fixed choices have been made.
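For example, one natural such choice, in the style of the liminf rules above and used for the ordinal machines of [21], is to let the head position H itself obey a liminf rule at limit stages:

    H(λ) = liminf_{α→λ} H(α).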
We consider first the extreme possibility that the tape is of length On, that is, of the order type of the class of all ordinals. We now have the possibility that arbitrary sets may be computed by such machines. Independently Dawson and Koepke came up with this concept. There are some caveats: how do we know that we can "code" sets by transfinite strings of 0,1's at all? Dawson [6] formulated an Axiom of Computability which said that every set could appear coded on the output tape of such a machine at some stage whilst it was running; thus for any set z there would be a program number e with P_e (not necessarily halting) having a code for z appear on its output tape. He then argued that the class of such sets is a model of ZFC, and by studying the two-dimensional grid of snapshots produced a Löwenheim-Skolem type argument to justify that the Axiom of Computability implies the Generalized Continuum Hypothesis. That the class of computable sets satisfies AC falls out of the assumption that sets can be coded by strings and that such strings can be ordered. Since this machine's operations are again very absolute, it may be run inside L,
thus demonstrating that the "computable sets" are nothing other than the constructible sets. Koepke in [21] and later with Koerwien in [22] considered instead halting computations starting with an On-length tape marked with finitely many 1's in certain ordinal positions (n, α₁, …, αₙ), and asked for a computation as to whether (φₙ(α₁, …, αₙ₋₁))^{L_{αₙ}} was true. Thus the machine was capable of computing a truth predicate for L. This leads to:
Theorem 5 (Koepke [21]) A set x ⊆ On is On-ITTM-computable from a finite set of ordinal parameters if and only if it is a member of the constructible hierarchy.
One might well ask whether the computational approach to L might lead to some new proofs of, or at least new information on, some of the deeper fine structural and combinatorial properties of L. However this hope turned out to be seemingly thwarted by the Σ₂ nature of the limit rule. Fine structural arguments are very sensitive to definability issues, and in constructions such as that for Jensen's ◻ principle, say, we need to know when or how ordinals are singularised for any n, including n = 1, and the limit rule works against this. Moreover, alternatives such as the Silver Machine model, which was specifically designed to by-pass Jensen's fine structural analysis of L, make heavy use of a Finiteness Property, that everything appearing at a successor stage can be defined from the previous stages and a finite set of parameters; this just does not seem to work for On-ITTM's.
However this does bring to the fore the question of shortening the tapes to some admissible ordinal length α > ω, say, and asking what are the relations between α-ITTM's and the α-recursion theory developed in the late 1960s and early 70s. The definitions of that theory included that a set A ⊆ α which is Σ₁(L_α) was called α-recursively enumerable (α-r.e.). It was α-recursive if both it and its complement are α-r.e., and thus it is Δ₁(L_α). A notion of relative α-recursion was defined but then noticed to be intransitive; a stronger notion was defined and denoted by A ≤_α B. Koepke and Seyfferth in [24] define "A is computable in B" to mean that the characteristic function of A can be computed by a machine in α many stages from an oracle for B. This is exactly the relation A ∈ Δ₁(L_α[B]). This has the advantage that the notion of α-computability and the associated α-computable enumerability (α-c.e.) tie up exactly with the notions of α-recursiveness and α-r.e.-ness. They then reprove the Sacks-Simpson theorem solving Post's problem: namely that there are two α-c.e. sets neither of which is computable, in their sense, from the other.
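In symbols, the notions just described line up as follows (our summary of the definitions above):

    A ⊆ α is α-r.e.  ⟺  A ∈ Σ₁(L_α);
    A is α-recursive  ⟺  A ∈ Δ₁(L_α);
    A is computable in B (in the Koepke-Seyfferth sense)  ⟺  A ∈ Δ₁(L_α[B]).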
However the relation "is computable in" again suffers from being intransitive. Dawson defines the notion of α-sequential computation, which requires the output to the α-length tape to be written in sequence without revisions. This gives him a transitive notion of relative computability: a set is α-computable if and only if it is α-recursive, and it is α-computably enumerable if and only if it is both α-r.e. and regular. Since Sacks had shown [31] that any α-degree of α-r.e. sets contains a regular set, he then has that the structure of the α-degrees of the α-r.e. sets in the classical, former, sense is isomorphic to that of the α-degrees of the α-c.e. sets. This implies that theorems of classical α-recursion theory about α-r.e. sets whose proofs rely on, or use, regular α-r.e. sets will carry over to his theory. This includes
the Sacks-Simpson result alluded to. The Shore Splitting Theorem [34], which states that any regular α-r.e. set A may be split into two disjoint α-r.e. sets B₀, B₁ with A ≰_α Bᵢ, is less amenable to this kind of argument, but with some work the Shore Density Theorem [35], that between any two α-r.e. sets A <_α B there lies a third α-r.e. set C with A <_α C <_α B, can be achieved. As Sacks states in his book, the latter proof seems more bound up with the finer structure of the constructible sets than the other α-recursion theory proofs. Dawson generalizes this by lifting his notion of α-computation to that of an M-α-computation, where now M =_df ⟨J_α, ∈, E⟩ is an admissible, acceptable, and sound structure for an E ⊆ α. These assumptions make J_α sufficiently L-like to rework the Shore argument to obtain:
Theorem 6 (Dawson: The α-c.e. Density Theorem) Let M be as above. Let A, B be two M-α-c.e. sets, with A <_{M,α} B. Then there is C, also M-α-c.e., with A <_{M,α} C <_{M,α} B.
Once the step has been taken to investigate ITTM’s, one starts looking at other
machine models and sending them into the transfinite. We look here at Infinite
Time Register Machines (ITRM’s) both with integer and ordinal registers, and lastly
comment on Infinite Time Blum-Shub-Smale Machines (IBSSM’s).
At limit stages λ the register contents of an ITRM are given by:

    Rᵢ(λ) =_df liminf_{α→λ} Rᵢ(α) if this is finite; otherwise we set Rᵢ(λ) = 0.
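For example, a register that is eventually constant below a limit stage keeps its value, whereas an unbounded one is reset:

    if Rᵢ(α) = 5 for all sufficiently large α < λ, then Rᵢ(λ) = 5;
    if Rᵢ(n) = n for all n < ω, then liminf_{n→ω} Rᵢ(n) = ω is infinite, and so Rᵢ(ω) = 0.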
Although perhaps not apparent at this point, it is this "resetting to zero" that gives the ITRM its surprising strength: by a suitable stage the rule will have already applied, and the computation, if still running, will be looping. One then shows by induction that each extra register added to the architecture requires a further admissible ordinal of run time to guarantee looping behaviour. One thus arrives at the property that any ordinal below ω_ω^{ck} (the first limit of admissibles) is clockable by such an ITRM, and thence that the halting sets Hₙ can be computed by a large enough device. We can state this more formally:
Thus the assertion that these machines either halt or exhibit looping behaviour turns out to be equivalent to a well-known subsystem of second order number theory, namely Π¹₁-CA₀. Let ITRM_N be the assertion: "The N-register halting set H_N exists." Further, let ITRM be the similar relativized statement that "For any Z ⊆ ω, for any N < ω, the N-register halting set H_N^Z exists." Then more precisely:
Theorem 8 (Koepke-Welch [26])
(i) Π¹₁-CA₀ ⊢ ITRM. In particular:
KP + "there exist N + 1 admissible ordinals > ω" ⊢ ITRM_N.
(ii) ATR₀ + ITRM ⊢ Π¹₁-CA₀.
In particular there is a fixed k < ω so that for any N < ω …
We mention finally here the notion studied by Koepke and Siders of Ordinal Register Machines (ORM's [25]): essentially these are the devices above, but extended to have ordinal-valued registers. Platek (in private correspondence) indicated that he had originally considered his equational calculus on recursive ordinals as being implementable on some kind of ordinal register machine. Siders also had been thinking of such machines, and in a series of papers with Koepke considered the unbounded ordinal model. The resetting liminf rule is abandoned, and natural liminfs are taken. Now ordinal arithmetic can be performed. Remarkably, given the paucity of resources apparently available, one has a theorem similar to that for the On-ITTM:
Theorem 9 (Koepke-Siders [25]) A set x ⊆ On is ORM-computable from a finite set of ordinal parameters if and only if it is a member of the constructible hierarchy.
They implement an algorithm that computes the truth predicate T ⊆ On for L and which is ORM-computable on a 12-register machine (even remarking that this can be reduced to 4!). From T a class of sets S can be computed which is a model of their theory SO, which is indeed the constructible hierarchy.
…computation that halts in polynomial time from ω, the length of the input. Hence the calculation should halt by some ω^n for an n < ω. They have:
Theorem 10 ([1]) Let f be any SRSF. Then there is an ordinal polynomial q_f in variables α⃗ so that

    rk(f(a⃗ / b⃗)) ≤ maxᵢ rk(bᵢ) + q_f(rk(a⃗)).
Thus the typing of the variables ensures that the ranks of sets computed as outputs from an application of an SRS-function are polynomially bounded in the ranks of the input. Using an adaptation of a result of Arai, such functions on finite strings correspond to polynomial time functions in the ordinary sense. For ω-strings we have that such computations halt by a time polynomial in ω. As mentioned by Schindler, it is natural to define "polynomial time" for ITTM's to be those calculations that halt by stage ω^ω, and a polynomial time ITTM function to be one that, for some N < ω, terminates on all inputs by time ω^N. We thus have:
Theorem 11 The following classes of functions of the form F : (2^ℕ)^k → 2^ℕ are extensionally equivalent:
(I) Those functions computed by a continuous IBSSM machine;
(II) Those functions that are polynomial time ITTM;
(III) Those functions that are safe recursive set functions.
Proof We take k = 1. We just sketch the ideas and the reader may fill in the details. By Koepke-Seyfferth, for any IBSSM computable function there is N < ω so that the function is computable in less than ω^N steps. We may thus consider that computation to be performed inside L_{ω^N}[x] and so potentially simulable in polynomial time (in ω^M steps, for some M) by an ITTM. However this can be realised: a code for any L_α[x] for α ≤ ω^N, x ∈ 2^ℕ, together with its theory, may be computed by an ITTM (uniformly in the input x) by time ω^{N+3}, by the argument of Lemma 2 of Friedman and Welch [8]. Since we have the theory, we have the digits of the final halting IBSSM-output (or otherwise the fact that it is looping or has crashed, respectively, since these are also part of the set theoretical truths of L_{ω^N}[x]). Thus (II) ⊇ (I). If F is in the class (II), then for some N < ω, F(x) is computable within L_{ω^N}[x], and by setting up the definition of the ITTM program P computing F we may define some α such that the output of that program P on x (i.e. F(x)) is the α'th element of 2^ℕ in L_{ω^N}[x], uniformly in x. However the set L_{ω^N}[x] is SRSF-recursive from ω ∪ {x} (again uniformly in x), as is a code for α. This yields the conclusion that we may find, uniformly, the output of P(x) using the code for α, again as the output of an SRSF-recursive-in-x function. This renders (II) ⊆ (III).
Finally, if F is in (III) (we shall assume that the variable x is in a safe variable place, but actually the case where there are both normal and safe variables is handled no differently here), then there is (cf. [1], 3.5) a finite N and a Σ₁-formula φ(v₀, v₁) so that F(x) = z iff L_{ω^N}[x] ⊨ φ[x, z] (using here that TC(x) = ω and thus rk(x) = ω). Indeed we may assume that z is named by the canonical Σ₁-Skolem function h for, say, L_{ω^N + ω}[x] as h(i, n) for some n < ω. Putting this together we have some
Σ₁ formula ψ(v₀) (in the language with symbols ẋ, ∈̇) so that F(x)(k) = z(k) = 1 iff L_{ω^N + ω}[x] ⊨ ψ[k]. In short, to be able to determine such an F(x) by an IBSSM it suffices to be able to compute the Σ₁-truth sets for L_α[x] for all α < ω^ω by IBSSM's. There are a variety of ways one could do this, but it is well known that calculating the α'th iterates of the Turing jump relativised to x for α < ω^ω would suffice. To simplify notation we shall let x also denote the set of integers in the infinite fractional expansion of the real x.
So fix a k < ω; to see that we may calculate x^{(β)} for β < ω^k, one first constructs a counter to be used in general iterative processes, using registers C₀, …, C_{k−1} say, whose contents represent the integer coefficients in the Cantor normal form of β < ω^k, where we are at the β'th stage in the process. (The counter of course must conform to the requirement that registers are continuous at limits λ ≤ ω^k. This can be devised using reciprocals and repeated division by 2, rather than incrementation by 1 each time.) We assume this has been done so that in particular C₀ = C₁ = ⋯ = C_{k−1} = 0 occurs first at stage ω^k. We then code the characteristic function of {m ∈ ω | m ∈ W_m^{x^{(β)}}} as 1/0's in the digits at the s'th places after the decimal point of R₁, where s is of the form

    s = p_{k+m} · p₀^{n₀+1} ⋯ p_{k−1}^{n_{k−1}+1},

where p₀ = 2, p₁ = 3, etc., enumerates the primes, and n_j is the coefficient of ω^j in the Cantor normal form of β.
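For instance (our illustration, under the reading of s just given): with k = 2 and β = ω·2 + 3 the coefficients are n₀ = 3 and n₁ = 2, so the m'th bit at stage β is recorded at place

    s = p_{2+m} · 2^{3+1} · 3^{2+1} = p_{2+m} · 16 · 27

after the decimal point of R₁.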
For limit stages λ < ω^k, continuity of the register contents automatically ensures that this real in R₁ also codes the disjoint union of the x^{(β)} for β < λ, and at stage ω^k we have the whole sequence of jumps encoded as required. Q.E.D.
6 Conclusions
The avenues of generalization of the Turing machine model into the transfinite which we have surveyed give rise to differing perspectives and a wealth of connections. Higher type recursion theory, which the models most nearly approximate, to a greater or lesser extent, was a product of Kleene's generalization of the equational calculus approach to recursive functions. The machines discussed here lie more on the Turing side of the balance. Some of the other generalizations of recursion theory, say to meta-recursion theory, as advocated by Kreisel and elucidated by Sacks and his school, and which later became ordinal α-recursion theory, we have not discussed here in great detail; but again their motivations came from the recursion-theoretic side, rather than any "computational-model-theoretic" direction. The models discussed in this chapter thus fill a gap in our thinking.
Referring to the last section, we find that, rather as with a Church's thesis, we have here an effective system for handling ω-strings in polynomial time, as formalized by the SRSF's, and a natural corresponding computational model of ITTM's working with calculations halting by a time earlier than ω^ω. The model of computation with the continuous-limit IBSSM's then also computes the same functions. Note that assertions such as "every continuous IBSSM halts, loops, or becomes discontinuous", when formalized in second order arithmetic, are intermediate between ACA₀ and ATR₀. There is much to be said for the IBSSM model over its finite version: we have remarked that the infinite version calculates power series functions, such as sin and eˣ. With a little work one sees also that if any differentiable function f : ℝ → ℝ is IBSSM computable, then so is its derivative f′.
On the other hand, the class of sets that ITTM's compute forms a Spector class, and so we can bring to bear general results about such classes on the ITTM semi-decidable and eventually semi-decidable classes; their strength, we saw, was considerable: between Π¹₂-CA₀ and Π¹₃-CA₀. Finally the On-tape version of the ITTM gives us a new presentation of the constructible hierarchy, as laid out by an ordinary Turing program progressing throughout On time.
References
1. A. Beckmann, S. Buss, S.-D. Friedman, Safe Recursive Set Functions (Centre de Recerca
Matematica Document Series, Barcelona, 2012)
2. S. Bellantoni, S. Cook, A new recursion-theoretic characterization of the poly-time functions.
Comput. Complex. 2, 97–110 (1992)
3. L. Blum, M. Shub, S. Smale, On a theory of computation and complexity over the real numbers.
Bull. Am. Math. Soc. 21(1), 1–46 (1989)
4. J.P. Burgess, The truth is never simple. J. Symb. Log. 51(3), 663–681 (1986)
5. M. Carl, T. Fischbach, P. Koepke, R. Miller, M. Nasfi, G. Weckbecker, The basic theory of
infinite time register machines. Arch. Math. Log. 49(2), 249–273 (2010)
6. B. Dawson, Ordinal time Turing computation. Ph.D. thesis, Bristol (2009)
7. H. Field, A revenge-immune solution to the semantic paradoxes. J. Philos. Log. 32(3), 139–177
(2003)
8. S.-D. Friedman, P.D. Welch, Two observations concerning infinite time Turing machines, in
BIWOC 2007 Report, ed. by I. Dimitriou (Hausdorff Centre for Mathematics, Bonn, 2007),
pp. 44–47. Also at http://www.logic.univie.ac.at/sdf/papers/joint.philip.ps
9. S.-D. Friedman, P.D. Welch, Hypermachines. J. Symb. Log. 76(2), 620–636 (2011)
10. E. Gold, Limiting recursion. J. Symb. Log. 30(1), 28–48 (1965)
11. J.D. Hamkins, A. Lewis, Infinite time Turing machines. J. Symb. Log. 65(2), 567–604 (2000)
12. J.D. Hamkins, A. Lewis, Post’s problem for supertasks has both positive and negative solutions.
Arch. Math. Log. 41, 507–523 (2002)
13. J.D. Hamkins, R. Miller, Post’s problem for ordinal register machines: an explicit approach.
Ann. Pure Appl. Log. 160(3), 302–309 (2009)
14. J.D. Hamkins, D. Seabold, Infinite time Turing machines with only one tape. Math. Log. Q.
47(2), 271–287 (2001)
15. H.G. Herzberger, Notes on naive semantics. J. Philos. Log. 11, 61–102 (1982)
16. S.C. Kleene, Recursive quantifiers and functionals of finite type I. Trans. Am. Math. Soc. 91,
1–52 (1959)
17. S.C. Kleene, Turing-machine computable functionals of finite type I, in Proceedings 1960
Conference on Logic, Methodology and Philosophy of Science (Stanford University Press,
1962), pp. 38–45
18. S.C. Kleene, Turing-machine computable functionals of finite type II. Proc. Lond. Math. Soc.
12, 245–258 (1962)
19. S.C. Kleene, Recursive quantifiers and functionals of finite type II. Trans. Am. Math. Soc.
108, 106–142 (1963)
20. A. Klev, Magister Thesis (ILLC, Amsterdam, 2007)
21. P. Koepke, Turing computation on ordinals. Bull. Symb. Log. 11, 377–397 (2005)
Discrete Transfinite Computation 185
22. P. Koepke, M. Koerwien, Ordinal computations. Math. Struct. Comput. Sci. 16(5), 867–884
(2006)
23. P. Koepke, R. Miller, An enhanced theory of infinite time register machines, in Logic and the
Theory of Algorithms, ed. by A. Beckmann et al. Springer Lecture Notes in Computer Science,
vol. 5028 (Springer, Swansea, 2008), pp. 306–315
24. P. Koepke, B. Seyfferth, Ordinal machines and admissible recursion theory. Ann. Pure Appl.
Log. 160(3), 310–318 (2009)
25. P. Koepke, R. Siders, Computing the recursive truth predicate on ordinal register machines, in
Logical Approaches to Computational Barriers, ed. by A. Beckmann et al. Computer Science
Report Series (Swansea, 2006), p. 21
26. P. Koepke, P.D. Welch, A generalised dynamical system, infinite time register machines, and
Π¹₁-CA₀, in Proceedings of CiE 2011, Sofia, ed. by B. Löwe, D. Normann, I. Soskov, A.
Soskova (2011)
27. B. Löwe, Revision sequences and computers with an infinite amount of time. J. Log. Comput.
11, 25–40 (2001)
28. M. Minsky, Computation: Finite and Infinite Machines (Prentice-Hall, Upper Saddle River,
1967)
29. H. Putnam, Trial and error predicates and the solution to a problem of Mostowski. J. Symb.
Log. 30, 49–57 (1965)
30. H. Rogers, Recursive Function Theory. Higher Mathematics (McGraw, New York, 1967)
31. G.E. Sacks, Post’s problem, admissible ordinals and regularity. Trans. Am. Math. Soc. 124,
1–23 (1966)
32. G.E. Sacks, Higher Recursion Theory. Perspectives in Mathematical Logic (Springer,
New York, 1990)
33. J. Shepherdson, H. Sturgis, Computability of recursive functionals. J. Assoc. Comput. Mach.
10, 217–255 (1963)
34. R.A. Shore, Splitting an ˛ recursively enumerable set. Trans. Am. Math. Soc. 204, 65–78
(1975)
35. R.A. Shore, The recursively enumerable ˛-degrees are dense. Ann. Math. Log. 9, 123–155
(1976)
36. J. Thomson, Tasks and supertasks. Analysis 15(1), 1–13 (1954/1955)
37. P.D. Welch, Minimality arguments in the infinite time Turing degrees, in Sets and Proofs: Proc.
Logic Colloquium 1997, Leeds, ed. by S.B. Cooper, J.K. Truss. London Mathematical Society
Lecture Note Series, vol. 258 (Cambridge University Press, Cambridge, 1999)
38. P.D. Welch, Eventually infinite time Turing degrees: infinite time decidable reals. J. Symb.
Log. 65(3), 1193–1203 (2000)
39. P.D. Welch, Post’s and other problems in higher type supertasks, in Classical and New
Paradigms of Computation and Their Complexity Hierarchies. Papers of the Conference
Foundations of the Formal Sciences III, ed. by B. Löwe, B. Piwinger, T. Räsch. Trends in
Logic, vol. 23 (Kluwer, Dordrecht, 2004), pp. 223–237
40. P.D. Welch, Ultimate truth vis à vis stable truth. Rev. Symb. Log. 1(1), 126–142 (2008)
41. P.D. Welch, Characteristics of discrete transfinite Turing machine models: halting times,
stabilization times, and normal form theorems. Theor. Comput. Sci. 410, 426–442 (2009)
Semantics-to-Syntax Analyses of Algorithms
Yuri Gurevich
Y. Gurevich
Microsoft Research, Redmond, WA, USA
e-mail: gurevich@microsoft.com
1 Introduction
This article is a much revised and extended version of our Foundational Analyses of
Computation [16].
1.1 Terminology
For the sake of brevity we introduce the term species of algorithms to mean a class
of algorithms given by semantical constraints. We are primarily interested in large
species like sequential algorithms or analog algorithms.
Q:¹ Contrary to biological species, yours are not necessarily disjoint. In fact, one of your species may include another as a subspecies.
A: This is true. For example, the species of sequential-time algorithms, that
execute step after step, includes the species of sequential algorithms, with
steps of bounded complexity.
Q: The semantic-constraint requirement seems vague.
A: It is vague. Its purpose is just to distinguish the analyses of algorithms that we focus upon here from other analyses of algorithms in the literature.
¹ Q is our inquisitive friend Quisani, and A is the author.
² Here and below a numerical function is a function f(x₁, …, x_j), possibly partial, of finite arity j, where the arguments xᵢ range over natural numbers, and the values of f, when defined, are natural numbers.
2 Turing
Alan Turing analyzed computation in his 1936 paper “On Computable Numbers,
with an Application to the Entscheidungsproblem” [23]. All unattributed quotations
in this section are from that paper.
The Entscheidungsproblem is the problem of determining whether a given first-
order formula is valid. The validity relation on first-order formulas can be naturally
represented as a real number, and the Entscheidungsproblem becomes whether
this particular real number is computable. “Although the subject of this paper
is ostensibly the computable numbers, it is almost equally easy to define and
investigate computable functions of an integral variable or a real or computable
variable, computable predicates, and so forth. The fundamental problems involved
are, however, the same in each case, and I have chosen the computable numbers for
explicit treatment as involving the least cumbrous technique.”
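Concretely, fix an effective enumeration φ₀, φ₁, … of the first-order sentences (the enumeration and the particular binary coding are our choices for illustration). One may then form the real

    r = Σ_{n<ω} b_n · 2^{−(n+1)},   where b_n = 1 iff φ_n is valid,

and the Entscheidungsproblem is decidable just in case this r is a computable real.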
Q: Hmm, input strings of a Turing machine are finite, which does not suffice to represent real numbers.
A: Turing himself did not insist that input strings are finite.
Turing’s intention might have been to consider all algorithms. But algorithms of
the time were sequential, and the computers were humans.³ So Turing analyzed
sequential algorithms performed by idealized human computers. In the process
he explicated some inherent constraints of the species and imposed—explicitly or
implicitly—some additional constraints. Here are the more prominent constraints of
Turing’s analysis.
Digital Computation is digital (or symbolic, symbol-pushing).
“Computing is normally done by writing certain symbols on paper.”
Q: Is the digital constraint really a constraint?
A: These days we are so accustomed to digital computations that the digital
constraint may not look like a constraint. But it is. Non-digital computations
have been performed by humans from ancient times. Think of ruler-and-
compass computations or of Euclid’s algorithm for lengths [15, Sect. 3].
³ "Numerical calculation in 1936 was carried out by human beings; they used mechanical aids for performing standard arithmetical operations, but these aids were not programmable" (Gandy [10, p. 12]).
⁴ This was pointed out to us by the anonymous referee.
How can one analyze the great and diverse variety of computations performed by
human computers? Amazingly Turing found a way to do that. We believe that his
fulcrum was as follows. Ignore what a human computer has in mind and concentrate
on what the computer does and what the observable behavior of the computer is. In
other words, Turing treated the idealized human computer as an operating system
of sorts.
One may argue that Turing did not ignore the computer’s mind. He spoke about
the state of mind of the human computer explicitly and repeatedly. Here is an
example. “The behaviour of the computer at any moment is determined by the
symbols which he is observing, and his ‘state of mind’ at that moment.” But Turing
postulated that “the number of states of mind which need be taken into account is
finite.” The computer just remembers the current state of mind, and even that is not
necessary: “we avoid introducing the ‘state of mind’ by considering a more physical
and definite counterpart of it. It is always possible for the computer to break off from
his work, to go away and forget all about it, and later to come back and go on with it.
If he does this he must leave a note of instructions (written in some standard form)
explaining how the work is to be continued. This note is the counterpart of the ‘state
of mind’.”
Q: I came across a surprising remark of Gödel that Turing’s argument “is supposed
to show that mental procedures cannot go beyond mechanical procedures”
[11]. It is hard for me to believe that this really was Turing’s goal. Anyway,
Gödel continues thus. “What Turing disregards completely is the fact that
mind, in its use, is not static, but constantly developing, i.e., that we understand
abstract terms more and more precisely as we go on using them, and that more
and more abstract terms enter the sphere of our understanding. There may exist
systematic methods of actualizing this development, which could form part of
the procedure” [11].
Do you understand that? Apparently Gödel thought that gifted mathematicians
may eventually find a sophisticated decision procedure for the Entschei-
dungsproblem that is not mechanical. But if gifted mathematicians are able
to reliably execute the procedure, they should be able to figure out how to
program it, and then the procedure is mechanical.
A: Maybe Gödel was just pointing out that, in solving instances of the Entscheidungsproblem, human creativity would outperform any mechanical procedure.
Turing would surely agree with that.
Q: Let me change the topic. Here is another interesting quote. “For the actual
development of the (abstract) theory of computation, where one must build up
a stock of particular functions and establish various closure conditions, both
Church’s and Turing’s definitions are equally awkward and unwieldy. In this
respect, general recursiveness is superior” (Sol Feferman, [10, p. 6]). Do you
buy that?
A: Indeed, the recursive approach has been dominant in mathematical logic. It
is different though in computer science where Turing’s approach dominates.
Turing’s machine model enabled computational complexity theory and even
influenced the early design of digital computers. Church's λ-calculus has been influential in programming language theory.
3 Kolmogorov
Like Turing, Kolmogorov might have intended to analyze all algorithms. The
algorithms of his time still were sequential. In the 1953 talk, Kolmogorov stipulated
that every algorithmic process satisfies the following constraints.
Sequentiality An algorithmic process splits into steps whose complexity is
bounded in advance.
Elementary Steps Each step consists of a direct and unmediated transformation of the current state S to the next state S∗.
Locality Each state S has an active part of bounded size. The bound does not depend on the state or the input size, only on the algorithm itself. The direct and unmediated transformation of S to S∗ is based only on the information about the active part of S and applies only to the active part.
Implicitly Kolmogorov presumes also that the algorithm does not interact with its environment, so that a computation is a sequence S₀, S₁, S₂, … of states, possibly infinite, where every S_{n+1} = Sₙ∗.
Q: The second stipulation does not seem convincing to me. For example, a
sequential algorithm may multiply and divide integers in one step. Such
transformations do not look direct and immediate in some absolute sense.
A: Kolmogorov restricts attention to sequential algorithms working on the lowest
level of abstraction, on the level of single bits.
⁵ Uspensky told us that the summary [18] of the 1953 talk was written by him after several unsuccessful attempts to make Kolmogorov write a summary.
Kolmogorov-Uspensky machines operate on finite graphs of bounded degree (the number of edges attached to any vertex), with a fixed number of types of vertices and a fixed number of types of edges. We speculated in [12] that "the thesis of Kolmogorov and Uspensky is that every computation, performing only one restricted local action at a time, can be viewed as (not only being simulated by, but actually being) the computation of an appropriate KU machine." Uspensky agreed [25, p. 396].
We do not know much about the analysis that led Kolmogorov and Uspensky
from the stipulations above to their machine model. “As Kolmogorov believed,”
wrote Uspensky [25, p. 395], “each state of every algorithmic process . . . is an
entity of the following structure. This entity consists of elements and connections;
the total number of them is finite. Each connection has a fixed number of elements
connected. Each element belongs to some type; each connection also belongs to
some type. For every given algorithm the total number of element types and the
total number of connection types are bounded.” In that approach, the number of
non-isomorphic active zones is finite (because of a bound on the size of the active
zones), so that the state transition can be described by a finite program.
Leonid Levin told us that Kolmogorov thought of computation as a physical
process developing in space and time (Levin, private communication, 2003). That
seems to be Kolmogorov’s fulcrum. In particular, the edges of the state graph
of a Kolmogorov machine reflect physical closeness of computation elements.
One difficulty with this approach is that there may be no finite bound on the
dimensionality of the computation space [14, footnote 1].
Q: I would think that Kolmogorov’s analysis lent support to the Church-Turing
thesis.
A: It did, to the extent that it was independent from Turing’s analysis. We discuss
the issue in greater detail in [8, Sect. 1.2].
Q: You mentioned that Turing’s machine model enabled computational com-
plexity theory. Was the Kolmogorov-Uspensky machine model useful beyond
confirming the Church-Turing thesis?
A: Very much so; please see [2, Sect. 3].
4 Gandy
Gandy analyzed computation in his 1980 paper “Church’s Thesis and Principles for
Mechanisms” [9]. In this section, by default, quotations are from that paper.
Turing’s analysis of computation by a human being does not apply directly to mechanical
devices . . . Our chief purpose is to analyze mechanical processes and so to provide
arguments for . . .
Thesis M. What can be calculated by a machine is computable.
Since mechanical devices can perform parallel actions, Thesis M “must take
parallel working into account.” But the species of all mechanical devices is too hard
to analyze, and Gandy proceeds to narrow it to a species of mechanical devices that
he is going to analyze.
(1) In the first place I exclude from consideration devices which are essentially analogue
machines. . . . I shall distinguish between “mechanical devices” and “physical devices” and
consider only the former. The only physical presuppositions made about mechanical devices
. . . are that there is a lower bound on the linear dimensions of every atomic part of the
device and that there is an upper bound (the velocity of light) on the speed of propagation
of changes.
(2) Secondly we suppose that the progress of calculation by a mechanical device may be
described in discrete terms, so that the devices considered are, in a loose sense, digital
computers.
(3) Lastly we suppose that the device is deterministic; that is, the subsequent behaviour of
the device is uniquely determined once a complete description of its initial state is given.
After these clarifications we can summarize our argument for a more definite version of
Thesis M in the following way.
Thesis P. A discrete deterministic mechanical device satisfies principles I–IV below.
Gandy’s Principle I asserts in particular that, for any mechanical device, the states
can be described by hereditarily finite sets6 and there is a transition function F such
that, if x describes an initial state, then Fx; F.Fx/; : : : describe the subsequent states.
Gandy wants “the form of description to be sufficiently abstract to apply uniformly
to mechanical, electrical or merely notional devices,” so the term mechanical device
is treated liberally.
Principles II and III are technical restrictions on the state descriptions and the transition function respectively. Principle IV generalizes Kolmogorov's locality constraint to parallel computations.
We now come to the most important of our principles. In Turing’s analysis the requirement
that the action depend only on a bounded portion of the record was based on a human
limitation. We replace this by a physical limitation [Principle IV] which we call the principle
of local causation. Its justification lies in the finite velocity of propagation of effects and
signals: contemporary physics rejects the possibility of instantaneous action at a distance.
⁶ A set x is hereditarily finite if its transitive closure TC(x) is finite. Here TC(x) is the least set t such that x ∈ t and such that z ∈ y ∈ t implies z ∈ t.
4.3 Comments
5 Sequential Algorithms
5.1 Motivation
By the 1980s, there were plenty of computers and software. A problem arose of how to specify software. The most popular theoretical approaches to this problem were declarative. And indeed, declarative specifications (or specs) tend to be of a higher abstraction level and easier to understand than executable specs. But executable specs have their own advantages. You can "play" with them: run them, test them, debug them.
Q: If your spec is declarative then, in principle, you can verify it mathematically.
A: That is true, and sometimes you have to verify your spec mathematically; there are better and better tools to do that. In practice, though, mathematical verification is out of the question in the overwhelming majority of cases, and the possibility of testing specs is indispensable, especially because software evolves. In most cases, it is virtually impossible to keep a declarative spec in sync with the implementation. In the case of an executable spec, you can test whether the implementation conforms to the spec (or, if the spec was reverse-engineered from an implementation, whether the spec is consistent with the implementation).
A question arises: does an executable spec have to be low-level and detailed? This leads to a foundational problem: can any algorithm be specified, in an executable way, on its intrinsic level of abstraction?
Q: A natural-language spec would not do as it is not executable.
A: Besides, such a spec may (and almost invariably does) introduce ambiguities
and misunderstanding.
Q: You can program the algorithm in a conventional programming language but
this will surely introduce lower-level details.
A: Indeed, even higher-level programming languages tend to introduce details that
shouldn’t be in the spec.
Turing and Kolmogorov machines are executable but low-level. Consider for
example two distinct versions of Euclid’s algorithm for the greatest common divisor
of two natural numbers: the ancient version where you advance by means of
differences, and a modern (and higher-level) version where you advance by means
of divisions. The chances are that, in the Turing machine implementation, the
distinction disappears.
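For concreteness, here is a minimal sketch of the two versions (our own illustration, in Python); a digit-by-digit Turing machine implementation of either would be unlikely to preserve the distinction between them.

    def gcd_by_differences(a, b):
        # Ancient version: repeatedly replace the larger of two
        # positive integers by the difference of the two.
        while a != b:
            if a > b:
                a -= b
            else:
                b -= a
        return a

    def gcd_by_divisions(a, b):
        # Modern, higher-level version: advance by remainders.
        while b != 0:
            a, b = b, a % b
        return a

Both return 6 on the input (54, 24), but the steps taken, and hence the natural level of abstraction, differ.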
Can one generalize Turing and Kolmogorov machines in order to solve the
foundational problem in question? The answer turns out to be positive, at least
for sequential algorithms [14], synchronous parallel algorithms [1], and interactive
algorithms [3, 4]. We discuss here only the first of these.
Let’s restrict attention to the species of sequential algorithms but without any
restriction on the abstraction level. It could be the Gauss Elimination Procedure
for example. Informally, paraphrasing the first stipulation in Sect. 3, an algorithm
is sequential if it computes in steps whose complexity is bounded across all
computations of the algorithm. In the rest of this section, algorithms are by default
sequential.
We use the axiomatic method to explicate the species. The first axiom is rather
obvious.
Axiom 1 (Sequential Time) Any algorithm A is associated with a nonempty collection S(A) of states, a sub-collection I(A) ⊆ S(A) of initial states and a (possibly partial) state transition map τ_A : S(A) → S(A).
Definition 1 Two algorithms are behaviorally equivalent if they have the same
states, the same initial states and the same transition function.
results of the exploration of the active zone. Formally, Δ_A(X) can be defined as a collection of assignments F(ā) := b where F is a vocabulary function.
Q: What about vocabulary relations? Are they necessarily static?
A: We view relations as Boolean-valued functions, so the vocabulary relations
may be updatable as well.
Q: How does the algorithm know what to explore and what to change?
A: That information is supplied by the program, and it is applicable to all the
states. In the light of the abstract-state axiom, it should be given symbolically,
in terms of the vocabulary of A.
Axiom 3 (Bounded Exploration) There exists a finite set T of terms (or expressions) in the vocabulary of algorithm A such that Δ_A(X) = Δ_A(Y) whenever states X, Y of A coincide over T.
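As a toy illustration (ours, not from [14]): for the division version of Euclid's algorithm, with states carrying two dynamic nullary functions x and y, the update set Δ_A(X) and a bounded-exploration witness T might be rendered as follows in Python.

    def delta(state):
        # Update set Delta_A(X) for the division version of Euclid's
        # algorithm: a set of assignments F(a) := b, here with the
        # dynamic nullary functions 'x' and 'y'.
        x, y = state['x'], state['y']
        if y != 0:
            # Both updates are computed from the old state and fire
            # simultaneously, as in one step of an abstract state machine.
            return {('x', y), ('y', x % y)}
        return set()  # no updates: the algorithm has halted

    # A bounded-exploration witness: the update set in any state
    # depends only on the values of these finitely many terms.
    T = ['x', 'y', 'y != 0', 'x mod y']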
Now we are ready to define (sequential) algorithms.
Definition 2 A (sequential) algorithm is any entity that satisfies the sequential-
time, abstract-state and bounded-exploration axioms.
Abstract state machines (ASMs) were defined in [13]. Here we restrict attention
to sequential ASMs which are undeniably algorithms.
Theorem 1 ([14]) For every algorithm A, there exists a sequential ASM that is
behaviorally equivalent to A.
Every sequential algorithm has its native level of abstraction. On that level, the states
can be faithfully represented by first-order structures of a fixed vocabulary in such
a way that state transitions can be expressed naturally in the language of the fixed
vocabulary.
Q7 : I can’t ask Turing, Kolmogorov or Gandy how they arrived to their fulcrums,
but I can ask you that question.
A: In 1982 we moved abruptly from logic to computer science. We tried to understand what that emerging science was about. The notion of algorithm seemed central. Compilers, programming languages, operating systems are all algorithms. As far as sequential algorithms are concerned, the sequential-time axiom was obvious.
Q: How do you view a programming language as an algorithm?
A: It runs a given program on given data.
⁷ This discussion is provoked by the anonymous referee, who thought that the two sentences above are insufficient for this subsection.
6 Final Remarks
New kinds of algorithms may be introduced and most probably will be. Will the notion of algorithm ever crystallize to support rigorous definitions? We doubt that.
“However the problem of rigorous definition of algorithms is not hopeless. Not at
all. Large and important strata of algorithms have crystallized and became amenable
to rigorous definitions” [15]. In Sect. 5, we explained the axiomatic definition
of sequential algorithms. That axiomatic definition was extended to synchronous
parallel algorithms in [1] and to interactive sequential algorithms in [3, 4].
The axiomatic definition of sequential algorithms was also used to derive Church's thesis from the three axioms plus an additional Arithmetical State axiom, which asserts that only basic arithmetical operations are available initially [8].
Q: I wonder whether there is any difference between the species of all algorithms
and that of machine algorithms.
A: This is a good point, though there may be algorithms executed by nature that
machines can’t do. In any case, our argument that the species of all algorithms
can’t be formalized applies to the species of machine algorithms. The latter
species also evolves and may never crystallize.
Acknowledgements Many thanks to Andreas Blass, Bob Soare, Oron Shagrir and the anonymous
referee for useful comments.
References
1. A. Blass, Y. Gurevich, Abstract state machines capture parallel algorithms. ACM Trans.
Comput. Log. 4(4), 578–651 (2003). Correction and extension, same journal 9, 3 (2008),
article 19
2. A. Blass, Y. Gurevich, Algorithms: a quest for absolute definitions, in Current Trends in
Theoretical Computer Science, ed. by G. Paun et al. (World Scientific, 2004), pp. 283–311; in
Church’s Thesis After 70 Years, ed. by A. Olszewski (Ontos Verlag, Frankfurt, 2006), pp. 24–57
3. A. Blass, Y. Gurevich, Ordinary interactive small-step algorithms. ACM Trans. Comput. Log.
7(2), 363–419 (2006) (Part I), plus 8:3 (2007), articles 15 and 16 (Parts II and III)
4. A. Blass, Y. Gurevich, D. Rosenzweig, B. Rossman, Interactive small-step algorithms. Log.
Methods Comput. Sci. 3(4), 1–29, 1–35 (2007), papers 3 and 4 (Part I and Part II)
5. A. Blass, N. Dershowitz, Y. Gurevich, When are two algorithms the same? Bull. Symb. Log.
15(2), 145–168 (2009)
6. A. Church, An unsolvable problem of elementary number theory. Am. J. Math. 58, 345–363
(1936)
7. E.F. Codd, Relational model of data for large shared data banks. Commun. ACM 13(6), 377–
387 (1970)
8. N. Dershowitz, Y. Gurevich, A natural axiomatization of computability and proof of Church’s
thesis. Bull. Symb. Log. 14(3), 299–350 (2008)
9. R.O. Gandy, Church’s thesis and principles for mechanisms, in The Kleene Symposium, ed. by
J. Barwise et al. (North-Holland, Amsterdam, 1980), pp. 123–148
10. R.O. Gandy, C.E.M. (Mike) Yates (eds.), Collected Works of A.M. Turing: Mathematical Logic
(Elsevier, Amsterdam, 2001)
11. K. Gödel, A philosophical error in Turing’s work, in Kurt Gödel: Collected Works, vol. II, ed.
by S. Feferman et al. (Oxford University Press, Oxford, 1990), p. 306
206 Y. Gurevich
12. Y. Gurevich, On Kolmogorov machines and related issues. Bull. Eur. Assoc. Theor. Comput.
Sci. 35, 71–82 (1988)
13. Y. Gurevich, Evolving algebra 1993: Lipari guide, in Specification and Validation Methods, ed.
by E. Börger, (Oxford University Press, Oxford, 1995), pp. 9–36
14. Y. Gurevich, Sequential abstract state machines capture sequential algorithms. ACM Trans.
Comput. Log. 1(2), 77–111 (2000)
15. Y. Gurevich, What is an algorithm? in SOFSEM 2012: Theory and Practice of Computer
Science. Springer LNCS, vol. 7147, ed. by M. Bielikova et al. (Springer, Berlin, 2012). A
slight revision will appear in Proc. of the 2011 Studia Logica conference on Church’s Thesis:
Logic, Mind and Nature
16. Y. Gurevich, Foundational analyses of computation, in How the World Computes. Turing
Centennial Conference. Springer LNCS, vol. 7318, ed. by S.B. Cooper et al. (Springer, Berlin,
2012), pp. 264–275
17. S.C. Kleene, Introduction to Metamathematics (D. Van Nostrand, Princeton, 1952)
18. A.N. Kolmogorov, On the concept of algorithm. Usp. Mat. Nauk 8(4), 175–176 (1953). Russian
19. A.N. Kolmogorov, V.A. Uspensky, On the definition of algorithm. Usp. Mat. Nauk 13(4), 3–28
(1958). Russian. English translation in AMS Translations 29, 217–245 (1963)
20. A.A. Markov, Theory of Algorithms. Transactions of the Steklov Institute of Mathematics, vol.
42 (1954). Russian. English translation by the Israel Program for Scientific Translations, 1962;
also by Kluwer, 2010
21. E.L. Post, Finite combinatorial processes—formulation I. J. Symb. Log. 1, 103–105 (1936)
22. W. Sieg, On computability, in Handbook of the Philosophy of Mathematics, ed. by A. Irvine
(Elsevier, Amsterdam, 2009), pp. 535–630
23. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. Ser. 2 42, 230–265 (1936/1937)
24. A.M. Turing, Systems of logic based on ordinals. Proc. Lond. Math. Soc. Ser. 2 45, 161–228
(1939)
25. V.A. Uspensky, Kolmogorov and mathematical logic. J. Symb. Log. 57(2), 385–412 (1992)
26. V.A. Uspensky, A.L. Semenov, Theory of Algorithms: Main Discoveries and Applications
(Nauka, Moscow, 1987) in Russian; (Kluwer, Dordrecht, 2010) in English
27. J. Wiedermann, J. van Leeuwen, Rethinking computation, in Proceedings of 6th AISB
Symposium on Computing and Philosophy. Society for the Study of Artificial Intelligence and
the Simulation of Behaviour, ed. by M. Bishop, Y.J. Erden (Exeter, London, 2013), pp. 6–10
The Information Content of Typical Reals
George Barmpalias and Andy Lewis-Pye
1 Introduction
The study of the structure of the degrees of unsolvability dates back to Kleene
and Post [10]. The same is true of the application of Baire category methods in
this study, while the application of probabilistic techniques in relative computation
dates back to de Leeuw et al. [6]. In this article we give an overview of the state
of the art in degree theory in terms of category and measure. Recent interest in this
topic has been motivated by questions in algorithmic randomness, but there is an
G. Barmpalias
State Key Lab of Computer Science, Institute of Software, Chinese Academy of Sciences, 100190
Beijing, China
School of Mathematics, Statistics and Operations Research, Victoria University, Wellington,
New Zealand
e-mail: barmpalias@gmail.com
http://www.barmpalias.net
A. Lewis-Pye
Department of Mathematics, Columbia House, London School of Economics, Houghton Street,
London WC2A 2AE, UK
e-mail: andy@aemlewis.co.uk
http://aemlewis.co.uk
essential distinction to be drawn between much of the research that takes place in
algorithmic randomness and our interests here: while in algorithmic randomness
one is concerned with understanding the properties of random reals quite generally,
here we are interested specifically in the properties of the degrees of random reals.
Formalising a notion of typicality essentially amounts to defining a notion of
size, and then defining the typical objects to be those which belong to all large sets
from some restricted (normally countable) class. It is then interesting to note that,
beyond cardinality, there are two basic notions of size for sets of reals. One can
think in terms of measure, the large sets being those of measure 1, or in terms of
category, the large sets being those which are comeager. In [17] Kunen provided, in
fact, a way of formalising the question as to whether these are the only “reasonable”
notions of size for sets of reals.
Of course one cannot expect a real not to belong to any set of measure 0, or
not to belong to any meager set, and so to formalise a level of typicality one
restricts attention to sets of reals which are definable in some specific sense—one
might consider all arithmetically definable sets of reals which are of measure 0,
for example, giving a corresponding class of “typical reals” which do not belong
to any such set. Working both in terms of category and measure, we are then
interested in establishing the order-theoretic properties of the “typical” degrees,
i.e. those containing typical reals. Despite the large advances in degree theory over
the last 60 years, our knowledge on this issue is rather limited. On the other hand,
some progress has been made recently, which also points to concrete directions and
methodologies for future research. We present old and new results on this topic, and
ask a number of questions whose solutions will help obtain a better understanding
of the properties of typical degrees.
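For example, taking the restricted class to consist of the arithmetically definable sets of reals, one arrives at notions of the following kind (standard in this literature):

    x is arithmetically random  ⟺  x avoids every arithmetically definable set of reals of measure 0;
    x is arithmetically generic ⟺  x avoids every arithmetically definable meager set of reals.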
In order to increase readability we have presented a number of results in tables,
rather than a list of theorem displays. For the same reason, a number of citations
of results are suppressed. For the remainder of this introductory section we discuss
some of the background concepts which we shall require from computability theory.
For a thorough background in classical computability theory we refer to Odifreddi [25], Downey and Hirschfeldt [5], and Nies [23], while [1] is a thorough historical survey of the topic.
One may think of a real as coding a sequence of problems, each of which either does or does not have a solution. Given access to the bits of the real, we may be able to derive the correct answer just by checking if a particular bit of the real is a 0 or a 1. So reals are identified with subsets of the natural numbers, which in turn can be viewed as solutions to problems, and in this context we are often not overly concerned with the properties of the real as an element of the standard algebraic structure, but rather with the nature of the information that it encodes, and perhaps the features of the encoding itself.
The Turing reducibility is a preorder that compares reals according to their information content; it was introduced (implicitly) by Turing in [32] and was explicitly studied by Kleene and Post in [10]. The formal definition¹ makes use of oracle Turing machines, which are Turing machines with an extra oracle tape upon which the (potentially non-computable) binary expansion of a real may be written. We say that A is Turing reducible to B (in symbols, A ≤_T B) if there is an oracle Turing machine which calculates the bits of A when the binary expansion of B is written on the oracle tape. One can think of this as meaning quite simply that one can move in an algorithmic fashion from the binary expansion of B to the binary expansion of A. So the reducibility formalises the notion that the information in A is encoded in (and hence can be algorithmically retrieved from) B.
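As a toy example of the definition (ours, not from the text): if the n-th bit of A is the parity of the n-th pair of bits of B, then a machine with oracle B computes each bit of A with two oracle queries, so A ≤_T B. In Python, with the oracle passed as a function:

    def bit_of_A(n, B):
        # Two oracle queries to B suffice for the n-th bit of A,
        # where A(n) = B(2n) XOR B(2n+1).
        return B(2 * n) ^ B(2 * n + 1)

    # Example oracle: B is the set of even numbers.
    B = lambda k: 1 if k % 2 == 0 else 0
    print([bit_of_A(n, B) for n in range(5)])  # [1, 1, 1, 1, 1]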
When A ≤_T B and B ≤_T A, we regard A and B as having the same information content. Hence the preorder ≤_T induces an equivalence relation on the continuum which identifies reals with the same information content. These equivalence classes are called Turing degrees, and we consider the natural ordering on these degrees inherited from the Turing reducibility. Considering reals as solutions to problems (or rather problem sets indexed by natural numbers), the degrees of unsolvability can then be viewed as ordering problems or computational tasks according to their level of difficulty. We often denote by a, b, c the degrees of the problems A, B, C respectively. Hence if a ≤ b then a solution to (any problem in) b is enough to provide a solution to (any problem in) a.
The Turing degree structure provides a way to formally address and study the
classification of problems according to their difficulty. A large part of this study
has focused on establishing basic properties of the partial order, beginning already
in [10]. It was observed there, for example, that the degree structure forms an
upper semi-lattice with a least element and the countable predecessor property.
Early research then focused on exhibiting degrees with certain structural properties,
for instance minimal pairs and minimal degrees. We shall focus here on natural
properties which are first order definable in the language for the structure, i.e. the
language of partial orders.
¹ Turing's original definition did not involve an extra tape. However it is equivalent to the standard formulation that we give here.
formulation that we give here.
Measure and Baire category arguments in degree theory are as old as the subject
itself. For example, Kleene and Post [10] used arguments that resemble the Baire
category theorem construction in order to build Turing degrees with certain basic
properties. Moreover de Leeuw et al. [6] used a so-called “majority vote argument”
in order to show that if a subset of ! can be enumerated relative to every set in
a class of positive measure then it has an unrelativised computable enumeration.
Spector [30] used a measure theoretic argument in order to produce incomparable
hyperdegrees. Myhill [22], on the other hand, advocated for and demonstrated the
use of Baire category methods in degree theory. Sacks’ monograph [27] included
a chapter on “Measure-theoretic, category and descriptive set-theoretic arguments”,
where he shows that the minimal degrees have measure 0.
A highly influential yet unpublished manuscript by Martin (1967, Measure,
category, and degrees of unsolvability, Unpublished manuscript) showed that more
advanced degree-theoretic results are possible using these classical methods. By that
time degree theory was evolving into a highly sophisticated subject and the point of
this paper was largely that category and measure can be used in order to obtain
advanced results, which go well beyond the basic methods of Kleene and Post [10].
Of the two results in Martin (1967, Measure, category, and degrees of unsolvability,
Unpublished manuscript) the first was that the Turing upward closure of a meager
set of degrees that is downward closed amongst the non-zero degrees, but which
does not contain 0, is meager (see [25, Sect. V.3] for a concise proof of this). Given
that the minimal degrees form a meager class, an immediate corollary of this was
the fact that there are non-zero degrees that do not bound minimal degrees. The
second result was that the measure of the hyperimmune degrees is 1. Martin’s paper
was the main inspiration for much of the work that followed in this topic, including
[8, 26, 33].
Martin’s early work seemed to provide some hope that measure and category
arguments could provide a simple alternative to conventional degree-theoretic
constructions which are often very complex. This school of thought received a
serious blow, however, with [26]. Paris answered positively a question of Martin
which asked if the analogue of his category result in Martin (1967, Measure,
category, and degrees of unsolvability, Unpublished manuscript) holds for measure:
are the degrees that do not bound minimal degrees of measure 1? Paris’ proof
was considerably more involved than the measure construction in Martin (1967,
Measure, category, and degrees of unsolvability, Unpublished manuscript) and
seemed to require sophisticated new ideas. The proposal of category methods as a
simple alternative to “traditional” degree theory had a similar fate. Yates [33] started
working on a new approach to degree theory that was based on category arguments
and was even writing a book on this topic. Unfortunately the merits of his approach
were not appreciated at the time (largely due to the heavy notation that he used) and
he gave up research on the subject altogether.
Yates’ work in [33] deserves a few more words, however, especially since it
anticipated much of the work in [8]. Inspired by Martin (1967, Measure, category,
and degrees of unsolvability, Unpublished manuscript), Yates started a systematic
study of degrees in the light of category methods. A key feature in this work
was an explicit interest in the level of effectivity possible in the various category
constructions and the translation of this level of effectivity into category concepts
(like “0′-comeager” etc.). Using his own notation and terminology, he studied the
level of genericity that is sufficient in order to guarantee that a set belongs to certain
degree-theoretic comeager classes, thus essentially defining various classes of
genericity already in 1974. He analysed Martin’s proof that the Turing upper closure
of a meager class which is downward closed amongst the non-zero degrees but
which does not contain 0 is meager, for example (see [33, Sect. 5]), and concluded
that no 2-generic degree bounds a minimal degree. Moreover, he conjectured (see
[33, Sect. 6]) that there is a 1-generic that bounds a minimal degree. These concerns
occurred later in a more appealing form in Jockusch [8], where simpler terminology
was used and the hierarchy of n-genericity was explicitly defined and studied.
With Jockusch [8], the heavy notation of Yates was dropped and a clear and
systematic calibration of effective comeager classes (mainly the hierarchy of n-
generic sets) and their Turing degrees was carried out. A number of interesting
results were presented along with a long list of questions that set a new direction
for future research. The latter was followed up by Kumabe [12–16] (as well as other
authors, e.g. [3]) who answered a considerable number of these questions.
The developments in the measure approach to degree theory were similar but
considerably slower, at least in the beginning. Kurtz’s thesis [18] is probably the first
systematic study of the Turing degrees of the members of effectively large classes of
reals, in the sense of measure. Moreover the general methodology and the types of
questions that Kurtz considers are entirely analogous to the ones proposed in [8] for
the category approach (e.g. studying the degrees of the n-random reals as opposed to
the n-generic reals, minimality, computable enumerability and so on). Kučera [11]
focused on the degrees of 1-random reals. Kautz [9] continued in the direction of
Kurtz [18] but it was not until the last 10 years (and in particular with the writing of
Downey and Hirschfeldt [5, Chap. 8]) that the study of the degrees of n-random reals
became well known and this topic became a focused research area.
1.4 Overview
Building on the long history of category and measure arguments for the study of
the degrees of unsolvability, this paper can be seen as an explicit proposal for a
systematic analysis of the order theoretically definable properties satisfied by the
typical Turing degree, where typicality is gauged in terms of category or measure.
The main part of this article is organised in three sections. In Sect. 2 we give
precise definitions of typicality in terms of measure and category, and how one can
calibrate typicality by refining these definitions in terms of the classical definability
hierarchies. We also discuss for which properties of degrees it makes sense to ask
our main question (i.e. whether they are typical) and what kind of answers can be
expected. We focus on a list of rather basic properties, introduced in Table 1. These
properties have received special attention in the study of degrees over the years.
In Sect. 3 we describe all known results concerning which of these properties are
typical. We also consider the question as to which properties are inherited by all
(non-zero) predecessors of sufficiently typical degrees.
In Sect. 4 we discuss the similarities and differences between the two faces of
typicality that we have considered. After some superficial remarks, we present the
recent results of Shore on the theories of the lower cones of typical degrees. Both
of these results are motivated by the fact that, fixing a notion of typicality and
given two typical degrees, the first order theories of the corresponding lower cones
(in the language of partial orders and with the inherited ordering) are equal. The
first result determines the level of typicality that is required for this fact to hold
(namely, arithmetical genericity or randomness). The second result shows that the
two different kinds of typicality give rise to different corresponding theories.
Given a property which refers to a real (or a degree), measure and category
arguments may aim to show that there exist reals that satisfy this property by
demonstrating that the property is typical. In other words “most” reals satisfy this
property. This approach is based on:
(a) a formalisation of the notion of “large”;
(b) a restriction of the sets of reals that we consider to a countable class;
(c) the definition of a “typical real” as a real which occurs in every large set in this
restricted class.
Of course, in terms of category “large” means comeager, while in terms of measure,
“large” means “of measure 1”. Restricting attention to those reals which belong to
every member of a countable class of large sets, still leaves us with a large class:
(1) the intersection of countably many comeager sets is comeager;
(2) the intersection of countably many measure 1 sets has measure 1.
Those reals which are typical, we call generic for the category case, and random if
working with measure—both of these notions clearly depending on the countable
class specified in (b). The “default” for the generic case is to consider the countable
collection of sets which are definable in first order arithmetic, giving “arithmetical
genericity” (or “arithmetical randomness” accordingly). It is quite common when
discussing genericity to suppress the prefix “arithmetical”, so that by “generic”
is often meant arithmetical genericity. Alternative choices are possible, resulting
in stronger or weaker genericity and randomness notions. For example, we may
consider a randomness notion that is defined with respect to the hyperarithmetical
sets—for more information on such strong notions of randomness we refer to Nies
[23, Chap. 9].
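Schematically, writing 𝒞 for the countable class fixed in (b) (our notation), clauses (a)–(c) amount to:

    A is typical for 𝒞  ⟺  A ∈ ⋂ { S ∈ 𝒞 : S is large },

where “large” means comeager in the category case and of measure 1 in the measure case.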
In the computability context though, it is normally appropriate to consider a finer
hierarchy and much weaker randomness and genericity notions. To this end, we
Before we embark on this quest, let us examine the kinds of answers that are
possible. Is it reasonable to expect that some structural properties hold for all generic
reals while failing to hold for all random reals? Indeed, as we noted above, while
genericity and randomness might be said to formalise the same intuitive notion of
typicality, on a technical level they are very different notions. In fact, as we remark
in Sect. 4, the two classes and their Turing degrees form disjoint classes.
Is it reasonable to expect that for every property P which is definable in some sense,
either P is met by all typical reals or else ¬P is met by all typical reals? We
have already discussed the fact that there are many levels of typicality that one may
consider, for both randomness and genericity. Let T be a certain level of genericity
or randomness. In the case that P is met by all reals in T or ¬P is met by all
reals in T, we say that P is decided by T. Hence we are essentially asking whether
every definable property P is decided by some typicality level T. At this point
let us consider the cases of randomness and genericity separately. For the case of
randomness, Kolmogorov’s 0-1 law states that any (Lebesgue) measurable tailset²
is either of measure 0 or 1. We may identify a set of Turing degrees with the set
of reals contained in those degrees (i.e. with the union). In this sense, every set
of Turing degrees is a tailset. Hence any measurable set of Turing degrees must
either be of measure 0 or 1. So if we restrict question (1) to arithmetically definable
properties of degrees P (i.e. sets of degrees definable in the structure, and for which
the union is arithmetically definable as a set of reals), then the satisfying class is a
set of measure 0 or 1, so either all arithmetically random degrees x satisfy ¬P or all
arithmetically random degrees satisfy P (respectively).
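The 0-1 law in the form used here reads (standard statement):

    𝒜 ⊆ 2^ω a measurable tailset  ⟹  μ(𝒜) ∈ {0, 1}.

For a degree-invariant, arithmetically definable 𝒜 this dichotomy is exactly what allows arithmetical randomness to decide the corresponding property.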
More generally we could restrict question (1) to Borel properties P, and the same
considerations would hold. Ultimately, we would like to consider all properties P
which are definable in the structure of the Turing degrees, in the first order language
of partial orders. As was demonstrated in [2, Sect. 3], however, whether all definable
sets of degrees are measurable is independent of ZFC. Hence in this more general
form, question (1) may not always admit a clear answer. In our discussions on
randomness, most of the properties P that we consider are arithmetically definable.
In the cases where P is not evidently arithmetically definable (see the cupping
property in Table 1) we show that (arithmetical) randomness suffices in order to
decide this property.
In the case of genericity we can consider the topological 0-1 law which says
that tailsets satisfying the property of Baire³ are either meager or comeager. Since
all Borel sets of reals have the property of Baire, if we restrict question (1) to
Borel properties P we can expect a definite answer. In other words, in this case
there is a level of genericity that decides P. In the case where P is restricted to
the arithmetically definable properties, we can expect that n-genericity for some
n ∈ ω decides P. We may also want to consider the more general case where P is
definable in the structure of the degrees (again, in the first-order language of partial
orders). Unfortunately, it was observed in [2, Sect. 3] that in this case we cannot
expect a definite answer to question (1). Indeed, it is independent of ZFC whether
² A set of reals C is a tailset if for every real A and every σ ∈ 2^{<ω}, A ∈ C iff σ⌢A ∈ C (where ⌢ denotes concatenation).
³ A set of reals has the property of Baire if its symmetric difference from some open set is meager.
all definable sets of degrees are either meager or comeager. On the other hand, it
is well known that under the axiom of determinacy AD every set of reals has the
property of Baire. Hence in ZF + AD every property P is decided by some level of
genericity.
In conclusion, our project consists of considering various structural properties
P of the Turing degrees and establishing a level of randomness or genericity that
decides P. If P is Borel then we can expect this task to have a clear solution.
Otherwise it is possible that none of the standard levels of randomness or genericity
decides P. The properties that we consider are in a certain sense simple, although
not always (obviously) Borel. Moreover, according to the known results, they all
have a typicality level that decides them, and this is often much lower than the level
that is guaranteed by their complexity. This latter observation is evident from an
inspection of Tables 1 and 2.
Our search for the properties of the typical degree starts by considering a small
collection of properties which can be considered as “natural” in the sense that they
are encountered in most considerations in classical degree theory. Let us start by
noting that there are very simple properties, like density, for which it is unknown
whether they are typical. This may come as a surprise to the reader, given that the
study of the degrees of unsolvability dates back to Kleene and Post [10].
Question 1 Are the random degrees dense?
Formally, is it true that given any two sufficiently random degrees a < b, the interval
(a, b) is nonempty? A variation of this question can be stated in terms of a property
of a single degree: is it true that for any sufficiently random degree b and any a < b
the interval (a, b) is nonempty? In other words (see the relevant entry in Table 1)
we ask whether every random degree fails to be a minimal cover. This property
can be expressed with 5 alternating quantifiers in arithmetic, so we can expect that
either every 5-random real satisfies it or every 5-random real fails to satisfy it. In
particular, arithmetical randomness decides this property. Ultimately, this question
can be expressed without reference to random degrees:
Question 2 What is the measure of minimal covers?
In Sect. 3 we are going to see that in terms of genericity, typical degrees are minimal
covers. On the other hand it is a well known fact that no typical degree is minimal,
both in terms of measure and category. Other basic properties that we consider are
displayed in Table 1, along with their formal definitions. All of these properties have
straightforward first order definitions in arithmetic, except for the cupping property
and the property of having a (strong) minimal cover. However, as we discuss in the
following, these latter properties are decided by arithmetical randomness (indeed,
2-randomness). In Sect. 3 we give a full account of the known status of the properties
in Table 1 (i.e. whether they are typical). In general, more is known for genericity
than randomness.
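Written out in the language of partial orders (our rendering of the standard definitions used above):

    b is a minimal cover of a  ⟺  a < b ∧ ¬∃c (a < c < b);
    density:  ∀a ∀b (a < b → ∃c (a < c < b)).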
The properties of degrees that we are going to be examining are displayed in Table 1.
From an algebraic point of view, they are quite basic properties that give essential
information about a given structure that one might be interested in (in our case, the
degrees of unsolvability). Moreover, such algebraic statements can serve as building
blocks in definability investigations in degree structures (e.g. see [28]), a theme that
has been especially popular in the last 50 years of investigations into the Turing
degrees. For further motivation with regard to the role of the properties of Table 1
in the interplay between definability and algebraic structure in the Turing degrees,
we refer to Lewis [21].
We summarise the status of the properties of Table 1 with respect to the typical
degrees in the validity columns of Table 2. Especially in the case of randomness,
there are some notable gaps in our knowledge, including the property of being
a minimal cover that was discussed in Sect. 2.3. These gaps are indicated with
a question mark in the corresponding entries of the validity columns of Table 2.
There are many more open problems here, however, than just those indicated by
question marks. In fact, there are open questions associated with almost every row
of the table. This is because, as discussed previously, our project does not end
upon deciding whether a property is typical. We are also interested in determining
the exact levels of the typicality hierarchies which suffice to decide each property.
This information is displayed in the columns “Level” and “Fails” of Table 2. Here
“2” means 2-genericity or 2-randomness, and similarly for “1”. Also “w2” denotes
weak 2-randomness or weak 2-genericity, and similarly for “w1”. The level of the
typicality hierarchy that is indicated under column “Level” is the lowest level of
the hierarchy which is known to decide the corresponding property. Similarly, the
level of the typicality hierarchy that is indicated under column “Fails” is the highest
level where reals have been found that give opposite answers to the validity of the
corresponding property—or simply the highest level at which it is known that the
property can fail, in the case that we do not know whether typical degrees satisfy
the property. An optimal result has been achieved in the cases where the two levels
are consecutive. All of the other cases can be seen as open problems.
As an example, let us examine the join property. By Barmpalias et al. [2] all
2-random degrees satisfy the join property. On the other hand, in [20] it was shown
that all low 1-random degrees fail to satisfy the join property. We do not know the
answer with respect to the weakly 2-random degrees. In terms of genericity, it was
shown in [8] that all 2-generic degrees satisfy the join property. This was extended
in [2] (via a different argument) to all 1-generic degrees. On the other hand, by
Kurtz [18, 19] every hyperimmune degree⁴ contains a weakly 1-generic set. Hence
the join property fails for some weakly 1-generic degrees.
Another example is the property of being the join of a minimal pair of degrees.
We express this fact by saying that the degree is the “top of a diamond”. It is a rather
straightforward observation that every 1-generic degree is the top of a diamond.
Such basic facts about the generic degrees were established in [8] (an analogous
observation, following from van Lambalgen’s Theorem, the fact that bases for
1-randomness are Δ⁰₂ and the fact that weakly 2-random degrees form a minimal
pair with 0′, is that every weakly 2-random degree is the top of a diamond). On the
other hand, using the fact from [18, 19] that every hyperimmune degree contains a
set which is weakly 1-generic (and a set which is weakly 1-random), it follows that
there are degrees of weakly 1-generic sets (and degrees of weakly 1-random sets)
which are not the top of a diamond.
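In symbols (our rendering): a degree a is the top of a diamond if it is the join of a minimal pair, i.e.

    ∃b ∃c ( 0 < b < a ∧ 0 < c < a ∧ b ∨ c = a ∧ ∀d (d ≤ b ∧ d ≤ c → d = 0) ).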
Many of the results in Table 2 regarding genericity were obtained in [12, 14–16],
solving a number of questions in [8]. For example, it was shown that every 2-generic
degree is a minimal cover. We do not know if there exists a 1-generic degree or a
weakly 2-generic degree which is not a minimal cover. Kumabe also showed that
every 2-generic degree satisfies the complementation property. It is not known if
this can be extended to 1-generic degrees, and the complementation property for
random reals is an open problem.
⁴ A degree is called hyperimmune if it computes a function which is not dominated by any computable function.
A general theme in the study of typical reals from an algorithmic point of view
is that information introduces order and hence makes reals special and less typical.
In other words, a degree that has high information content (e.g. it can compute the
halting problem) fails to be typical. More precisely, 1-generic reals are incomplete
(i.e. fail to compute the halting problem) and weakly 2-random reals are incomplete.
There are a number of results that support this intuition, both in terms of genericity
and in terms of randomness. Another more sophisticated example from [31] is that
weakly 2-random reals cannot compute a complete extension of Peano arithmetic.
Despite these facts, it turns out that there is no bound on the information that joins
of typical reals can have. Since we have not found this basic fact in the literature,
we present it here.
Theorem 3.1 Let V be a null or a meager set of degrees, and let d be a degree.
Then there exist degrees x, y which are not in V such that d < x ∨ y.
Proof Let D, X, Y denote representatives of the degrees d, x, y respectively. Since
there are no maximal degrees, it suffices to show that d ≤ x ∨ y. Let V be the union
of the degrees in V. Suppose that V is null, so that V is a null set of reals. Then
there exists a closed set P of positive measure that is disjoint from V. It is a basic
fact from measure theory that P contains a real Z which has an indifferent set of
digits (s_i) with respect to P, in the sense that any modification of the bits of Z on
the positions s_i results in a real which is in P. Algorithmic refinements of this fact were
studied in [7]. Without loss of generality let us assume that (s_i) is increasing. For
each t ∈ ω which is not a term of (s_i) define X(t) = Y(t) = Z(t). Moreover for each
i let X(s_i) = D(i) and Y(s_i) = 1 − D(i). Then clearly X ⊕ Y can compute D.
The case when V is meager is similar, based on the fact that every comeager
set of reals contains a real which is indifferent with respect to a sequence of
positions (relative to that comeager set). Algorithmic versions of this fact were studied in [4]. □
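To see the coding step in miniature, here is a toy Python sketch (our own illustration, with finite bit-lists standing in for reals and a hypothetical list `positions` standing in for the indifferent positions (s_i)). It shows how D is read off from the join: X and Y agree except at the positions s_i, where X carries D(i) and Y its complement:

    def split_join(Z, positions, D):
        # Write D(i) on X at position s_i, and 1 - D(i) on Y; copy Z elsewhere.
        X, Y = list(Z), list(Z)
        for i, s in enumerate(positions):
            X[s] = D[i]
            Y[s] = 1 - D[i]
        return X, Y

    def recover(X, Y):
        # X and Y differ exactly at the indifferent positions; X's bit there is D(i).
        return [X[t] for t in range(len(X)) if X[t] != Y[t]]

    Z = [0, 0, 1, 1, 0, 1, 0, 1]      # a "real", truncated to 8 bits (toy data)
    positions = [1, 4, 6]             # indifferent positions s_0 < s_1 < s_2
    D = [1, 0, 1]                     # the information to be coded
    X, Y = split_join(Z, positions, D)
    assert recover(X, Y) == D         # D is computable from X and Y jointly

Since each of X and Y differs from Z only on indifferent positions, both remain in P in the measure case, which is exactly how the proof keeps x and y outside V.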
Is it reasonable to expect that the properties of typical degrees also hold for the
non-zero degrees that they bound? This is equivalent to the expectation that the
upward closure of any “small” class of degrees that does not contain 0 is “small”.
This is not true in general, although the upper cone of degrees above any given non-
zero degree is both meager and null. Martin’s category theorem (from Martin, 1967,
Measure, category, and degrees of unsolvability, Unpublished manuscript; see [8])
says that if C is a meager downward closed set of degrees then the upward closure
of C − {0} is meager. On the other hand if a meager class C is not downward closed
then the upward closure of C − {0} can be comeager (see [8]). An application of
Martin’s category theorem was that the set of degrees that bound minimal degrees is
meager. An analogous result in terms of measure was shown in [26]. In particular,
it was shown that if C is the null class of minimal degrees, then the upward closure
of C is also null. However there exist downward closed null classes C of degrees
such that the upward closure of C − {0} has measure 1. For example, let C consist of the degrees bounded
by 1-generics. Then C is null and by Kurtz [18] its upward closure has measure
1. Hence the exact analogue of Martin’s theorem in terms of measure is not true.
We are yet to find a counterexample, however, which is “naturally” definable (as a
subset of the Turing degrees, rather than as a set of reals definable in arithmetic),
and we do not know if there is some analogue of Martin’s category theorem in terms
of measure. Let us express this problem in terms of the following, somewhat vague,
question:
Question 3 Which null classes of degrees have null upward closure?
Our discussion shows that whether the non-zero predecessors of a typical degree
a inherit a property of a depends on the type of the property in question. However
many of the basic degree theoretic properties that are known to hold for generic
degrees are also known to hold for the nonzero degrees that are bounded by generic
degrees. A number of results in [8] follow this heuristic principle. For example,
the cupping and join properties are satisfied by all non-zero degrees bounded by
any 2-generic degree. Curiously enough, the same phenomenon is common in the
random degrees. For example, the join property is shared by all non-zero degrees
that are bounded by a 2-random degree, and the cupping property fails for all such
degrees. This observation also relates to many of the arguments that are used to
obtain these results. In [2] we presented a methodology for examining whether
a property is typical of a random degree, which extends to a methodology for
examining whether the property is typical of a nonzero degree which is bounded
by a random degree. Usually, the latter argument tends to be more involved than
the former, but both rest on the same ideas. This methodology rests on the work of
other authors, for example [26] where it was shown that sufficiently random degrees
do not bound minimal degrees. However it is refined considerably, which allows one to
obtain more precise classifications such as the fact that 2-random degrees do not bound
minimal degrees. Such results are often shown to be tight by providing counter-
examples for lower levels of typicality. For example, it was shown that there are
weakly 2-random degrees that bound minimal degrees.
In Table 3 we display the properties that are known to hold for all non-zero
degrees that are bounded by typical degrees. The reader may observe that most
of the properties of Table 2 that hold for typical degrees also hold for the non-
zero degrees that are bounded by typical degrees (in fact we do not have natural
counter-examples). For example, 2-random degrees all have strong minimal covers,
as do all non-zero degrees that are bounded by 2-randoms. The fact that all non-zero
degrees bounded by 2-randoms satisfy join, however, implies that there are no strong
minimal covers below any 2-random degree. Weak 2-randoms do sometimes bound
strong minimal covers, since they sometimes bound minimal degrees. These results
are from [2]. In the same paper it was shown that every degree that is bounded by a
2-random degree is the top of a diamond. This result can be seen as a weak version
Table 3 Validity of the properties, for a non-zero degree which is bounded above by a typical
degree

  Properties                      Bounded by a generic      Bounded by a random
                                  Validity      Level       Validity      Level
  Join                            ✓             2           ✓             2
  Meet                            ?             ?           ?             ?
  Cupping                         ✓             2           ✗             2
  Complementation                 ?             ?           ?             ?
  Top of a diamond                ?             ?           ✓             2
  Being a minimal degree          ✗             2           ✗             2
  Bounding a minimal degree       ✗             2           ✗             2
  Being a minimal cover           ?             ?           ?             ?
  Being a strong minimal cover    ✗             2           ✗             2
  Having a strong minimal cover   ✗             2           ✓             2
of the complementation property. The latter as well as the meet property are open
problems, as we indicate in Tables 2 and 3.
In terms of genericity, we do not know whether complementation is satisfied
by all non-zero degrees bounded by 2-generics, or even whether such degrees will
always satisfy the meet property (although the latter may not be a difficult problem).
We do not know whether every non-zero degree that is bounded by a 2-generic
degree is a minimal cover.
A discussion comparing randomness and genericity can be found in [5, Sect. 8.20].
As reals, generics and randoms are certainly very different. For example, their
Turing degrees form disjoint classes. In fact, no 1-random real is computable in
a 1-generic. Furthermore, every generic degree forms a minimal pair with every
sufficiently random degree, and this already happens on the level of 2-generics and
2-randoms (see [24]). A result which reveals additional relations between the two
notions was proved in [2]. It was shown that every non-zero degree that is bounded
by a 2-random degree is the join of a minimal pair of 1-generic degrees.
An inspection of Table 2 shows that all of the properties that we have considered
are decided by level 2 of the genericity hierarchy. Moreover, some of them are even
decided earlier, for example the property of having a strong minimal cover which
does not hold for any weakly 2-generic degree (see [2, Sect. 8.1]) but there are
some 1-generic degrees which satisfy it (by Kumabe [16]). For the case of random
degrees, there are some unknown cases, but most of the properties that we consider
are also decided on the second level of the hierarchy of randomness.
The following question therefore comes into focus. Is it possible that there is a
finite level H_n of the hierarchy which is sufficient to decide all sentences φ for the
lower cone, in the sense that ∀x ∈ H_n, D(≤ x) ⊨ φ, or ∀x ∈ H_n, D(≤ x) ⊨ ¬φ?
This question was raised in an early draft of Barmpalias et al. [2, Sect. 12]
and for the case of generic degrees it was independently raised by Jockusch much
earlier (personal communication with Richard Shore). It was recently answered in
the negative.
Theorem 4.1 ([29]) There are sentences φ_n such that, for n > 2, D(≤ x) ⊨ φ_n for
every (n + 1)-generic or (n + 1)-random x, but such that D(≤ x) ⊭ φ_n for some
n-generics and n-randoms.
We note that the sentences φ_n are the same for the generic and the random
case. Moreover, they are obtained via a process of interpreting arithmetic inside
the corresponding degree structures. As a result of this, they are not considered
“natural” and one may consider the task of discovering familiar properties which
separate at least the first few levels of the randomness and genericity hierarchies.
Another issue that was raised in [2, Sect. 12] was whether the theory below an
arithmetically generic degree is the same as the theory below an arithmetically
random degree. This question makes sense, since given any two arithmetically
generic degrees x, y, the theories of the structures D(≤ x) and D(≤ y) are equal.
Similarly the theories of the lower cone for any two arithmetically random degrees
are equal. An inspection of Table 2 shows that the only properties there where the
generic and random degrees differ are the cupping property and the property of
having a strong minimal cover. These, however, are not properties pertaining to the
lower cone, so they do not provide an answer to our question. An answer was given
in [29], again via the methodology of coding:
Theorem 4.2 ([29]) There is a sentence φ such that D(≤ x) ⊨ φ for every
3-random degree x and D(≤ x) ⊨ ¬φ for every 3-generic degree x.
We note that, as with the case of Theorem 4.1, the question remains as to whether
one can find “natural” examples of sentences φ which separate the theories of
the lower cones below arithmetically random and arithmetically generic degrees.
Table 2 points to some candidates, for example the complementation and meet
properties, and the property of being a minimal cover.
Acknowledgements Barmpalias was supported by the 1000 Talents Program for Young Scholars
from the Chinese Government, and the Chinese Academy of Sciences (CAS) President’s Interna-
tional Fellowship Initiative No. 2010Y2GB03. Additional support was received by the CAS and the
Institute of Software of the CAS. Partial support was also received from a Marsden grant of New
Zealand and the China Basic Research Program (973) grant No. 2014CB340302. Lewis-Pye was
formerly named Lewis, and was supported by a Royal Society University Research Fellowship.
References
29. R. Shore, The Turing degrees below generics and randoms. J. Symb. Log. 79, 171–178 (2014)
30. C. Spector, Measure-theoretic construction of incomparable hyperdegrees. J. Symb. Log. 23,
280–288 (1958)
31. F. Stephan, Martin-Löf random and PA-complete sets, in Logic Colloquium ’02. Lecture Notes
in Logic, vol. 27 (Association for Symbolic Logic, La Jolla, 2006), pp. 342–348
32. A.M. Turing, Systems of logic based on ordinals. Proc. Lond. Math. Soc. 45, 161–228 (1939)
33. C.E.M. Yates, Banach-Mazur games, comeager sets and degrees of unsolvability. Math. Proc.
Camb. Philos. Soc. 79, 195–220 (1976)
Proof Theoretic Analysis by Iterated Reflection
L.D. Beklemishev
Abstract Progressions of iterated reflection principles can be used as a tool for the
ordinal analysis of formal systems. Moreover, they provide a uniform definition of
a proof-theoretic ordinal for any arithmetical complexity Π⁰ₙ. We discuss various
notions of proof-theoretic ordinals and compare the information obtained by means
of the reflection principles with the results obtained by the more usual proof-
theoretic techniques. In some cases we obtain sharper results, e.g., we define
proof-theoretic ordinals relevant to logical complexity Π⁰₁.
We provide a more general version of the fine structure relationships for iterated
reflection principles (due to Ulf Schmerl). This allows us, in a uniform manner, to
analyze the main fragments of arithmetic axiomatized by restricted forms of induction,
including IΣₙ, IΣₙ⁻, IΠₙ⁻ and their combinations.
We also obtain new conservation results relating the hierarchies of uniform and
local reflection principles. In particular, we show that (for a sufficiently broad class
of theories T) the uniform Σ₁-reflection principle for T is Σ₂-conservative over
the corresponding local reflection principle. This bears some corollaries on the
hierarchies of restricted induction schemata in arithmetic and provides a key tool
for our generalization of Schmerl’s theorem.
This article is a reprint of [6] with a new Sect. 1, “Preliminary Notes”, added.
L.D. Beklemishev ()
Steklov Mathematical Institute, Gubkina 8, 119991 Moscow, Russia
e-mail: bekl@mi.ras.ru
1 Preliminary Notes
A.M. Turing’s 1939 paper Systems of logic based on ordinals, based on his Princeton
Ph.D. Thesis, is one of the longest but perhaps lesser-known among his various
contributions. It is mainly cited because, almost as a side remark, he gave in it a
definition of computability relative to an oracle or an oracle machine, a technical
notion that proved to be fundamentally important in the subsequent development of
the theory of recursive functions and degrees of unsolvability.¹ However, the main
topic of that paper, its motivations and, mostly negative, results belong to a different
part of logic, proof theory. They are in many respects remarkable, but at the same
time too complicated to be easily explained to a non-specialist.
A detailed survey of Turing’s pioneering paper, which includes a very useful
restatement of Turing’s results in the modern language, has been provided by
Feferman [11]. Therefore, here my comments are restricted to the necessary minimum
intended to explain the relationships between Turing’s work and my paper reprinted
in this volume.
The subject of ordinal logics or, as S. Feferman later called them, transfinite
recursive progressions of axiomatic systems, emerges as inevitably as any good
mathematical notion. Practically anyone thinking about the meaning of Gödel’s
incompleteness theorems sooner or later comes to the question: “If, as Gödel claims,
the consistency of a (consistent) formal system S is unprovable in S itself,
S ⊬ Con(S),
Of course, the “depth” Turing mentions here must not be equated with the depth
of mathematical proofs as understood in the ordinary discourse. Even elementary
proofs formalizable in some basic system S can be deeper than trivial proofs
¹ See also Rathjen [30] for some uses of oracles in proof theory.
In other words, we would only know that a number a such that S_a proves Fermat’s
last theorem denotes an ordinal if we already knew that this theorem was true. Thus,
ordinal logics do not really allow one to “overcome” the incompleteness phenomenon,
but only shift the problem to that of recognizing artificial ordinal descriptions.
A major contribution to the study of recursive progressions was the already
quoted work of Feferman [9] who gave them a thorough and technically clean
treatment based on the use of Kleene’s system of ordinal notation O. He showed,
among other things, that the progression based on the iteration of the unrestricted
uniform reflection principle (as opposed to the local reflection principle considered
² These are his notations for recursive well-orderings.
This passage describes fairly well what is going on in [6]: We sacrifice the
completeness of ordinal logics in favor of invariance. The price that we pay
is that we are only able to classify arithmetical sentences within some specific
intervals of theories. To achieve the technically rather delicate results we have to
deal with more restricted kinds of progressions than in [9], the so-called smooth
progressions. Nevertheless, many things have become easier than in Feferman’s
work. In particular, we do not have to apply a formalization of the recursion theorem in
arithmetic to construct recursive progressions but instead use simpler arithmetical
fixed point arguments.
In [6] we mainly treat the most basic interval of fragments of arithmetic between
the elementary arithmetic EA and Peano arithmetic PA. Various theories within this
³ These formulas define initial segments of α.
interval have been extensively studied over the years using both model-theoretic
and proof-theoretic methods. By now the zoo of considered fragments of PA
within this interval has become rather large. The main fragments are defined by
restricting the arithmetical complexity of the induction formula and by modifying
in various ways the formulation of induction, for example, one often considers
parameter-free forms of induction and the rule forms. Some schemata alternative to
induction, such as the collection principle, also give rise to new and important series of
fragments of PA (see [14] for a survey). The use of progressions of iterated reflection
principles allows for a systematic treatment of the most important fragments, and
for a formulation of their basic relationships from a unified perspective. In this
development, recursive progressions play the role of a standard measuring stick
against which we compare various systems. This is what one would usually expect
from a meaningful classification.
The approach taken in this paper is, therefore, rather far from philosophical
concerns, but is aimed at finding optimal (the strongest and the most economical)
formulations of various results in the proof theory of arithmetic. How it relates to
the other existing notions of proof-theoretic ordinals is explained in the introduction
to [6].
2 Introduction
⁴ It is a long-standing open question whether a natural ordinal notation system can be canonically
chosen for sufficiently large constructive ordinals. It has to be noted, however, that the standard
proof-theoretic methods, in practical cases, usually allow one to define natural ordinal notation systems
for suitable initial segments of the constructive ordinals, that is, they simultaneously allow for Π¹₁-
and Π⁰₂-analyses of a theory, whenever they work. Pohlers [27] calls this property profoundness of
the ordinal analysis.
Hilbert’s program. On the other hand, an independent interest in Π⁰₁-analysis is its
relationship with the concept of relative interpretability of formal theories. By the
results of Orey, Feferman and Hájek, this notion (for large classes of theories) is
equivalent to Π⁰₁-conservativity. The proposals to define general notions of proof-
theoretic Π⁰₁-ordinals, however, generally fell victim to just criticism; see [18]. To
refresh the reader’s memory, we discuss one such proposal below.
Indoctrinated by Hilbert’s program, Gentzen formulated his ordinal analysis of
Peano arithmetic as a proof of consistency of PA by transfinite induction up to ε₀.
Accordingly, a naive attempt at generalization was to define the Π⁰₁-ordinal of a
system T as the order type of the shortest primitive recursive well-ordering ≺ such
that the corresponding scheme of transfinite induction TI(≺) proves Con(T).
This definition is inadequate for several reasons. The first objection is that the
formula Con(T) may not be canonical, that is, it really depends on the chosen
provability predicate for T rather than on T itself. Feferman [8] gave examples of
Σ₁-provability predicates externally numerating PA and satisfying Löb’s deriv-
ability conditions such that the corresponding consistency assertions are not PA-
provably equivalent. In Appendix 2 we consider another example of this sort,
for which the two provability predicates correspond to sufficiently natural proof
systems axiomatizing PA. This indicates that the intended Π⁰₁-ordinal of a theory T
can, in fact, be a function of its provability predicate (and possibly some additional
data), rather than just of the set of axioms of T taken externally.⁵ Two possible ways
to avoid this problem are: (1) to restrict the attention to specific natural theories, for
which the canonical provability predicates are known; (2) to stipulate that theories
always come together with their own fixed provability predicates. In other words, if
two deductively equivalent axiom systems are formalized with different provability
predicates, they should be considered as different. As remarked above, the second
option appears to be better than the first, and we stick to it in this paper.
The second objection is that the primitive recursive well-ordering ≺ may be
sufficiently pathological, and then TI(≺) can already prove Con(T) for some well-
ordering ≺ of type ω (as shown by Kreisel). This problem can be avoided if we only
consider natural primitive recursive well-orderings, which are known for certain
initial segments of the constructive ordinals. This would make the definition work
at least for certain classes of theories, whose ordinals are not too large. Notice
that essentially the same problem appears in the definition of the proof-theoretic
Π⁰₂-ordinal described above.
The third objection is that, although Con(T) is a Π⁰₁-formula, the logical com-
plexity of the schema TI(≺) is certainly higher. Kreisel noticed that the formulation
of Gentzen’s result would be more informative if one restricts the complexity of
transfinite induction formulas to primitive recursive, or open, formulas (we denote
this schema TI_{p.r.}(≺)). That is, Gentzen’s result can be recast as a reduction of ω-
induction of arbitrary arithmetical complexity to open transfinite induction up to ε₀.
⁵ This is also typical for the other attempts to define proof-theoretic ordinals “from above” (cf.
Appendix 2 for a discussion).
This formulation allows one to rigorously attribute to, say, PA the natural ordinal
(notation system up to) ε₀. However, for other theories T this approach is not yet
fully satisfactory, for it is easy to observe that TI_{p.r.}(≺) has logical complexity Π⁰₂,
which is higher than Π⁰₁. So, the definition of |T|_{Π⁰₁} as the infimum of order types
of natural primitive recursive well-orderings ≺ such that TI_{p.r.}(≺) proves Con(T),
in fact, reduces a Π⁰₁-principle to a Π⁰₂-principle. The opposite reduction, however,
is not possible. Thus, the ordinals obtained are not necessarily ‘the right ones’. For
example, in this sense the ordinal of PA + Con(PA) happens to be the same number
ε₀, whereas any decent Π⁰₁-analysis should separate the system from PA. One can
attempt to push down the complexity of TI_{p.r.}(≺) by formulating it as a transfinite
induction rule and disallowing nested applications of the rule, but in the end this
would look less natural than the approach proposed in this paper.
Proof-Theoretic Analysis by Iterated Reflection The aim of this paper is to
present another approach to proof-theoretic Π⁰₁- and, in general, Π⁰ₙ-analysis for any
n ≥ 1. The treatment of arbitrary n is not substantially different from the treatment
of n = 1. For n = 2 our definition is shown to agree with the usual Π⁰₂-analysis
w.r.t. the Fast Growing hierarchy.⁶ The apparent advantage of the method is that
for the ‘problematic’ cases, such as PA + Con(PA), one obtains meaningful ordinal
assignments. For example, we will see that |PA + Con(PA)|_{Π⁰₁} = ε₀ · 2, which is
well above the Π⁰₁- and Π⁰₂-ordinal ε₀ of PA, as expected.
A basic idea of the proof-theoretic Π⁰ₙ-analysis is that of conservative approxi-
mation of a given theory T by formulas of complexity Π⁰ₙ whose behavior is well
understood. Many properties of T, e.g., its class of p.t.c.f. (provably total computable
functions), can be learned from the known properties of the conservative approximations.
As suitable approximations we take progressions of transfinitely iterated reflection
principles (of the relevant logical complexity). In particular, progressions of iterated
consistency assertions, which are equivalent to iterated Π⁰₁-reflection principles,
provide suitable approximations of complexity Π⁰₁.
The choice of the reflection formulas as the approximating ones has the following
two advantages. First of all, the hierarchies of reflection principles are natural
analogs of the jump hierarchies in recursion theory (this analogy is made more
precise in Sect. 4 of this paper). So, in a sense, they are more elementary than the
other candidate schemata, such as transfinite induction. Second, and more important,
they allow for a convenient calculus. That is, the proof-theoretic ordinals for many
theories can be determined by rather direct calculations, once some basic rules for
handling iterated reflection principles are established. The key tool for this kind of
calculation is Schmerl’s formula [32], which is generalized and given a new
proof in this paper.
⁶ This can be considered as evidence supporting our definition for the other n.
The idea of using iterated reflection principles for the classification of axiomatic
systems goes back to the old works of Turing [37] and Feferman [9]. Given a base
theory T, one constructs a transfinite sequence of extensions of T by iteratively
adding formalized consistency statements, roughly, according to the following
clauses:
(T1) T₀ = T;
(T2) T_{α+1} = T_α + Con(T_α);
(T3) T_α = ⋃_{β<α} T_β, for α a limit ordinal.
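For instance, the first few stages of the progression unfold as follows (a direct reading of (T1)–(T3)):

    T₁ = T + Con(T),   T₂ = T₁ + Con(T₁),   …,   T_ω = ⋃_{n<ω} T_n,   T_{ω+1} = T_ω + Con(T_ω),   …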
By Gödel’s Incompleteness Theorem, whenever the initial theory T is sound,⁷ the
theories T_α form a strictly increasing transfinite sequence of sound Π⁰₁-axiomatized
extensions of T. Choosing for T some reasonable minimal fragment of arithmetic
(in this paper we work over the elementary arithmetic EA), this sequence can be used
to associate an ordinal |U|_{Π⁰₁} to any theory U extending EA as follows:
This definition provides interesting information only for those theories U which
can be well approximated by the sequence EA_α. For such U one should be
able to show that for α = |U|_{Π⁰₁} the theory EA_α axiomatizes all arithmetical
Π⁰₁-consequences of U, that is,

U ≡_{Π₁} EA_α. (1)

(Here and below T ≡_{Πₙ} U means that the theories T and U prove the same
Πₙ-sentences.) Thus, (1) can be viewed as an exact reduction of U to a purely
Π⁰₁-axiomatized theory EA_α, and in this sense |U|_{Π⁰₁} is called the proof-theoretic
Π⁰₁-ordinal of U. Theories U satisfying equivalence (1) are called Π⁰₁-regular.
Verifiability of (1) within, say, EA implies

EA ⊢ Con(U) ↔ Con(EA_α),

and thus |U|_{Π⁰₁} can also be thought of as the ordinal measuring the consistency
strength of the theory U.
The program as described above, however, encounters several technical difficul-
ties. One familiar difficulty is the fact that the clauses (T1)–(T3) do not uniquely
define the sequence of theories T_α, that is, the theory T_α depends on the formal
representation of the ordinal α within arithmetic rather than on the ordinal itself.
For the analysis of this problem Feferman [9] considered families of theories
of the form (T_c)_{c∈O} satisfying (T1)–(T3) along every path within O, where O is
Kleene’s universal system of ordinal notation. Using an idea of Turing, he showed
⁷ That is, if all theorems of T hold in the standard model of arithmetic.
that every true Π⁰₁-sentence is provable in T_c for a suitable ordinal notation c ∈ O
with |c| = ω + 1. It follows that there are two ordinal notations a, b ∈ O with
|a| = |b| = ω + 1 such that T_a proves Con(T_b), and this observation seems to break
down the program of associating ordinals to theories as described above, at least in
the general case.
However, a possibility remains that for natural (mathematically meaningful)
theories U one can exhaust all Π⁰₁-consequences of U using only specific natural
ordinal notations, and a careful choice of such notations should yield proper ordinal
bounds. This idea has been developed in the work of Schmerl [32], who showed
among other things that for natural ordinal notations

PA ≡_{Π₁} PRA_{ε₀}.

This essentially means that |PA|_{Π⁰₁} = ε₀, which coincides with the ordinal
associated to PA through other proof-theoretic methods.
The significant work of Schmerl, however, attracted less attention than, in our
opinion, it deserved. This can partially be explained by the rather special character of
the results as they were stated in his paper. At present, 20 years later, thanks to the
development of provability logic and formal arithmetic, we know much more about
the structure of the fragments of PA, as well as about the properties of provability
predicates. One of the goals of this paper is to revise this work of Schmerl and put it
in its proper context. We provide a simpler approach to defining and treating iterated
reflection principles, which helps to overcome some technical problems and allows
for further development of these methods.
Plan of the Paper In Sect. 3 we define progressions of iterated reflection principles
and note some basic facts about them. This allows us to rigorously define Π⁰ₙ-ordinals
of theories following the ideas presented in the introduction.
In Sect. 4 we relate, in a very general setup, the hierarchy of iterated Π⁰₂-reflection
principles and the Fast Growing hierarchy. This shows that our approach, for the
particular case of logical complexity Π⁰₂, agrees with the usual proof-theoretic
Π⁰₂-analysis and provides the expected kind of information about the classes
of provably total computable functions. Proofs of some technical lemmata are
postponed till Appendix 1.
Section 5 can be read essentially independently from the previous parts of
the paper. It presents a new conservation result relating the uniform and local
reflection schemata. In particular, it is shown that the uniform Π₂-reflection principle is
Σ₂-conservative over the local Σ₁-reflection principle. This yields as an immediate
corollary the result in [15] on the relation between parametric and parameter-free
induction schemata: IΣₙ is Σ_{n+2}-conservative over IΣₙ⁻. The results of that section
also provide a clear proof of a particular case of Schmerl’s formula, which already
has some meaningful corollaries for fragments of PA. At the same time it serves as
a basis for a generalization given in the further sections.
Section 6, aiming at a proof of Schmerl’s formula, presents a few lemmata
to the effect that some conservation results for noniterated reflection principles
In defining iterated reflection principles we closely follow [3]. Our present approach
is slightly more general, but the proofs of basic lemmas remain essentially the same,
so we just fix the terminology and indicate some basic ideas.
Iterated Consistency Assertions We deal with first order theories formulated in
a language containing that of arithmetic. Our basic system is Kalmár elementary
arithmetic EA (or IΔ₀(exp), cf. [14]). For convenience we assume that a symbol
for the exponentiation function 2^x is explicitly present in the language of EA. EA⁺
denotes the extension of EA by an axiom stating the totality of the superexponential
function 2^x_x (or IΔ₀ + supexp). EA⁺ is the minimal extension of EA where the cut-
elimination theorem for first order logic is provable. Hence, it will often play the
role of a natural metatheory for various arguments in this paper.
Elementary formulas are bounded formulas in the language of EA. A theory T
is elementary presented if it is equipped with a numeration, that is, an elementary
formula Ax_T(x) defining the set of axioms of T in the standard model of arithmetic.
By an elementary linear ordering (D, ≼) we mean a pair of elementary formulas
x ∈ D and x ≼ y such that EA proves that the relation ≼ linearly orders the domain
D. An elementary well-ordering is an elementary linear ordering which is well-
founded in the standard model.
Given an elementary linear ordering (D, ≼), we use Greek variables α, β, γ, etc.
to denote the elements of D (and the corresponding ordinals). Since D is elementary
definable, these variables can also be used within EA.
An elementary formula Ax_T(α, x) numerates a family of theories (T_α)_{α∈D} if for
each α the formula Ax_T(ᾱ, x) defines the set of axioms of T_α in the standard model.
If such a formula Ax_T exists, the family (T_α)_{α∈D} is called uniformly elementary
presented.
From Ax_T(α, x), as well as from a numeration of an individual theory, the
(parametric) provability predicate □_T(α, x) and the consistency assertion Con(T_α)
are constructed in a standard way. Specifically, there is a canonical Σ⁰₁-formula
P(X, x), with a set parameter X and a number parameter x, expressing the fact that
x codes a formula logically provable from the set of (non-logical) axioms coded by
X. Then □_T(α, x) := P({u : Ax_T(α, u)}, x) and Con(T_α) := ¬□_T(α, ⊥). Notice
that □_T(α, x) is first order Σ₁, and Ax_T(α, u) occurs in □_T(α, x) as a subformula
[replacing the occurrences of the form u ∈ X in P(X, x)].
As usual, we write □_T(α, φ) instead of □_T(α, ⌜φ⌝), and □_T(α, φ(ẋ)) instead
of □_T(α, ⌜φ(ẋ)⌝). Here ⌜φ(ẋ)⌝ denotes the standard elementary function (and the
corresponding EA-definable term) that maps a number n to the code of the formula
φ(n̄).
Now we present progressions of iterated consistency assertions. Somewhat
generalizing [3], we distinguish between explicit and implicit progressions. Both are
defined by formalizing (in two different ways) the following variant of conditions
(T1)–(T3): for all α ∈ D,

T_α ≡ T + {Con(T_β) : β ≺ α}.
⁸ This is essentially the only place in all the development below where well-foundedness matters.
Actually, for the progressions based on iteration of consistency, well-foundedness w.r.t. the
Σ₂-definable subsets would be sufficient.
Lemma 1 (Existence) For any elementary linear ordering (D, ≼) and any initial
theory T, there is an explicit progression based on iteration of consistency along
(D, ≼).
Proof The definition (2) has the form of a fixed point equation. Indeed, the formula
Con(T_β) is constructed effectively from Ax_T(β, x), essentially by replacing x by u
and substituting the result for u ∈ X into ¬P(X, ⊥). Hence, there is an elementary
definable term con that outputs the Gödel number of Con(T_β) given the Gödel
number of Ax_T(β, x). Then Eq. (2) can be rewritten as follows:

EA ⊢ Ax_T(α, x) ↔ Ax_T(x) ∨ ∃β ≤ x (β ≺ α ∧ x = con(⌜Ax_T(β̇, x)⌝)). (4)

The fixed point lemma guarantees that an elementary solution Ax_T(α, x) exists. To
see that the solution satisfies (2) it only has to be noted that, assuming the Gödel
numbering we use is standard, provably in EA for any β,
In the following, the words progression based on iteration of consistency will
always refer to implicit progressions. We note the obvious monotonicity property of
such progressions:
Lemma 2 EA ⊢ α ≼ β → (□_T(α, x) → □_T(β, x)).
The next lemma shows that any progression based on iteration of consistency is
uniquely defined by the initial theory and the elementary linear ordering.
Lemma 3 (Uniqueness) Let U and V be elementary presented extensions of EA,
(D, ≼) an elementary linear ordering, and (U_α)_{α∈D} and (V_α)_{α∈D} progressions based on
iteration of consistency with the initial theories U and V, respectively. Then
T ⊢ ∀α (□_T(∀β ≺ α̇ A(β)) → A(α))  ⟹  T ⊢ ∀α A(α).
Partial reflection principles are obtained from the above schemata by imposing
the restriction that φ belongs to one of the classes Γ of the arithmetical hierarchy
(denoted Rfn_Γ(T) and RFN_Γ(T), respectively). See [3, 19, 35] for some basic
information about reflection principles.
We shall also consider the following metareflection rule:

RR_{Πₙ}(T):  φ / RFN_{Πₙ}(T + φ),

i.e., from a proof of φ one infers RFN_{Πₙ}(T + φ). We let Πₘ-RR_{Πₙ}(T) denote the
above rule with the restriction that φ is a Πₘ-sentence.
For n ≥ 1, Πₙ(ℕ) denotes the set of all true Πₙ-sentences. True_{Πₙ}(x) denotes a
canonical truth definition for Πₙ-sentences, that is, a Πₙ-formula naturally defining
the set of Gödel numbers of sentences in Πₙ(ℕ) in EA.
Let T be an elementary presented theory containing EA. The set of axioms of
the theory T + Πₙ(ℕ) can be defined, e.g., by the Πₙ-formula Ax_T(x) ∨ True_{Πₙ}(x).
Then the formula

□^{Πₙ}_T(x) := P({u : Ax_T(u) ∨ True_{Πₙ}(u)}, x)
naturally represents the $\Sigma_{n+1}$-complete provability predicate for $T + \Pi_n(\mathbb N)$, and $\mathrm{Con}_{\Pi_n}(T) := \neg\Box^{\Pi_n}_T(\bot)$ is the corresponding consistency assertion. (For $n = 0$ these coincide with the usual provability predicate and consistency assertion for $T$.) The predicate $\Box^{\Pi_n}_T$ satisfies provable $\Sigma_{n+1}$-completeness:

$$\mathrm{EA} \vdash \forall x\,\bigl(\sigma(x) \to \Box^{\Pi_n}_T\,\sigma(\dot x)\bigr),$$

for any $\Sigma_{n+1}$-formula $\sigma(x)$. Besides, for $n \ge 0$ the following relationships are known (see [4]).
Lemma 6 For any elementary presented theory $T$ containing EA, the following schemata are equivalent over EA:

(i) $\mathrm{Con}_{\Pi_n}(T)$;
(ii) $\mathrm{RFN}_{\Pi_{n+1}}(T)$;
(iii) $\mathrm{RFN}_{\Sigma_n}(T)$.

This shows that the uniform reflection principles are generalizations of the consistency assertions to higher levels of the arithmetical hierarchy. (Notice that the schema $\mathrm{RFN}_{\Pi_1}(T)$ is equivalent to the standard consistency assertion $\mathrm{Con}(T)$.)
Relativized local reflection principles are generally not equivalent to any of the previously considered schemata. They are defined as follows:

$$\mathrm{Rfn}^{\Pi_n}_{\Sigma_m}(T):\quad \Box^{\Pi_n}_T\,\varphi \to \varphi, \quad \text{for } \varphi \in \Sigma_m,$$

and similarly for the local $\Pi_m$-reflection principle. Notice that the relativized analog of, say, $\mathrm{Rfn}_{\Sigma_m}(T)$ is actually $\mathrm{Rfn}^{\Pi_n}_{\Sigma_{m+n}}(T)$.
Progressions based on iteration of other reflection schemata are defined similarly. This can be done as follows. Since the instances of the reflection principles are elementarily recognizable, with each of the above schemata $\Phi$ one can naturally associate an elementary formula $\Phi\text{-code}(e, x)$ expressing that $e$ is the code of a $\Sigma_1$-formula $\sigma_U(v)$, and $x$ is the code of an instance of $\Phi(U)$ formulated for $\sigma_U$. Then $\mathrm{Ax}_T(\alpha, x)$ is called an explicit numeration of a progression based on iteration of $\Phi$, if

$$\mathrm{EA} \vdash \mathrm{Ax}_T(\alpha, x) \leftrightarrow \bigl(\mathrm{Ax}_T(x) \lor \exists\beta \prec \alpha\ \Phi\text{-code}(\ulcorner\Box_T(\dot\beta, v)\urcorner, x)\bigr);$$
Then the analogs of the existence, monotonicity, and uniqueness lemmas hold for such progressions too, with similar proofs, which we omit.
$\Pi^0_n$-Ordinals Let an elementary well-ordering $(D, \prec)$ be fixed. All the definitions below are to be understood relative to this ordering. We define $|T|_{\Pi^0_n}$ to be the supremum of those $\alpha \in D$ for which $(\mathrm{EA})^n_\alpha \subseteq T$. If the ordering $(D, \prec)$ is too short, that is, if for all $\alpha \in D$, $(\mathrm{EA})^n_\alpha \subseteq T$, we can set $|T|_{\Pi^0_n} := \infty$.

A theory $T$ is $\Pi^0_n$-regular, if there is an $\alpha \in D$ such that

$$T \equiv_{\Pi_n} (\mathrm{EA})^n_\alpha. \qquad (5)$$

Notice that $\Pi^0_n$-regular theories are $\Pi^0_n$-sound, because $(\mathrm{EA})^n_\alpha$ is. If the equivalence (5) is provable in a (meta)theory $U$, then $T$ is called $U$-provably $\Pi^0_n$-regular. For $U \supseteq T$ in this case we have $\alpha = |T|_{\Pi^0_n}$, because the formalization of (5) pins the ordinal $\alpha$ down uniquely.
We now introduce a hierarchy of fast growing functions. For $\alpha \in D$ define (with $F^{(v)}$ denoting the $v$-fold iteration of $F$):

$$F_\alpha(x) := \max\bigl(\{2^x_x + 1\} \cup \{F^{(v)}_\beta(u) + 1 : \beta \prec \alpha;\ \beta, u, v \le x\}\bigr). \qquad (*)$$

Since $(D, \prec)$ is well-founded, all $F_\alpha$ are well-defined. The functions $F_\alpha$ generate the hierarchy of function classes

$$\mathcal F_\alpha := \mathcal E(\{F_\beta : \beta \prec \alpha\}).$$
One easily verifies that for the initial elements $\alpha \in D$ the classes $\mathcal F_\alpha$ coincide with the classes of the familiar Grzegorczyk hierarchy: $\mathcal F_0 = \mathcal E$, $\mathcal F_1 = \mathcal E^4$, …, $\mathcal F_\omega = $ the primitive recursive functions, …. The further classes are a natural extension of the Grzegorczyk hierarchy into the transfinite. Notice that this hierarchy is defined for an arbitrary (not necessarily natural) well-ordering and does not depend on the assignment of fundamental sequences.
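To make the recursion $(*)$ concrete, here is a minimal executable sketch, assuming the ordering $(D, \prec)$ is the usual order on the natural numbers and reading $2^x_x$ as a stack of $x$ twos topped by $x$; the names `two_tower`, `iterate`, and `F` are ours, chosen for illustration only:

```python
def two_tower(n: int, x: int) -> int:
    # 2_n^x: a stack of n twos topped by x, i.e. 2_0^x = x, 2_{n+1}^x = 2 ** (2_n^x)
    return x if n == 0 else 2 ** two_tower(n - 1, x)

def iterate(f, v: int, u: int) -> int:
    # v-fold iteration f^{(v)}(u), with f^{(0)}(u) = u
    for _ in range(v):
        u = f(u)
    return u

def F(alpha: int, x: int) -> int:
    # Recursion (*) specialised to the ordering (N, <):
    # F_alpha(x) = max({2_x^x + 1} U {F_beta^{(v)}(u) + 1 : beta < alpha; beta, u, v <= x})
    values = [two_tower(x, x) + 1]
    for beta in range(min(alpha, x + 1)):      # beta < alpha and beta <= x
        for u in range(x + 1):
            for v in range(x + 1):
                values.append(iterate(lambda z: F(beta, z), v, u) + 1)
    return max(values)

print(F(0, 1), F(1, 1))  # => 3 4; the values explode for any larger arguments
```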
A slight modification of this hierarchy has recently been proposed by Weiermann and studied in detail by Möllerfeld [22]. Building on some previous results (see [31] for an overview), he relates this hierarchy to some other natural hierarchies of function classes. Since our hierarchy has to be reasonably representable in EA, in some respects we need a sharper treatment than in [22].

Proofs of the following two lemmas will be given in the Appendix.

Lemma 7 $F_\alpha(x) = y$ is an elementary relation of $\alpha$, $x$, and $y$.

Notice that a priori we only know that this relation is recursive. Let $F_\alpha(x) \simeq y$ be a natural elementary formula representing it.
Lemma 8 The following properties are verifiable in EA:

(i) $(x_1 \le x_2 \land F_\alpha(x_1) \simeq y_1 \land F_\alpha(x_2) \simeq y_2) \to y_1 \le y_2$;
Let $S_\alpha$ denote the theory $\mathrm{EA} + \{F_\beta{\downarrow} : \beta \prec \alpha\}$, where $F_\beta{\downarrow}$ is the $\Pi_2$-sentence $\forall x\,\exists y\ F_\beta(x) \simeq y$ expressing the totality of $F_\beta$. Our aim is to prove that $(S_\alpha)_{\alpha\in D}$ is deductively equivalent to the progression of iterated uniform $\Pi_2$-reflection principles over EA. Notice that $S_0 \equiv \mathrm{EA}$, and $S_\alpha$ contains $\mathrm{EA}^+$ for $\alpha \succ 0$.

Theorem 1 Provably in $\mathrm{EA}^+$, $\forall\alpha\ \bigl(S_\alpha \equiv (\mathrm{EA})^2_\alpha\bigr)$.

As a corollary we obtain the following statement.

Corollary 9 For all $\alpha \in D$, $\mathcal F((\mathrm{EA})^2_\alpha) = \mathcal F_\alpha$.
Proof Obviously, symbols for all functions $F_\beta$ for $\beta \prec \alpha$ can be introduced into the language of $S_\alpha$. The corresponding definitional extension of $S_\alpha$ admits a purely universal axiomatization, because the graphs of all $F_\beta$ are elementary. By Herbrand's theorem, the provably total computable functions of this extension are generated by explicit composition from the functions $F_\beta$ and the elementary functions, which coincides with the class $\mathcal F_\alpha$, since all functions $F_\beta$ are monotone and have elementary graphs.
Proof of Theorem 1 By the uniqueness lemma it is sufficient to establish within $\mathrm{EA}^+$ that $(S_\alpha)_{\alpha\in D\setminus\{0\}}$ is an implicit progression based on iteration of uniform $\Pi_2$-reflection principles over $\mathrm{EA}^+$. So, we show the following main

Lemma 10 Provably in $\mathrm{EA}^+$, for all $\alpha \succ 0$,

$$S_\alpha \equiv \mathrm{EA}^+ + \{\mathrm{RFN}_{\Pi_2}(S_\beta) : \beta \prec \alpha\}.$$

Proof We formalize the proofs of the following two lemmas in $\mathrm{EA}^+$. (Notice that the arguments are local, that is, they do not use any form of transfinite induction on $\alpha$.)
Lemma 11 Provably in EA,

$$\forall\beta\ \bigl(\mathrm{EA} + \mathrm{RFN}_{\Pi_2}(S_\beta) \vdash F_\beta{\downarrow}\bigr).$$

Proof Let $F^{(u)}_\beta(x) \simeq y$ abbreviate the natural elementary formula expressing that the $u$-fold iterate of $F_\beta$ at $x$ equals $y$. One establishes

$$\mathrm{EA} + \mathrm{RFN}_{\Pi_2}(S_\beta) \vdash \forall\gamma \preceq \beta\ \forall x, u\,\exists y\ F^{(u)}_\gamma(x) \simeq y. \qquad (6)$$

Here one uses that

$$\mathrm{EA} \vdash \forall x\,\exists\gamma_0 \le x\ \forall\gamma \le x\,(\gamma \prec \beta \to \gamma \preceq \gamma_0).$$
Conversely, we show that, provably in $\mathrm{EA}^+$,

$$\forall\beta \prec \alpha\ \bigl(S_\alpha \vdash \mathrm{RFN}_{\Pi_2}(S_\beta)\bigr).$$
Proof Let $\bar S_\beta$ denote the definitional extension of $S_\beta$ by function symbols for all the functions $\{F_\gamma : \gamma \prec \beta\}$. Clearly, $\bar S_\beta$ is a conservative extension of $S_\beta$; moreover, this can be shown in $\mathrm{EA}^+$ uniformly in $\beta$. Thus, it is sufficient to prove the lemma for the theories $\bar S_\beta$.
First of all, by a standard result (cf. [12] and [2], Proposition 5.11) based on the monotonicity of the functions $F_\beta$, we obtain that $\bar S_\alpha$ proves induction for bounded formulas in the extended language (and this is, obviously, formalizable).

Second, since Herbrand's theorem is formalizable in $\mathrm{EA}^+$, we have that, for any elementary formula $\theta(y, x)$, if $\bar S_\beta \vdash \forall x\,\exists y\ \theta(y, x)$, then $\bar S_\beta \vdash \forall x\ \theta(t(x), x)$ for some term $t$ of $\bar S_\beta$. So, it is sufficient to establish in $\bar S_\alpha$ the reflection principle for $\bar S_\beta$ for open formulas (in the language of $\bar S_\beta$). This proof is very similar to the proof of Theorem 2 in [2], so we only sketch it.
The proof involves two main ingredients. First, we need a natural evaluation function for terms in the language of $\bar S_\beta$, that is, a function $\mathrm{eval}_\beta(e, x)$ satisfying

$$\mathrm{eval}_\beta(e, x) \le F^{(n(e))}_{\gamma(e)}(x),$$

where $\gamma(e)$ and $n(e)$ are elementary functions. For the natural coding of terms we can additionally assume that $n(e) < e$, for all $e$. Then we can estimate the evaluation function as follows:

$$F^{(n(e))}_{\gamma(e)}(x) \le F^{(\max(e,x))}_{\gamma(e)}(\max(e, x)) \le F_\beta(\max(e, x, \gamma(e))).$$
The proof of Theorem 2 uses the following auxiliary sequent-style rule, denoted $\widetilde{\mathrm{RR}}_{\Pi_n}(T)$:

$$\frac{\Gamma,\ \varphi(s)}{\Gamma,\ \neg\mathrm{Prf}_T(t, \ulcorner\neg\varphi(\dot s)\urcorner)}$$

for all terms $t$, $s$ and formulas $\varphi(a) \in \Pi_n$, where $\ulcorner\psi(\dot s)\urcorner$ denotes the result of substitution of a term $s$ in the term $\ulcorner\psi(\dot a)\urcorner$. $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$ will denote the same rule with the restriction that $\Gamma$ consists of $\Pi_n$-formulas.
The following lemma states that the terms $\ulcorner\psi(\dot s)\urcorner$ have a natural commutation property.

Lemma 14 For any term $s(\vec x)$, where the list $\vec x$ exhausts all the variables of $s$, and any formula $\psi(a)$ (where $s$ is substitutable in $\psi$ for $a$) there holds:

$$\mathrm{EA} \vdash \forall\vec x\ \bigl(\Box_T\,\psi(s(\dot{\vec x})) \leftrightarrow \Box_T\,\psi(\dot s)\bigr).$$

Proof Obviously,

$$\mathrm{EA} \vdash s(\vec x) = y \to \Box_T(s(\dot{\vec x}) = \dot y) \to \Box_T\bigl(\varphi(s(\dot{\vec x})) \leftrightarrow \varphi(\dot y)\bigr) \to \bigl(\Box_T\,\varphi(s(\dot{\vec x})) \leftrightarrow \Box_T\,\varphi(\dot y)\bigr). \qquad (7)$$
Applying the rule to the logical axiom $\varphi(a), \neg\varphi(a)$ (identifying $\neg\neg\varphi$ with $\varphi$) and generalizing, one obtains an instance of uniform reflection:

$$\varphi(a),\ \neg\varphi(a)$$
$$\varphi(a),\ \neg\mathrm{Prf}_T(b, \ulcorner\varphi(\dot a)\urcorner)$$
$$\varphi(a),\ \forall y\,\neg\mathrm{Prf}_T(y, \ulcorner\varphi(\dot a)\urcorner)$$
$$\forall x\,(\Box_T\,\varphi(\dot x) \to \varphi(x)).$$

Let $\Diamond^{\Pi_n}_T\varphi$ denote $\neg\Box^{\Pi_n}_T\neg\varphi$. Notice that for any $\varphi$, $\Diamond^{\Pi_n}_T\varphi$ is EA-equivalent to $\mathrm{RFN}_{\Sigma_n}(T + \varphi)$.
Lemma 16 For $n \ge 1$, the following rules are equivalent (and even congruent in the terminology of Beklemishev [2]):

(i) $\Pi_n\text{-}\mathrm{RR}_{\Pi_n}(T)$,
(ii) $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$,
(iii) $\dfrac{\Gamma,\ \varphi(s)}{\Gamma,\ \Diamond^{\Pi_{n-1}}_T\varphi(\dot s)}$, for $\Gamma \cup \{\varphi\} \subseteq \Pi_n$.
Proof Reduction of (ii) to (iii) is obvious, because

$$\mathrm{EA} \vdash \Diamond^{\Pi_{n-1}}_T\varphi(\dot s) \to \Diamond_T\,\varphi(\dot s) \to \neg\mathrm{Prf}_T(t, \ulcorner\neg\varphi(\dot s)\urcorner).$$
For a reduction of (iii) to (i) we reason as follows. Let $\vec x$ denote the list of all free variables in $\Gamma$ and $s$. Notice that, if $\Gamma \subseteq \Pi_n$, then the universal closure of $\bigvee\Gamma \lor \varphi(s)$ is a $\Pi_n$-sentence, and we can construct the following derivation:

1. $\bigvee\Gamma(\vec x) \lor \varphi(s(\vec x))$
2. $\forall\vec x\ \bigl(\bigvee\Gamma(\vec x) \lor \varphi(s(\vec x))\bigr)$
3. $\Diamond^{\Pi_{n-1}}_T\ \forall\vec x\ \bigl(\bigvee\Gamma(\vec x) \lor \varphi(s(\vec x))\bigr)$ (by $\Pi_n\text{-}\mathrm{RR}_{\Pi_n}(T)$)
4. $\forall\vec x\ \Diamond^{\Pi_{n-1}}_T\bigl(\bigvee\Gamma(\dot{\vec x}) \lor \varphi(s(\dot{\vec x}))\bigr)$ (by Löb's conditions from 3)
5. $\Diamond^{\Pi_{n-1}}_T\bigl(\bigvee\Gamma(\dot{\vec x})\bigr) \to \bigvee\Gamma(\vec x)$ (by provable $\Sigma_n$-completeness of $\Box^{\Pi_{n-1}}_T$)
6. $\bigvee\Gamma(\vec x) \lor \Diamond^{\Pi_{n-1}}_T\varphi(s(\dot{\vec x}))$ (by 4, 5 and Löb's conditions)
7. $\bigvee\Gamma(\vec x) \lor \Diamond^{\Pi_{n-1}}_T\varphi(\dot s)$ (by Lemma 14)
In order to reduce (i) to (ii), for any $\Pi_n$-formula $\varphi$ and any $\Sigma_{n-1}$-formula $\sigma(x)$ one reasons similarly. This gives the required proof of an arbitrary instance of $\mathrm{RFN}_{\Sigma_{n-1}}(T + \varphi)$ from a derivation of $\varphi$.
Resuming the proof of Theorem 2, we show that the standard cut-elimination procedure can be considered as a reduction of $\mathrm{RFN}_{\Sigma_n}(T)$ to $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$. Consider a cut-free derivation of a sequent of the form (8), for some $\varphi(x) \in \Pi_n$. Let $R_\varphi(x, y)$ denote the formula in square brackets. We can also assume that the axioms of $U$ have the form $\forall x_1 \ldots \forall x_m\,\neg A(x_1, \ldots, x_m)$ for some $\Pi_n$-formulas $A(\vec x)$.

By the subformula property, any formula occurring in the derivation of a sequent of the form (8) either (a) is a $\Pi_n$-formula, or (b) has the form $\neg\mathrm{RFN}_{\Sigma_n}(T)$, $\exists x\,R_\varphi(t, x)$ or $R_\varphi(t, s)$, for appropriate terms $s, t$, or (c) has the form $\exists x_{i+1} \ldots \exists x_m\,A(t_1, \ldots, t_i, x_{i+1}, \ldots, x_m)$ for some $i < m$ and terms $t_1, \ldots, t_i$. Let $\Gamma^*$ denote the result of deleting all formulas of types (b) and (c) from $\Gamma$.
Lemma 17 If a sequent $\Gamma$ of the form (8) is cut-free provable, then $\Gamma^*$ is provable from the axioms of $U$ (considered as initial sequents) using the logical rules, including Cut, and the rule $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$.

Proof goes by induction on the height of the derivation $d$ of $\Gamma$. It is sufficient to consider the cases where a formula of the form (b) or (c) is introduced by the last inference in $d$. Besides, it is sufficient to only consider the formulas of the form $R_\varphi(t, s)$ and $\exists x_m\,A(t_1, \ldots, t_{m-1}, x_m)$, because in all other cases after the application of $(\cdot)^*$ the premise and the conclusion of the rule coincide.
So, assume that the last inference of $d$ introduces $R_\varphi(t, s)$; its premises then are sequents of the form (9) and

$$\Gamma,\ \varphi(s). \qquad (10)$$

Since $\Gamma$ consists of $\Pi_n$-formulas, the rule $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$ is applicable to (10), and we obtain a derivation of $\Gamma^*,\ \neg\mathrm{Prf}_T(t, \ulcorner\neg\varphi(\dot s)\urcorner)$. Applying the Cut rule with the sequent (9) we obtain the required derivation of $\Gamma^*$.
If the last inference in $d$ has the form

$$\frac{\Gamma,\ A(t_1, \ldots, t_{m-1}, t_m)}{\Gamma,\ \exists x_m\,A(t_1, \ldots, t_{m-1}, x_m)}\ (\exists)$$

then by the induction hypothesis we obtain a $\Pi_n\text{-}\widetilde{\mathrm{RR}}_{\Pi_n}(T)$-derivation of

$$\Gamma^*,\ A(t_1, \ldots, t_{m-1}, t_m).$$

Then a derivation of $\Gamma^*,\ \exists x_1 \ldots \exists x_m\,A(x_1, \ldots, x_m)$ is obtained by several applications of the rule $(\exists)$. The sequent $\Gamma^*$ is now derived applying the Cut rule with the axiom sequent $\forall x_1 \ldots \forall x_m\,\neg A(x_1, \ldots, x_m)$.
Theorem 2 now follows immediately from Lemmas 16 and 17.
Proposition 18 If $T$ is a $\Pi_{n+1}$-axiomatized extension of EA, then $T + \mathrm{RFN}_{\Sigma_n}(T)$ is $\Pi_n$-conservative over $(T)^n_\omega$.

This statement has been obtained (by other methods) for $T = \mathrm{PRA}$ in [32], and for $n = 1$ and $T = \mathrm{EA}$ in [1].

Proof It is sufficient to notice that $(T)^n_\omega$ is closed under the rule $\Pi_n\text{-}\mathrm{RR}_{\Pi_n}(T)$.
$$U + \neg\pi + \mathrm{RFN}_{\Sigma_n}(T) \vdash \bot,$$

and by Theorem 2

$$U + \neg\pi + \Pi_n\text{-}\mathrm{RR}_{\Pi_n}(T) \vdash \bot.$$

Notice that the rule $\Pi_n\text{-}\mathrm{RR}_{\Pi_n}(T)$ is obviously reducible to the schema $\mathrm{Rfn}^{\Pi_{n-1}}_{\Sigma_n}(T)$. Hence, we obtain

$$U + \neg\pi + \mathrm{Rfn}^{\Pi_{n-1}}_{\Sigma_n}(T) \vdash \bot, \qquad U + \mathrm{Rfn}^{\Pi_{n-1}}_{\Sigma_n}(T) \vdash \pi.$$
Thus, from our characterization of parameter-free induction schemata (cf. [2] or Sect. 8 of this paper) we directly obtain an interesting conservation result due to Kaye et al. [15] (established there by a model-theoretic proof).

Corollary 20 For $n \ge 1$, $I\Sigma_n$ is a $\Sigma_{n+2}$-conservative extension of $I\Sigma^-_n$.
$$\mathrm{EA} \vdash \Box_{V_\beta}\varphi_i \to \Box_{U_\beta}\varphi_i,$$

whence

$$T \vdash \bigwedge_{i=1}^{m}(\Box_{U_{\beta_i}}\varphi_i \to \varphi_i) \to \delta, \qquad
T \vdash \bigwedge_{i=1}^{m}(\Box_{U_\beta}\varphi_i \to \varphi_i) \to \delta, \qquad (11)$$

$$T \vdash \bigwedge_{i=1}^{m}(\Box_{U_\beta}\psi_i \to \psi_i) \to \delta. \qquad (12)$$

(Note that these sentences together with a proof of (12) are constructed elementarily from the proof (11).) The reflexive induction hypothesis implies

$$\mathrm{EA} \vdash \Box_{U_\beta}\psi_i \to \Box_{V_\beta}\psi_i,$$

whence

$$T \vdash \bigwedge_{i=1}^{m}(\Box_{V_\beta}\psi_i \to \psi_i) \to \delta.$$
The next proposition generalizes Proposition 19. Its proof is based on a formalization of Theorem 2, which is possible in $\mathrm{EA}^+$, but not in EA itself. (The nonelementarity is only due to the application of cut-elimination in that proof.)

Proposition 25 If $T$ is a $\Pi_{n+1}$-axiomatized extension of EA, then the statement of Proposition 19 holds provably in $\mathrm{EA}^+$.
$$T + \mathrm{RFN}_{\Sigma_1}(U_\beta) \vdash \pi,$$

that is, $T + \mathrm{RFN}_{\Sigma_1}(V_\beta)$ contains $\mathrm{EA}^+$. On the other hand, by the reflexive induction hypothesis,

$$T + \mathrm{RFN}_{\Sigma_1}(V_\beta) \vdash \pi.$$

It follows that

$$T + \neg\pi + \mathrm{RFN}_{\Sigma_1}(V_\beta) \vdash \bot, \qquad
T + \neg\pi + \Pi_1\text{-}\mathrm{RR}_{\Pi_1}(V_\beta) \vdash \bot,$$

and

$$T + \neg\pi + \mathrm{Rfn}_{\Sigma_1}(V_\beta) \vdash \bot.$$

Thus,

$$T + \mathrm{Rfn}_{\Sigma_1}(V_\beta) \vdash \pi,$$

that is, $V_\alpha \vdash \pi$.
7 Schmerl's Formula

Our approach to Schmerl's formula borrows a general result from [3] relating the hierarchies of iterated local reflection principles and of iterated consistency assertions over an arbitrary initial theory $T$. This result and the results below hold under the assumption that $(D, \prec)$ is a nice elementary well-ordering. A nice well-ordering is an elementary well-ordering equipped with elementary terms representing the ordinal constants and functions $0, 1, +, \cdot, \omega^x$. These functions should, provably in EA, satisfy some minimal obvious axioms NWO listed in [3]. Besides, there should be an elementary EA-provable isomorphism between the natural numbers (with the usual order) and the ordinals below $\omega$. Under these assumptions on the class of well-orderings we have the following theorem [3].

Proposition 26 EA proves that, for all $\alpha, \beta$ such that $\alpha \succeq 1$, there holds

$$(\mathrm{Rfn}(T)_\alpha)_\beta \equiv_{\Pi_1} T_{\omega^\alpha\cdot(1+\beta)}.$$
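In particular, taking $\alpha = 1$ and $\beta = 0$ yields Goryachev's theorem, which is used again in Proposition 39 below:

$$T + \mathrm{Rfn}(T) \equiv_{\Pi_1} T_\omega.$$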
Proof One only has to notice that, provably in EA, for all $\alpha \succeq 1$,

$$(T)^{n+1}_\alpha \equiv_{\Pi_n} (T)^n_{\omega^\alpha}$$

and

$$(T)^{n+1}_\alpha \equiv_{\Sigma_{n+1}} \mathrm{Rfn}^{\Pi_{n-1}}_{\Sigma_n}(T)_\alpha \equiv_{\Pi_n} (T)^n_{\omega^\alpha}.$$
$$\mathrm{EA} \vdash U \equiv V \ \Longrightarrow\ \mathrm{EA} \vdash \forall\alpha\ \bigl((U)^n_\alpha \equiv (V)^n_\alpha\bigr).$$
Proof Part (i) follows by $m$-fold application of Theorem 3. For a proof of (ii) notice that by (i) and Proposition 25

$$(T)^{n+m}_\alpha \equiv_{\Pi_{n+1}} (T)^{n+1}_{\omega_{m-1}(\alpha)} \equiv_{\Sigma_{n+1}} \mathrm{Rfn}^{\Pi_{n-1}}_{\Sigma_n}(T)_{\omega_{m-1}(\alpha)},$$

$$\bigl((T)^{n+m}_\alpha\bigr)^n_\beta \equiv_{\Pi_n} (T)^n_{\omega_m(\alpha)\cdot(1+\beta)}.$$
This theorem directly applies to theories $T$ like EA or PRA. Following Schmerl's idea it is also possible to derive from it similar formulas for iterated reflection principles over PA, which we obtain in the next section.

Corollary 30 If $T$ is a $\Pi^0_2$-regular theory, then it is $\Pi^0_1$-regular and $|T|_{\Pi^0_1}$ is one $\omega$-power higher than $|T|_{\Pi^0_2}$.
Now we have at our disposal all the necessary tools to give an ordinal analysis of arithmetic and its fragments. The general methodology bears similarities with the traditional $\Pi^1_1$-ordinal analysis, see [28]. To determine the ordinal of a given formal system $T$ one first finds a suitable embedding of $T$ into the hierarchy of reflection principles over EA. Then one applies Schmerl's formula for a reduction of the reflection principles axiomatizing $T$ to iterated reflection principles of lower complexity. From this point of view the use of Schmerl's formula replaces the use of cut-elimination for $\omega$-logic. Notice that, although the meaning of the ordinals is different, the ordinal bounds are essentially the same in both approaches. Also notice that in the present approach the embedding part is more informative. In particular, this allows one to obtain upper and lower bounds for proof-theoretic ordinals simultaneously.
The following embedding results are known ($n \ge 1$).

(E1) Leivant [20] and Ono [23] show that $I\Sigma_n$ is equivalent to $\mathrm{RFN}_{\Pi_{n+2}}(\mathrm{EA})$ over EA, that is,

$$I\Sigma_n \equiv (\mathrm{EA})^{n+2}_1.$$

Notice that this is sharper than the related results in [32] and the original result by Kreisel and Lévy [19] stating that

$$\mathrm{PA} \equiv \mathrm{EA} + \mathrm{RFN}(\mathrm{EA}).$$
(E2) For the closures of EA under $\Sigma_n$- and $\Pi_{n+1}$-induction rules we have the following characterization (cf. [2]):

(E3)(c) Over EA the schema $I\Pi^-_1$ is equivalent to the local $\Sigma_2$-reflection principle for EA formulated for the predicate of cut-free provability; see [5] and Appendix 3 for more details.
Remark 31 The upper bound results only require the embeddings (E1)–(E3) from left to right, that is, the provability of the induction principles from suitable forms of the reflection principles. All these embeddings are very easy to prove using a trick by Kreisel [19]. For example, in order to prove $I\Sigma_n \subseteq (\mathrm{EA})^{n+2}_1$, let $\varphi(x, a)$ be any $\Sigma_n$-formula and consider the formula $\psi(x, a) := (\varphi(0, a) \land \forall u\,(\varphi(u, a) \to \varphi(u+1, a))) \to \varphi(x, a)$. $\psi$ is logically equivalent to a $\Pi_{n+2}$-formula. By an elementary induction on $m$ it is easy to see that, for all $m, k$, $\mathrm{EA} \vdash \psi(\bar m, \bar k)$, and this fact is formalizable in EA. Hence $\mathrm{EA} \vdash \forall x, a\ \Box_{\mathrm{EA}}\,\psi(\dot x, \dot a)$, and applying uniform $\Pi_{n+2}$-reflection yields $\forall x\,\forall a\ \psi(x, a)$, that is, the induction axiom for $\varphi$.

$$I\Sigma_n + I\Pi^-_{n+1} \equiv_{\Pi_{n+1}} (I\Sigma_n)^{n+1}_\omega \equiv \bigl((\mathrm{EA})^{n+2}_1\bigr)^{n+1}_\omega.$$
Applying Theorem 4 yields the following corollary, which determines its $\Pi^0_2$- and $\Pi^0_1$-ordinals.

Proposition 37 $I\Sigma_n + I\Pi^-_{n+1} \equiv_{\Pi_2} (\mathrm{EA})^2_{\omega_n(2)} \equiv_{\Pi_1} \mathrm{EA}_{\omega_{n+1}(2)}$.
Notice that by (E3)(c) the theory $\mathrm{EA}^+ + I\Pi^-_1$ is certainly not $\Pi_2$-conservative (not even $\Pi_1$-conservative) over $\mathrm{EA}^+$. Yet, the next proposition shows that its class of provably total computable functions coincides with that of $\mathrm{EA}^+$. This means that $\mathrm{EA}^+ + I\Pi^-_1$ is not $\Pi^0_2$-regular (its $\Pi^0_2$-ordinal equals 1).

Proposition 38
(i) $\mathcal F(\mathrm{EA}^+ + I\Pi^-_1) = \mathcal F(\mathrm{EA}^+) = \mathcal F_1$;
(ii) $\mathcal F(\mathrm{EA} + I\Pi^-_1) = \mathcal F_0 = \mathcal E$.
Proof By (E3)(c), $\mathrm{EA}^+ + I\Pi^-_1$ is contained in $\mathrm{EA}^+ + \mathrm{Rfn}(\mathrm{EA}^+)$ and similarly

$$\mathrm{EA} + I\Pi^-_1 \subseteq \mathrm{EA} + \mathrm{Rfn}(\mathrm{EA}).$$

Feferman [8] noticed that $\mathrm{Rfn}(T)$ is provable in $T$ together with all true $\Pi_1$-sentences. Yet, it is equally well known that adding any amount of true $\Pi_1$-axioms to a sound theory does not increase its class of provably total computable functions. This proves both parts (i) and (ii).
Proposition 39 $\mathrm{EA}^+ + I\Pi^-_1$ is $\Pi^0_1$-regular and $|\mathrm{EA}^+ + I\Pi^-_1|_{\Pi^0_1} = \omega\cdot 2$.

Proof Recall that by Proposition 26 (for this particular case established by Goryachev [13]) $T + \mathrm{Rfn}(T)$ is $\Pi_1$-conservative over $T_\omega$. Hence, by Theorem 4,

$$\mathrm{EA}^+ + I\Pi^-_1 \equiv_{\Pi_1} (\mathrm{EA}^+)_\omega \equiv_{\Pi_1} \bigl((\mathrm{EA})_\omega\bigr)_\omega \equiv (\mathrm{EA})_{\omega\cdot 2}.$$
$$\mathrm{PA} \equiv_{\Pi_{n+1}} (\mathrm{EA})^{n+1}_{\varepsilon_0}.$$

Formula (ii) follows from (i) and Theorem 3.
This paper demonstrates the use of reflection principles for the ordinal analysis of
fragments of Peano arithmetic. More importantly, reflection principles provide a
uniform definition of proof-theoretic ordinals for any arithmetical complexity …0n ,
in particular, for the complexity …01 .
The results of this paper are further developed in our later paper [7], where the
notion of graded provability algebra is introduced. It provides an abstract algebraic
framework for proof-theoretic analysis and links canonical ordinal notation systems
with certain algebraic models of provability logic. We hope that this further
development will shed additional light on the problem of canonicity of ordinal
notations.
Appendix 1

$$\alpha[x] := \max\{\beta \le x : \beta \prec \alpha\},$$
$$\beta \prec_x \alpha\ :\Longleftrightarrow\ (\beta \le x \land \beta \prec \alpha).$$

For technical convenience we also define $F_{-1}(x) = 2^x$ and $\alpha[x] = -1$, if there is no $\beta \le x$ with $\beta \prec \alpha$.
Lemma 43 For all $\alpha, \beta, x, y$:

(i) $x \le y \to F_\alpha(x) \le F_\alpha(y)$;
(ii) $\beta \preceq \alpha \to F_\beta(x) \le F_\alpha(x)$.

Proof Part (i) is obvious. Part (ii) follows from the fact that

$$\gamma \prec_x \beta \preceq \alpha\ \Longrightarrow\ \gamma \prec_x \alpha.$$

Lemma 44 For all $\alpha, x$: $F_\alpha(x) = F^{(x)}_{\alpha[x]}(x) + 1$.

Proof This is obvious for $\alpha[x] = -1$. Otherwise, from Part (i) of the previous lemma we obtain

$$u, v \le x \to F^{(v)}_\beta(u) \le F^{(x)}_\beta(x),$$
$$\beta \prec_x \alpha \to F^{(y)}_\beta(x) \le F^{(y)}_{\alpha[x]}(x).$$
Hence

$$F^{(v)}_\beta(u) \le F^{(x)}_{\alpha[x]}(x),$$

whence the claim follows.
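As a sanity check of this recursion along the usual ordering of the natural numbers, using the conventions $F_{-1}(x) = 2^x$ and $0[x] = -1$, we get

$$F_0(x) = F^{(x)}_{-1}(x) + 1 = 2^x_x + 1,$$

where $2^x_x$ is the $x$-fold iterated exponential applied to $x$, in accordance with the recursion $(*)$ and property F1 below.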
We now define evaluation trees. An evaluation tree is a finite tree labeled by tuples of the form $\langle\alpha, x, y\rangle$ satisfying the following conditions:

1. If there is no $\beta \le x$ with $\beta \prec \alpha$, then $\langle\alpha, x, y\rangle$ is a terminal node and $y = 2^x_x + 1$.
2. Otherwise, there are $x$ immediate successors of $\langle\alpha, x, y\rangle$; their labels have the form $\langle\alpha[x], y_i, y_{i+1}\rangle$ for $i < x$, where $y_0 = x$, and $y = y_x + 1$.
Obviously, the relation "$x$ is a code of an evaluation tree" is elementary.
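A small executable sketch may help to visualize clauses 1 and 2; as before we work along the usual ordering of the natural numbers, so that $\alpha[x] = \min(\alpha - 1, x)$ for $\alpha > 0$, and the function names are ours, chosen for illustration only:

```python
def two_tower(n: int, x: int) -> int:
    # 2_n^x: a stack of n twos topped by x
    return x if n == 0 else 2 ** two_tower(n - 1, x)

def alpha_at(alpha: int, x: int) -> int:
    # alpha[x] = max{beta <= x : beta < alpha}, or -1 if there is none
    return min(alpha - 1, x) if alpha > 0 else -1

def build_tree(alpha: int, x: int):
    # Returns (label, children) with label <alpha, x, y>; by Lemma 45
    # the root label satisfies F_alpha(x) = y.
    a = alpha_at(alpha, x)
    if a == -1:                                   # clause 1: terminal node
        return ((alpha, x, two_tower(x, x) + 1), [])
    y, children = x, []                           # y_0 = x
    for _ in range(x):                            # clause 2: x successors
        child = build_tree(a, y)                  # label <alpha[x], y_i, y_{i+1}>
        children.append(child)
        y = child[0][2]
    return ((alpha, x, y + 1), children)          # y = y_x + 1

print(build_tree(1, 1))  # => ((1, 1, 4), [((0, 1, 3), [])]), so F_1(1) = 4
```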
Lemma 45
(i) If a node of an evaluation tree is labeled by $\langle\alpha, x, y\rangle$, then $F_\alpha(x) = y$.
(ii) If $F_\alpha(x) = y$, then there is an evaluation tree with the root labeled by $\langle\alpha, x, y\rangle$.

Proof Part (i) is proved by transfinite induction on $\alpha$. If $\alpha[x] = -1$, the statement is obvious. Otherwise, $\alpha[x] \prec \alpha$, hence by the induction hypothesis at the immediate successor nodes one has $F_{\alpha[x]}(y_i) = y_{i+1}$ for all $i < x$ (where we also put $y_0 := x$). It follows that $y_x = F^{(x)}_{\alpha[x]}(x)$, whence $y = y_x + 1 = F_\alpha(x)$.

Part (ii) obviously follows from the definition of $F_\alpha$.
Now we observe that, for any evaluation tree $T$ whose root is labeled by $\langle\alpha, x, y\rangle$, the value $\max(\alpha, y)$ is a common bound for the following parameters:

(a) each $\gamma, u, v$ such that $\langle\gamma, u, v\rangle$ occurs in $T$;
(b) the number of branches at every node of $T$;
(c) the depth of $T$.

Ad (a): If $\langle\gamma, u, v\rangle$ is an immediate successor of $\langle\alpha, x, y\rangle$, then $\gamma = \alpha[x] \le x$, and $u, v \le y$ by the monotonicity of $F$.

Statement (b) follows from (a) and the fact that the number of branches at a node $\langle\gamma, u, v\rangle$ equals $u$.

Statement (c) follows from the observation that if $\langle\gamma, u, v\rangle$ is an immediate successor of $\langle\alpha, x, y\rangle$, then $y > v$.

An immediate corollary of (a)–(c) is that the code of the evaluation tree $T$ is bounded by the value $g(\max(\alpha, y))$, for an elementary function $g$. Hence we obtain
Proof of Lemma 7 Using Lemma 45 the relation $F_\alpha(x) = y$ can be expressed by formalizing the statement that there is an evaluation tree with code at most $g(\max(\alpha, y))$, whose root is labeled by $\langle\alpha, x, y\rangle$. All quantifiers here are bounded, hence the relation $F_\alpha(x) = y$ is elementary.
Inspecting the definition of the relation $F_\alpha(x) = y$, notice that the proofs of the monotonicity properties and the bounds on the size of the tree only required elementary induction (transfinite induction is not used). Hence, these properties together with the natural defining axioms for $F_\alpha$ can be verified in EA. This yields a proof of Lemma 8. Here we just formally state the required properties of $F_\alpha$ formalizable in EA.

F1. $(\forall\beta \le x\ \neg\beta \prec \alpha) \to [F_\alpha(x) \simeq y \leftrightarrow y = 2^x_x + 1]$
F2. $F_\alpha(x) \simeq y \land \beta \le x \land \beta \prec \alpha \to \exists z \le y\ \exists u, v \le x\ F^{(v)}_\beta(u) \simeq z$
F3. $\forall\beta, u, v \le x\,(\beta \prec \alpha \to \exists y \le z\ F^{(v)}_\beta(u) \simeq y) \to \exists y \le z\ F_\alpha(x) \simeq y$.

Here, as usual, $F^{(x)}_\beta(x) \simeq y$ abbreviates the natural elementary formula expressing that the $x$-fold iterate of $F_\beta$ at $x$ equals $y$.
Appendix 2

One can roughly classify the existing definitions of proof-theoretic ordinals into two groups, which I call the definitions "from below" and "from above". Informally speaking, proof-theoretic ordinals defined from below measure the strength of the principles of certain complexity $\Gamma$ that are provable in a given theory $T$. In contrast, the ordinals defined from above measure the strength of certain characteristic unprovable principles of complexity $\Gamma$ for $T$. For example, $\mathrm{Con}(T)$ is such a characteristic principle of complexity $\Pi^0_1$.
The standard $\Pi^1_1$- and $\Pi^0_2$-ordinals are defined from below, and so are the $\Pi^0_n$-ordinals introduced in this paper. The notorious ordinal of the shortest natural primitive recursive well-ordering $\prec$ such that $\mathrm{TI}_{\mathrm{p.r.}}(\prec)$ proves $\mathrm{Con}(T)$ (apart from the already discussed feature of logical complexity mismatch) is a typical definition from above.
All the usual definitions of proof-theoretic ordinals can also be reformulated in the form "from above". Let a natural elementary well-ordering be fixed. For the case of $\Pi^0_n$-ordinals the corresponding approach would be to let

$$|T|^\vee_{\Pi^0_n} := \min\{\alpha : \mathrm{EA}^+ + \mathrm{RFN}_{\Pi_n}((\mathrm{EA})^n_\alpha) \vdash \mathrm{RFN}_{\Pi_n}(T)\}.$$

(Notice that for $n > 1$ the theory on the left hand side of $\vdash$ can be replaced by $(\mathrm{EA})^n_{\alpha+1}$.)
In a similar manner one can transform the definition of the $\Pi^0_2$-ordinal via the Fast Growing hierarchy into a definition "from above". The class of p.t.c.f. of $T$ has a natural indexing; e.g., we can take as indices of a function $f$ the pairs $\langle e, p\rangle$ such that $e$ is the usual Kleene index (= the code of a Turing machine) of $f$, and $p$ is the code of a $T$-proof of the $\Pi^0_2$-sentence expressing the totality of the function $\{e\}$. With this natural indexing in mind one can write out a formula defining the universal function $\varphi_T(e, x)$ for the class of unary functions in $\mathcal F(T)$. Then the $\Pi^0_2$-sentence expressing the totality of $\varphi_T$ would be the desired characteristic principle. (It is not difficult to show that the totality of $\varphi_T$ formalized in this way is $\mathrm{EA}^+$-equivalent to $\mathrm{RFN}_{\Pi_2}(T)$.) The $\Pi^0_2$-ordinal of $T$ can then be defined as follows:

$$|T|^\vee_{\Pi^0_2} := \min\{\alpha : \varphi_T \in \mathcal F_{\alpha+1}\}.$$
Notice that the proof-theoretic ordinals of $T$ defined "from above" not only depend on the externally taken set of theorems of $T$, but also on the way $T$ is formalized, that is, essentially on the provability predicate or the proof system for $T$. For example, in the above definition the universal function $\varphi_T(e, x)$ depends on the Gödel numbering of proofs in $T$. In practice, for most of the natural(ly formalized) theories the ordinals defined "from below" and those "from above" coincide:

Proposition 46 If $T$ is $\mathrm{EA}^+$-provably $\Pi^0_n$-regular and contains $\mathrm{EA}^+$, then $|T|^\vee_{\Pi^0_n} = |T|_{\Pi^0_n}$.
Proof Let $\alpha = |T|_{\Pi^0_n}$. By provable regularity, $\mathrm{EA}^+ + (\mathrm{EA})^n_{\alpha+1} \vdash \mathrm{RFN}_{\Pi_n}(T)$. On the other hand, by Gödel's theorem
The following example demonstrates that, nonetheless, there are reasonable (and naturally formalized) proof systems for which these ordinals are different, so sometimes the ordinal defined from above bears essential additional information.

Consider some standard formulation of PA; it has a natural provability predicate $\Box_{\mathrm{PA}}$. The system $\mathrm{PA}^*$ is obtained from PA by adding Parikh's inference rule:

$$\frac{\Box_{\mathrm{PA}}\,\varphi}{\varphi},$$

where $\varphi$ is any sentence. For reasons of semantical correctness, Parikh's rule is admissible in PA, so $\mathrm{PA}^*$ proves the same theorems as PA. However, as is well known, the equivalence of the two systems cannot be established within PA (otherwise, $\mathrm{PA}^*$ would have a speed-up over PA bounded by a p.t.c.f. in PA, which was disproved by Parikh [24]).⁹ Below we analyze the situation from the point of view of proof-theoretic ordinals.
Notice that $\mathrm{PA}^*$ is a reasonable proof system, and it has a natural $\Sigma_1$ provability predicate $\Box_{\mathrm{PA}^*}$. Lindström [21] proves the following relationship between the provability predicates of PA and $\mathrm{PA}^*$:

Lemma 47 $\mathrm{EA} \vdash \forall x\ \bigl(\Box_{\mathrm{PA}^*}(x) \leftrightarrow \exists n\ \Box_{\mathrm{PA}}(\Box^n_{\mathrm{PA}}(\dot x))\bigr)$, where $\Box^n_{\mathrm{PA}}$ means $n$ times iterated $\Box_{\mathrm{PA}}$.

Notation: The right hand side of the equivalence should be understood as the result of substituting in the external $\Box_{\mathrm{PA}}$ the elementary term for the function $\lambda n, x.\,\ulcorner\Box^n_{\mathrm{PA}}(\bar x)\urcorner$.

Proof (Sketch) The implication $(\leftarrow)$ holds, because $\mathrm{PA}^*$ is provably closed under Parikh's rule, that is,
Proof Statement (i) follows from the $\mathrm{EA}^+$-provable $\Pi^0_1$-regularity of PA.

⁹ An even simpler argument: otherwise one can derive from $\Box_{\mathrm{PA}}\Box_{\mathrm{PA}}\bot$ the formulas $\Box_{\mathrm{PA}^*}\bot$ and $\Box_{\mathrm{PA}}\bot$, which yields $\mathrm{PA} \vdash \Box_{\mathrm{PA}}\Box_{\mathrm{PA}}\bot \to \Box_{\mathrm{PA}}\bot$, contradicting Löb's theorem.
$$(\mathrm{PA})_\omega \equiv_{\Pi_1} (\mathrm{EA}')_\omega.$$
Appendix 3

Here we discuss the role of the metatheory $\mathrm{EA}^+$ that was taken as basic in this paper. On the one hand, it is the simplest choice, and if one is interested in the analysis of strong systems, there is no reason to worry about it. Yet, if one wants to get meaningful ordinal assignments for theories not containing $\mathrm{EA}^+$, such as $\mathrm{EA} + I\Pi^-_1$ or $\mathrm{EA} + \mathrm{Con}(I\Sigma_1)$, the problem of weakening the metatheory has to be addressed. For example, somewhat contrary to intuition, it can be shown (see below) that $\mathrm{EA} + \mathrm{Con}(I\Sigma_1)$ is not a $\Pi^0_1$-regular theory in the usual sense.

These problems can be handled if one reformulates the hierarchies of iterated consistency assertions using the notion of cut-free provability and formalizes Schmerl's formulas in EA using cut-free conservativity. Over $\mathrm{EA}^+$ these notions provably coincide with the usual ones, so they can be considered as reasonable generalizations of the usual notions in the context of the weak arithmetic EA. The idea of using cut-free provability predicates for this kind of problem comes from Wilkie and Paris [39]. Below we briefly sketch this approach and consider some typical examples.
A formula $\varphi$ is cut-free provable in a theory $T$ (denoted $T \vdash_{cf} \varphi$), if there is a finite set $T_0$ of (closed) axioms of $T$ such that the sequent $\neg T_0, \varphi$ has a cut-free proof in predicate logic. Similarly, $\varphi$ is rank $k$ provable if, for some finite $T_0 \subseteq T$, the sequent $\neg T_0, \varphi$ has a proof with the ranks of all cut formulas bounded by $k$.
satisfies the EA-provable $\Sigma_1$-completeness and has the usual provability logic; this essentially follows from the equivalence of the bounded-cut-rank and the cut-free provability predicates in EA.

Visser [38], building on the work of H. Friedman and P. Pudlák, established a remarkable relationship between the predicates of ordinary and cut-free provability: if $T$ is a finite theory, then¹⁰

$$\mathrm{EA} \vdash \Box_T\,\varphi \leftrightarrow \Box^{cf}_T\Box^{cf}_T\,\varphi.$$

¹⁰ A. Visser works in a relational language and uses efficient numerals, but this does not seem to be essential for the general result over EA.
Proof We only show the implication $(\to)$; the opposite one is obvious. For any $\varphi(x) \in \Pi_n$ the formula $\Box_T\,\varphi(\dot x)$ implies $\Box^{cf}_T\Box^{cf}_T\,\varphi(\dot x)$. Applying $\mathrm{RFN}^{cf}_{\Pi_n}(T)$ twice yields $\varphi(x)$.
This equivalence carries over to the iterated principles. Let $(T)^{n,cf}_\alpha$ denote the progression based on iteration of $\mathrm{RFN}^{cf}_{\Pi_n}$ over $T$. Using the fact that for successor ordinals $\alpha$ the theories $(T)^{n,cf}_\alpha$ are finitely axiomatizable, we obtain by reflexive induction in EA, using Proposition 51 for the induction step:

Proposition 52 For any $n > 1$, provably in EA,

$$\forall\alpha\ \bigl((T)^{n,cf}_\alpha \equiv (T)^n_\alpha\bigr).$$
We say that $T$ is cut-free $\Pi_n$-conservative over $U$ if, for every $\varphi \in \Pi_n$, $T \vdash_{cf} \varphi$ implies $U \vdash_{cf} \varphi$. Let $T \equiv^{cf}_{\Pi_n} U$ denote a natural formalization in EA of the mutual cut-free $\Pi_n$-conservativity of $T$ and $U$. Externally, $\equiv^{cf}_{\Pi_n}$ is the same as $\equiv_{\Pi_n}$, so the difference between the two notions only makes sense in formalized contexts.
Analysis of the proof of Schmerl's formula reveals that we deal with an elementary transformation of a cut-free derivation into a derivation of bounded cut-rank. To see this, the reader is invited to check the ingredient proofs of Theorem 2 and Propositions 19 and 25. All these elementary proof transformations are verifiable in EA, which yields the following formalized variant of Schmerl's formula (we leave out all the details).

Proposition 53 For all $n \ge 1$, if $T$ is an elementary presented $\Pi_{n+1}$-axiomatized extension of EA, the following holds provably in EA:

$$\forall\alpha \succeq 1\ \bigl((T)^{n+m}_\alpha \equiv^{cf}_{\Pi_n} (T)^n_{\omega_m(\alpha)}\bigr).$$
We notice that this relationship holds for the ordinary as well as for the cut-free reflection principles, because, even for $n = 0$, the ordinal on the right hand side of the equivalence is a limit (if $m > 0$).
Now we consider a few examples. The following proposition shows that the theory $\mathrm{EA} + I\Pi^-_1$ is both $\Pi^0_1$-cf-regular and $\Pi^0_1$-regular with the ordinal $\omega$:

$$\mathrm{EA} + I\Pi^-_1 \equiv_{\Pi_1} \mathrm{Con}^{cf}(\mathrm{EA})_\omega \equiv_{\Pi_1} \mathrm{EA}_\omega.$$
Proof The logics of the ordinary and the cut-free provability for EA coincide, so by the usual proof the schema of local reflection w.r.t. cut-free provability is $\Pi_1$-conservative over $\mathrm{Con}^{cf}(\mathrm{EA})_\omega$. But the former contains $\mathrm{EA} + I\Pi^-_1$ by the
$$\exists n\ \bigl(\mathrm{EA} \vdash_{cf} \Box^{cf}_{\mathrm{EA}_{\omega_{n+1}}}\pi\bigr),$$
$$\exists n\ \bigl(\mathrm{EA}' \vdash \Box^{cf}_{\mathrm{EA}_{\omega_{n+1}}}\pi\bigr). \qquad (15)$$
On the other hand, we notice that (provably in EA) for every fixed $\beta \succ 0$ and $\pi \in \Pi_1$,

$$\mathrm{EA}' \vdash \Box^{cf}_{\mathrm{EA}_\beta}\,\pi \to \pi,$$

by the cut-free version of the $\Sigma_1$-completeness principle, and applying this to (15) yields $\mathrm{EA}' \vdash \pi$.
Corollary 57 $|\mathrm{EA} + \mathrm{Con}(\mathrm{PA})|_{\Pi^0_1} = \varepsilon_0 + 1$.
Acknowledgements The bulk of this paper was written during my stay in 1998–1999 as
Alexander von Humboldt fellow at the Institute for Mathematical Logic of the University of
Münster. Discussions with and encouragement from W. Pohlers, A. Weiermann, W. Burr, and M. Möllerfeld have very much influenced both the ideological and the technical sides of this work. I
would also like to express my cordial thanks to W. and R. Pohlers, J. and D. Diller, H. Brunstering,
M. Pfeifer, W. Burr, A. Beckmann, B. Blankertz, I. Lepper, and K. Wehmeier for their friendly
support during my stay in Münster.
Supported by Alexander von Humboldt Foundation, INTAS grant 96-753, and RFBR grants
98-01-00282 and 15-01-09218.
References
1. L.D. Beklemishev, Remarks on Magari algebras of PA and $I\Delta_0 + \mathrm{EXP}$, in Logic and Algebra, ed. by P. Agliano, A. Ursini (Marcel Dekker, New York, 1996), pp. 317–325
2. L.D. Beklemishev, Induction rules, reflection principles, and provably recursive functions.
Ann. Pure Appl. Log. 85, 193–242 (1997)
3. L.D. Beklemishev, Notes on local reflection principles. Theoria 63(3), 139–146 (1997)
4. L.D. Beklemishev, A proof-theoretic analysis of collection. Arch. Math. Log. 37, 275–296
(1998)
5. L.D. Beklemishev, Parameter free induction and provably total computable functions. Theor.
Comput. Sci. 224(1–2), 13–33 (1999)
6. L.D. Beklemishev, Proof-theoretic analysis by iterated reflection. Arch. Math. Log. 42(6),
515–552 (2003)
7. L.D. Beklemishev, Provability Algebras and Proof-Theoretic Ordinals, I. Ann. Pure Appl. Log.
128, 103–124 (2004)
8. S. Feferman, Arithmetization of metamathematics in a general setting. Fundam. Math. 49,
35–92 (1960)
9. S. Feferman, Transfinite recursive progressions of axiomatic theories. J. Symb. Log. 27, 259–
316 (1962)
10. S. Feferman, Systems of predicative analysis. J. Symbol. Log. 29, 1–30 (1964)
11. S. Feferman, Turing in the land of O(z), in The Universal Turing Machine: A Half-Century Survey, ed. by R. Herken. Series Computerkultur, vol. 2 (Springer, Vienna, 1995), pp. 103–134
12. H. Gaifman, C. Dimitracopoulos, Fragments of Peano’s arithmetic and the MDRP theorem,
in Logic and Algorithmic (Zurich, 1980). Monograph. Enseign. Math., vol. 30 (University of
Genève, Genève, 1982), pp. 187–206
13. S. Goryachev, On interpretability of some extensions of arithmetic. Mat. Zametki 40, 561–572
(1986) [In Russian. English translation in Math. Notes, 40]
14. P. Hájek, P. Pudlák, Metamathematics of First Order Arithmetic (Springer, Berlin, 1993)
15. R. Kaye, J. Paris, C. Dimitracopoulos, On parameter free induction schemas. J. Symb. Log.
53(4), 1082–1097 (1988)
16. G. Kreisel, Ordinal logics and the characterization of informal concepts of proof, in Proceed-
ings of International Congress of Mathematicians (Edinburgh, 1958), pp. 289–299
17. G. Kreisel, Principles of proof and ordinals implicit in given concepts, in Intuitionism and Proof Theory, ed. by A. Kino, J. Myhill, R.E. Vesley (North-Holland, Amsterdam, 1970), pp. 489–516
18. G. Kreisel, Wie die Beweistheorie zu ihren Ordinalzahlen kam und kommt. Jahresbericht der
Deutschen Mathematiker-Vereinigung 78(4), 177–223 (1977)
19. G. Kreisel, A. Lévy, Reflection principles and their use for establishing the complexity of
axiomatic systems. Zeitschrift f. math. Logik und Grundlagen d. Math. 14, 97–142 (1968)
20. D. Leivant, The optimality of induction as an axiomatization of arithmetic. J. Symb. Log. 48,
182–184 (1983)
21. P. Lindström, The modal logic of Parikh provability. Technical Report, Filosofiska Medde-
landen, Gröna Serien 5, Univ. Göteborg (1994)
22. M. Möllerfeld, Zur Rekursion längs fundierten Relationen und Hauptfolgen (Diplomarbeit,
Institut für Mathematische Logik, Westf. Wilhelms-Universität, Münster, 1996)
23. H. Ono, Reflection principles in fragments of Peano Arithmetic. Zeitschrift f. math. Logik und
Grundlagen d. Math. 33(4), 317–333 (1987)
24. R. Parikh, Existence and feasibility in arithmetic. J. Symb. Log. 36, 494–508 (1971)
25. C. Parsons, On a number-theoretic choice schema and its relation to induction, in Intuitionism
and Proof Theory, ed. by A. Kino, J. Myhill, R.E. Vessley (North Holland, Amsterdam, 1970),
pp. 459–473
26. C. Parsons, On n-quantifier induction. J. Symb. Log. 37(3), 466–482 (1972)
27. W. Pohlers, A short course in ordinal analysis, in Proof Theory, Complexity, Logic, ed. by
A. Axcel, S. Wainer (Oxford University Press, Oxford, 1993), pp. 867–896
28. W. Pohlers, Subsystems of set theory and second order number theory, in Handbook of Proof
Theory, ed. by S.R. Buss (Elsevier/North-Holland, Amsterdam, 1998), pp. 210–335
29. W. Pohlers, Proof Theory. The First Step into Impredicativity (Springer, Berlin, 2009)
30. M. Rathjen, Turing's "oracle" in proof theory, in Alan Turing: His Work and Impact, ed. by S.B. Cooper, J. van Leeuwen (Elsevier, Amsterdam, 2013), pp. 198–201
31. H.E. Rose, Subrecursion: Functions and Hierarchies (Clarendon Press, Oxford, 1984)
32. U.R. Schmerl, A fine structure generated by reflection formulas over primitive recursive
arithmetic, in Logic Colloquium’78, ed. by M. Boffa, D. van Dalen, K. McAloon (North
Holland, Amsterdam, 1979), pp. 335–350
33. K. Schütte, Eine Grenze für die Beweisbarkeit der transfiniten Induktion in der verzweigten
Typenlogik. Arch. Math. Log. 7, 45–60 (1965)
34. K. Schütte, Predicative well-orderings, in Formal Systems and Recursive Functions, ed. by
J.N. Crossley, M.A.E. Dummet. Studies in Logic and the Foundations of Mathematics (North-
Holland, Amsterdam), pp. 280–303 (1965)
35. C. Smoryński, The incompleteness theorems, in Handbook of Mathematical Logic, ed. by
J. Barwise (North Holland, Amsterdam, 1977), pp. 821–865
36. R. Sommer, Transfinite induction within Peano arithmetic. Ann. Pure Appl. Log. 76(3), 231–
289 (1995)
37. A.M. Turing, Systems of logic based on ordinals. Proc. Lond. Math. Soc. 45(2), 161–228 (1939)
38. A. Visser, Interpretability logic, in Mathematical Logic, ed. by P.P. Petkov (Plenum Press,
New York, 1990), pp. 175–208
39. A. Wilkie, J. Paris, On the scheme of induction for bounded arithmetic formulas. Ann. Pure
Appl. Log. 35, 261–302 (1987)
Part III
Philosophical Reflections
Alan Turing and the Foundation of Computer
Science
Juraj Hromkovič
J. Hromkovič
Department of Computer Science, ETH Zurich, Zurich, Switzerland
e-mail: juraj.hromkovic@inf.ethz.ch
This seems to be the same as what one observes in the development of any natural language. What is the difference? The current language of mathematics is exact in the following sense: it is so accurate that it cannot be more accurate, provided one accepts its axioms, which are the definitions of the basic notions. If one introduces a new word
(notion) to the language of mathematics, then one must provide an exact definition
of the meaning of this word. The definition has to be so precise that everybody
understanding the language of mathematics interprets the meaning of the defined
notion in exactly the same way and can unambiguously decide for every object
whether it satisfies this definition or not. Reaching this high degree of precision was a long process that started with the birth of mathematics. One often forgets that the search for the precise meaning of the basic terms of fundamental concepts is at least as important as proving new theorems, and that, in several cases, it took much more effort, over periods of several hundred years. Mathematics became a language
that avoids any kind of misinterpretation if one understands it properly. Because of that, one can measure a crucial characteristic of the stage of development of a scientific discipline by its ability to use the language of mathematics in its research and in expressing its discoveries.
For instance, numbers as an abstraction were used to express and compare sizes and amounts for all kinds of measurements. The concept of number and of calculating with numbers was considered so important that the Pythagoreans even believed that the whole world and all its principles could be explained using natural numbers only. Their dream was definitively shattered when the number $\sqrt 2$ was constructed geometrically and shown not to be expressible by any finite arithmetic calculation over the natural numbers (that is, as an arithmetic formula over the natural numbers). Equations were introduced to express relationships between different quantities, so that unknown quantities could be estimated, without measurement, by calculating them from the known ones. Euclidean
we live and move, and geometry became crucial for the measurement of areas, for
architecture, for building houses, bridges, monuments etc. Each new concept in
mathematics offered a new notion (word) for the language of mathematics. And any
new word increased the expressive power of mathematics and made mathematics
more powerful as a set of instruments for discovering our world.
The crucial point in the development of mathematics was, and is, its trustworthiness, which is absolute if one accepts the axioms of mathematics. If one is able to express a piece of reality correctly in the language of mathematics, that is, in a "mathematical model," then all products of calculations and of research work on this model inside mathematics yield true statements about the modeled reality. This unlimited trustworthiness makes mathematics so valuable and, in fact, irreplaceable for science.
Leibniz [7, 8] was the first to formulate the role of mathematics in the following way: mathematics offers an instrument for automating the intellectual work of humans. One expresses a part of reality as a mathematical model, and then one calculates using arithmetic. One does not need to argue anymore that each step of the calculation correctly represents the objects or relationships under investigation. One simply calculates and takes the result of this calculation as a truth about the modeled reality.
After logic as a formal system of reasoning was developed, it became clear that
each proof (formal argumentation) in mathematics is verifiable and this verification
process can even be automatized. Let us explain this more carefully in what follows.
The end of the nineteenth century and the beginning of the twentieth century were the time of the industrial revolution. Engineers automatized a big portion of human physical work and made humans partially free from monotonous physical activities, and so freer for intellectual growth. Life improved a lot during this time, and not only society but also scientists became very optimistic about what could be reached and automatized. One of the leading experts in mathematics at that time was Hilbert [4]. He believed that the search for a solution could be automatized for any kind of problem. His dream was that the hardest and most intellectual part of the work of mathematicians, namely creating proofs of theorems, could be automatized. But the notion of "automation" developed in the meantime had a more precise meaning than in the dream of Leibniz. Let us describe this more carefully.
We start with the notion of a problem. A problem is a collection of (usually infinitely many) problem instances that have something in common.
Following the statement of the Incompleteness Theorem, it became clear that there
does not exist any method for proving all valid mathematical claims, because, for
some claims, no proofs exist. But the result of Gödel left a fundamental problem
open. Does there exist a method for calculating proofs of provable claims? Do
there exist methods for solving concrete problems for which up to now no method
was discovered? At this point, we have to repeat that a “mathematical” problem
is a collection of infinitely many instances of this problem. To provide a solution
method means to find a finite description how to proceed in order to solve any of the
infinitely many problem instances. In other words, discovering a method for solving
a problem means to be able to reduce the infinite diversity of this problem to a finite
size given by the description of the method.
Alan Turing wanted to show that there exist problems whose infinite diversity, given by their infinitely many problem instances, cannot be reduced to any finite size. But the question is: "How does one prove a result like that? How does one prove the non-existence of a solution method (an algorithm) for a concrete problem?" Up to that time, nobody needed a formal, mathematical definition of the notion of a "method." A good, intuitive understanding of the term "method" for a problem, as a well-understandable description of how to work in order to solve any instance of the given problem, was sufficient. However, in order to try to prove the non-existence of a
method for a problem, one has to know exactly what a method is. You cannot prove that something does not exist if you do not know precisely what this something is. Therefore, there was a need to offer an exact, mathematical definition of the notion of a method. This means transferring our intuition about the meaning of the word "method" into an exact description in the language of mathematics. And this is nothing else than an attempt to extend current mathematics by a new axiom.
To formalize the meaning of the word “mathematical method” (called an
algorithm by Hilbert) was the main contribution of Alan Turing. For this new
definition, one prefers to use the new word “algorithm” instead of method in order
to distinguish it from the broad use of the term “method” in our natural language.
With the new concept of algorithm included in the language of mathematics, one is able to investigate the limits of automation. If a problem admits an algorithm, then solving it can be automatized. If there does not exist any algorithm for the given problem, then the problem is considered not to be automatically solvable. This new axiom of mathematics made it possible to prove, for many problems, that they are not algorithmically solvable, and a very successful investigation of the border between algorithmic solvability and algorithmic unsolvability was initiated. An important consequence of the study of the limits of algorithmic solvability (automation) is that it is definitely not sufficient to be able to express a real-world problem in terms of mathematics. If the problem is not algorithmically solvable, then it may still be an intellectual challenge to try to solve particular instances of it.
To create an axiom is usually a long process, and to get a new axiom accepted may take even more time. As already mentioned above, for the axiomatic definition of infinity people needed 2000 years, for the concept of probability approximately 300 years, for correct reasoning in the form of a formal logic more than 2000 years, and we do not know how many years were needed for an axiomatic definition of geometry. The six years that started in 1930 with the Incompleteness Theorem of Gödel and finished in 1936 [9] with the presentation of the so-called "Turing machine" are in fact a very short time. More time was needed to accept Turing's definition of an algorithm. This was done by trying to define the notion of an algorithm by many different formal approaches. The experience that all these approaches resulted in the same notion of algorithmic solvability (in the same limit on automation) led to the famous "Church-Turing Thesis," stating that all these formal definitions of an algorithm correspond to our intuition about the meaning of this notion.
What was the difficulty in formalizing the meaning of the word "method" ("algorithm")? One can always say that a description of a method must consist of simple actions that no one doubts can be directly executed. For instance, arithmetic operations or fundamental logical operations are definitely of this kind. But the question is whether we already have a set of instructions (activities) that is complete in the sense that each algorithm (method) can be expressed in terms of these instructions. If one misses a fundamental instruction that is not expressible by the instructions in our list, then all algorithms that need this instruction would not be recognized as algorithms. In this way, some algorithmically solvable problems would be considered to be unsolvable, and our concept of algorithms would be deficient.
Since $w_n$ is the word with the first proof in the canonical order of the fact "$K(w_n) \ge n$", for each $n$, $A_n$ generates $w_n$.

We observe that the algorithms $A_n$ are almost the same for all $n$; they differ only in the input $n$, which can be written using $\lceil\log_2(n+1)\rceil$ bits. The length of the rest can be bounded by $c$ bits, where $c$ is independent of $n$. Hence, we proved that

$$K(w_n) \le \lceil\log_2(n+1)\rceil + c$$

holds for all $n$. But this can be true only for finitely many $n$. The conclusion is that there exists an $n_0$ such that, for all $n \ge n_0$, there does not exist any proof that $K(x) \ge n$ for any word $x$.
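The following sketch restates the argument in program form; the enumerator `proofs()` and the parser `parse_complexity_claim()` are hypothetical placeholders for a canonical enumeration of the proofs of the theory and for recognizing statements of the form "K(w) >= n", not components of any real library:

```python
def A(n: int) -> str:
    # Search the canonical proof enumeration for the first proved statement
    # of the form "K(w) >= m" with m >= n, and output that word w.
    for statement, proof in proofs():                  # hypothetical enumerator
        claim = parse_complexity_claim(statement)      # hypothetical parser
        if claim is not None:
            w, m = claim
            if m >= n:
                return w

# The program A is fixed; only the input n varies.  Hence A(n) is described by
# c + ceil(log2(n + 1)) bits, so K(A(n)) <= log2(n + 1) + c.  For large n this
# contradicts the proved fact K(A(n)) >= n.
```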
5 Conclusion

Alan Turing enriched mathematics by a new axiom that increased the expressive power as well as the argumentative power of mathematics. Introducing the concept of algorithms made it possible to study the limits of the automation of intellectual work. In this way, a new scientific discipline, called computer science or informatics, was founded.
References
1. G.J. Chaitin, On the length of programs for computing finite binary sequences. J. Assoc. Comput. Mach. 13(4), 547–569 (1966)
2. G.J. Chaitin, On the length of programs for computing finite binary sequences, statistical
considerations. J. Assoc. Comput. Mach. 16(1), 145–159 (1969)
3. K. Gödel, Collected Works, vol. 1–5 (Oxford University Press, Oxford, 1986/2003)
4. D. Hilbert, Die logischen Grundlagen der Mathematik. Math. Ann. 88, 151–165 (1923)
5. J. Hromkovič, Theoretical Computer Science (Springer, New York, 2004)
6. A.N. Kolmogorov, Three approaches to the definition of the concept "quantity of information". Problemy Peredachi Informatsii 1(1), 3–11 (1965)
7. G.W. Leibniz, Discourse on Metaphysics, Principles on Nature and Grace. The Monadology
(1686)
8. G.W. Leibniz, Philosophical Essays, edited and translated by R. Ariew and D. Garber (Hackett
Publishing Company, Indianapolis, 1989)
9. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc. Lond. Math. Soc. 42(2), 230–265 (1936)
Proving Things About the Informal
Stewart Shapiro
Abstract There has been a bit of discussion, of late, as to whether Church’s thesis is
subject to proof. The purpose of this article is to help clarify the issue by discussing
just what it is to be a proof of something, and the relationship of the intuitive
notion of proof to its more formal explications. A central item on the agenda is
the epistemological role of proofs.
For a long time, it was widely, if not universally, believed that Church’s thesis is
not subject to rigorous mathematical demonstration or refutation. This goes back to
Alonzo Church’s [1] original proposal:
We now define the notion . . . of an effectively calculable function of positive integers by
identifying it with the notion of a recursive function of positive integers . . . This definition
is thought to be justified by the considerations which follow, so far as positive justification
can ever be obtained for the selection of a formal definition to correspond to an intuitive
one. (Church [1, §7])
The argument here is simple. Since Church's thesis is the identification of an intuitive notion (effective computability) with a precisely defined one (recursiveness), there is no sense to be made of establishing, mathematically, that the identification, or "definition", is correct.
S. Shapiro
The Ohio State University, Columbus, OH, USA
e-mail: shapiro.4@osu.edu
The founders, Church, Stephen Cole Kleene, Alan Turing, et al., were thinking of
computability in terms of what can be done by a suitably idealized human following
an algorithm (see, for example, [2]).¹ So the notion of computability seems to invoke psychological or other spatio-temporal notions. Emil Post [3] was explicit about this:
. . . for full generality a complete analysis would have to be made of all the possible ways
in which the human mind could set up finite processes.
. . . we have to do with a certain activity of the human mind as situated in the universe. As
activity, this logico-mathematical process has certain temporal properties; as situated in
the universe it has certain spatial properties.
So the theme underlying Church’s claim is that there just is no hope of rigorously
proving things when we are dealing with the messy world of space, time, and human
psychology.
In a retrospective article, Kleene [4, pp. 317–319] broached another, related matter:
Since our original notion of effective calculability of a function (or of effective decidability
of a predicate) is a somewhat vague intuitive one, [Church’s thesis] cannot be proved . . .
While we cannot prove Church's thesis, since its role is to delimit precisely an hitherto
vaguely conceived totality, we require evidence that it cannot conflict with the intuitive
notion which it is supposed to complete; i.e., we require evidence that every particular
function which our intuitive notion would authenticate as effectively calculable is . . .
recursive. The thesis may be considered a hypothesis about the intuitive notion of
effective calculability, or a mathematical definition of effective calculability; in the latter
case, the evidence is required to give the theory based on the definition the intended
significance.
¹ Some later writers invoke a different notion of computability, which relates to mechanical computing devices. Prima facie, this latter notion is an idealization from a physical notion. So, for present purposes, the issues are similar.
it must be false, strictly speaking. Vague things cannot match up, precisely, with
precise ones.
One can, of course, sharpen vague properties, for various purposes. Kleene
speaks of “completing” a vague notion, presumably replacing it with a sharp one.
As he notes, one can sometimes justify the chosen sharpening, or completion, at
least partially, but there is no question of proving, with mathematical resources, that
a given sharpening is the one and only correct one.
To make this case, however, we would need an argument that computability is
vague. Are there any borderline computable functions? Can we construct a sorites
series from a clearly computable function to a clearly non-computable function? I
find it hard to imagine what such a series would look like.
In conversations with Hao Wang, Kurt Gödel challenged the conclusion based on
the supposed vagueness of computability. Gödel invoked his famous (or infamous)
analogy between mathematical intuition and sense perception:
If we begin with a vague intuitive concept, how can we find a sharp concept to correspond
to it faithfully? The answer is that the sharp concept is there all along, only we did not
perceive it clearly at first. This is similar to our perception of an animal first far away and
then nearby. We had not perceived the sharp concept of mechanical procedures before
Turing, who brought us to the right perspective. And then we do perceive clearly the
sharp concept.
If there is nothing sharp to begin with, it is hard to understand how, in many cases, a
vague concept can uniquely determine a sharp one without even the slightest freedom
of choice.
“Trying to see (i.e., understand) a concept more clearly” is the correct way of expressing
the phenomenon vaguely described as “examining what we mean by a word” (quoted in
Wang [5, pp. 232 and 233], see also Wang [6, pp. 84–85])
Gödel’s claim seems to be that there is a precise property that somehow underlies
the intuitive notion of computability. One way to put it is that computability only
appears to be vague. The subsequent analysis of the intuitive notion convinced us,
or at least convinced him, that there is a unique, sharp notion that underlies the
intuitive one.
It is not clear how much “freedom of choice”, to use Gödel’s term, there was
concerning the development of computability—in the sharpening of (what appears
to be) a vague, intuitive notion. There are, after all, other notions of computability:
finite state computability, push-down computability, computability in polynomial
time, etc. For that matter, it is not clear what it is to have “freedom of choice” when
examining certain concepts—and possibly sharpening them.
Our main question stands. When put in Gödelian terms, how does one prove that
a proposed sharpening of an apparently vague notion is the uniquely correct one—
that it captures the precise concept that was “there all along”. Does mathematics
extend that far? How?
3 Theses
Even from the beginning, it was only a near consensus that Church’s thesis is not
subject to mathematical demonstration. In a letter to Kleene, Church noted that
Gödel opposed the prevailing views on this matter²:
In regard to Gödel and the notions of recursiveness and effective calculability, the history is
the following. In discussion with him . . . it developed that there is no good definition of
effective calculability. My proposal that lambda-definability be taken as a definition of
it he regarded as thoroughly unsatisfactory . . .
His [Gödel’s] only idea at the time was that it might be possible, in terms of effective
calculability as an undefined term, to state a set of axioms which would embody the
generally accepted properties of this notion, and to do something on that basis.
It is not easy to interpret these second-hand remarks, but Gödel seems to have
demanded a proof, or something like a proof, of Church’s thesis or of some related
² The letter, which was dated November 29, 1935, is quoted in Davis [12, p. 9].
thesis, contrary to the emerging consensus. His proposal seems to be that one begin
with a conceptual analysis of computability. This might suggest axioms for the
notion, axioms one might see to be self-evident. Then, perhaps, one could rigorously
prove that every computable function is recursive, and vice versa, on the basis
of those axioms. Something along these lines might count as a demonstration, or
proof, of Church’s thesis. As indicated in the passage from the Wang conversations,
Gödel may have held that something similar underlies, or could underlie, the other
“theses”, concerning velocity and area.
Gödel himself was soon convinced that (something equivalent to) Church’s thesis
is true. Turing’s landmark paper was a key event. In a 1946 address (published in
Davis [7, pp. 84–88]), Gödel remarked that
Tarski has stressed in his lecture (and I think justly) the great importance of the concept
of general recursiveness (or Turing’s computability). It seems to me that this importance is
largely due to the fact that with this concept one has for the first time succeeded in giving
an absolute definition of an interesting epistemological notion . . . (Gödel [8, p. 84])
Although it is not clear whether Gödel regarded Turing’s original [9] treatment of
computability as constituting, or perhaps suggesting, or leading to, a rigorous proof
of Church’s thesis—perhaps along the foregoing lines—it seems that Gödel was
satisfied with Turing’s treatment (see Kleene [10, 11], Davis [12], and my review
of those, Shapiro [13]), and convinced that Turing computability is the uniquely
correct explication of computability. In a 1964 postscript to Gödel [14], he wrote that
“Turing’s work gives an analysis of the concept of ‘mechanical procedure’ (alias
‘algorithm’ or ‘computation procedure’ or ‘finite combinatorial procedure’). This
concept is shown to be equivalent to that of a Turing machine”. The word “shown”
here might be read as “proved”, but perhaps we should not be so meticulous in
interpreting this remark.
In any case, the once near consensus that Church’s thesis is not subject to
mathematical proof or refutation was eventually challenged by prominent philosophers,
logicians, and historians. Robin Gandy [15] and Elliott Mendelson [16] claimed that
Church’s thesis is susceptible of rigorous, mathematical proof; and Gandy went so
far as to claim that Church’s thesis has in fact been proved. He cites Turing’s [9]
study of a human following an algorithm as the germ of the proof. Gandy referred
to (a version of) Church’s thesis as “Turing’s theorem”. Wilfried Sieg [17] reached a
more guarded, but similar conclusion: “Turing’s theorem” is the proposition that if f
is a number-theoretic function that can be computed by a being (or device) satisfying
certain determinacy and finiteness conditions, then f can be computed by a Turing
machine. Turing’s original paper [9] contains arguments that humans satisfy some of
these conditions, but, apparently, Sieg [17] considered this part of Turing’s text to
be less than a rigorous proof. Sieg [18, 19] claims to have completed the project of
rigorously demonstrating the equivalence of computability and recursiveness, via a
painstaking analysis of the notions involved:
My strategy . . . is to bypass theses altogether and avoid the fruitless discussion of their
(un-)provability. This can be achieved by conceptual analysis, i.e., by sharpening the
informal notion, formulating its general features axiomatically, and investigating the
axiomatic framework . . .
The detailed conceptual analysis of effective calculability yields rigorous characterizations
that dispense with theses, reveal human and machine calculability as axiomatically given
mathematical concepts, and allow their systematic reduction to Turing computability.
(Sieg [18, pp. 390 and 391])
Sieg’s use of the term “sharpening” raises the aforementioned matter of vagueness.
It is indeed intriguing that a conceptual analysis, followed by a rigorous
demonstration, can somehow clear away vagueness, showing—with full mathematical
rigor—that there was a unique concept there all along, as Gödel contended.
The Gödel/Mendelson/Gandy/Sieg position(s) generated responses (e.g., Folina
[20], Black [21]), and the debate continues. As the foregoing sketch indicates,
the issues engage some of the most fundamental questions in the philosophy of
mathematics. What is it to prove something? What counts as a proof? For that
matter, what is mathematics about?
Let us turn to some prominent formal explications of the notion of proof.
Considering those will, I think, shed light on our question concerning Church’s thesis.
From some perspectives, a proof is a sequence of statements that can be
“translated”, at least in principle, into a derivation in Zermelo-Fraenkel set theory
(ZF, or ZFC). Call this a ZFC-proof. The staunch naturalist Penelope Maddy [22, p.
26] sums up this foundational role for set theory: “if you want to know if a given
statement is provable or disprovable, you mean (ultimately), from the axioms of the
theory of sets.”
With this proposed explication in mind, we’d begin with a series of statements
in ordinary natural language, supplemented with mathematical terminology—say
the contents of the compelling Sieg analysis. We’d translate the statements into the
language of ZFC—with membership being the only non-logical term—and show
that the translation of our conclusion, Church’s thesis, follows from the axioms of
ZFC.
The issues would then focus on the translation from the informal language into
that of set theory. What should be preserved by a decent translation from an informal
mathematical text into the language of set theory? Despite the use of the word
“translation”, it is surely too much to demand that the ZFC-version has the same
meaning as the original text. How can we capture the meaning of, say, a statement
about bounded mechanical procedures, by using only the membership symbol?
Presumably, the central mathematical and logical relations should be preserved in
the translation. But what are those? Can we leave this matter at an intuitive level,
claiming that we know a good “translation” when we see it? Surely, that would beg
all of the questions.
Set theory provides a target for embedding just about any mathematical domain.
Moschovakis [23] continues:
. . . we . . . discover within the universe of sets faithful representations of all the
mathematical objects we need, and we will study set theory . . . as if all mathematical
objects were sets. The delicate problem in specific cases is to formulate precisely the
correct definition of “faithful representation” and to prove that one such exists.
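A standard example of such a faithful representation (ours, for illustration; Shapiro
cites none at this point) is the von Neumann coding of the natural numbers together
with the Kuratowski coding of ordered pairs:

\[ 0 = \emptyset, \qquad n + 1 = n \cup \{n\}, \qquad (a, b) = \{\{a\}, \{a, b\}\} \]

A function is then just a set of ordered pairs, so any talk of computable functions
must ultimately be spelled out in terms of membership alone, which is exactly where
the worry about what a decent “translation” preserves gets its grip.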
I propose to just take it for granted, for the sake of argument, that we do have a
good explication of suitably idealized deductive discourse. So let us just assume that
we do know what it is for a premise-conclusion pair to be valid, and that translation
into one of the standard formal deductive systems matches that. But, of course,
not all valid deductive argumentations constitute proofs. Which discourses establish
their conclusions with full mathematical rigor? With our proposed explication, our
question becomes: which formal derivations, in decent deductive systems, constitute
or correspond to proofs?
A common view is that a given formal derivation constitutes a proof only if all
of the (undischarged) premises are, or correspond to, or are translations of, either
self-evident or previously established propositions.3 Call such a derivation a formal
proof.
Presumably, a formal proof of Church’s thesis would begin with a standard
formalization of number-theory. One would add a predicate for computability, which
applies to number-theoretic functions (or sets of natural numbers), to the formal
language. And one would supply some axioms for this new predicate.
Attention would then focus on these purported new axioms. They should
either be unquestionably true—self-evident—or previously established. Then we
would produce a sequence of formulas, each of which would either be an axiom
(of number theory or of computability), or would follow from previous lines by the
unquestionably valid, formally correct rules of the deductive system. The last line
would be Church’s thesis.
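Schematically, and with purely illustrative names (Shapiro specifies no particular
formalism here), the envisioned derivation would have the shape

\[ \mathsf{PA} \cup \{A_1(C), \ldots, A_k(C)\} \vdash \forall f\, \bigl( C(f) \leftrightarrow \mathrm{Rec}(f) \bigr) \]

where PA is the formalized number theory, C is the new predicate “is computable”,
each A_i(C) is one of the purportedly self-evident axioms governing C, and Rec(f)
is the arithmetized statement that f is general recursive.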
This model seems to fit Gödel’s early suggestion to Church that logicians should
try to “state a set of axioms which would embody the generally accepted properties
of [effective calculability] and . . . do something on that basis”. The axioms should
constitute intuitively evident truths about computability; the “something” that we
would do “on that basis” would be to produce a formalized, or at least formalizable
derivation. The model also seems to fit the deep analysis in Sieg [18, 19]. Arguably,
Sieg’s conceptual analysis does give us evident truths about computability, perhaps
even self-evident truths. It would be no harder to formalize Sieg’s text than it
would be to formalize any other sufficiently rigorous informal mathematical argument.
On the present explication, whether an unformalized text constitutes a proof
presumably depends on how close the text is to a formalization, and how plausible
the resulting formalized axioms and premises are, with respect to the intuitive
notions invoked in it. This perspective might allow one to think of Turing’s text as
a formalizable proof—or the germ of one—despite the fact that some people were
not convinced, and despite the fact that most mathematicians did not, and many
still do not, recognize it as a proof or a proof-germ. The uncertainty is charged to
the connection between an informal text in ordinary language and a formalization
thereof.
3
Clearly, this is not a sufficient condition. A one or two line argument whose sole premise and
conclusion is a previously established proposition does not constitute a proof of that proposition.
This example, and thousands of others like it, raises a general question of how
we tell whether a given piece of live mathematical reasoning corresponds to a
given actual or envisioned formal proof. Clearly, there can be some slippage here:
such is informal reasoning, and such is vagueness. How does one guarantee that
the stated axioms or premises of the formal proof are in fact necessary for the
intuitive, pre-theoretic notions invoked in the informal text? That is, how do we
assure ourselves that the formalization is faithful? This question cannot be settled
by a formal derivation. That would start a regress, at least potentially. We would
push our problem to the axioms or premises of that derivation. Moreover, any
formalization of a real-life argument reveals missing steps, or gaps, and plugging
those would sometimes require theses much like Church’s thesis.
In a collection of notes entitled “What does a mathematical proof prove?”
(published posthumously in [24, pp. 61–69]), Imre Lakatos makes a distinction
between the pre-formal development, the formal development, and the post-formal
development of a branch of mathematics. Present concern is with the last of these.
Lakatos observes that even after a branch of mathematics has been successfully
formalized, there are residual questions concerning the relationship between the
formal deductive system (or the definitions in ZFC) and the original, intuitive
mathematical ideas. How can we be sure that the formal system accurately reflects
the original mathematical ideas? These questions cannot be settled with a derivation
in a further formal deductive system, not without begging the question or starting a
regress—there would be new, but similar, questions concerning the new deductive
system.
I take it as safe to claim that if there is to be a mathematical proof of Church’s
thesis, the proof cannot be fully captured with either a formal proof or a
ZFC-proof. There would be an intuitive or perhaps philosophical residue
concerning the adequacy of the formal or ZFC surrogates to the original, pre-
theoretic notion of computability. So if one identifies mathematical proof with
formal proof or with ZFC-proof, then one can invoke modus tollens and accept the
conclusion that Church’s thesis is not subject to mathematical proof. But then there
would not have been many proofs before the advent of formal deductive systems or
the codification of ZFC. It is more natural, however, to hold that (1) Church’s thesis
is not entirely a formal or ZFC matter, but (2) this does not make Church’s thesis
any less mathematical, nor does it rule out a proof of Church’s thesis.
6 Epistemology
The issue would turn on the status of either the set-theoretic definitions or the
undischarged premises of the formal (or formalizable) derivation. What epistemic
property must they have for the discourse to constitute a proof of its conclusion?
Clearly, we cannot prove everything. What is required of the principles that, in any
given derivation, are not proved?
There is a longstanding view that results from mathematics are absolutely certain.
For some theorists, this is tacit; for others it is quite explicit. The idea is that for
mathematics at least, humans are capable of proceeding, and should proceed, by
applying infallible methods. We begin with self-evident truths, called “axioms”,
and proceed by self-evidently valid steps, to all of our mathematical knowledge.
In practice (or performance) we invariably fall short of this, due to slips of the pen,
faulty memory, and the practical need to take shortcuts, but in some sense we are
indeed capable of error-free mathematics, and, moreover, all of the mathematics
we have, or at least all of the correct mathematics we have, could be reproduced,
in principle, via rigorous deductions from self-evident axioms, either in ZFC or in
some other formalized theory.
Lakatos [24] calls this the “Euclidean” model for mathematical knowledge,
but it might be better to steer away from any exegetical assumptions concerning
the historical Euclid. Call it the foundationalist model, since it fits the mold of
foundationalist epistemology. Indeed, mathematics, so conceived, is probably the
model for foundationalism in epistemology, and for traditional rationalism.
In the Gibbs lecture, Gödel [25, p. 305] comes close to endorsing a
foundationalist model for mathematics:
[The incompletability of mathematics] is encountered in its simplest form when the
axiomatic method is applied, not to some hypothetico-deductive system such as geometry
(where the mathematician can assert only the conditional truth of the theorems), but to
mathematics proper, that is, to the body of those mathematical propositions which hold in an
absolute sense, without any further hypothesis . . . [T]he task of axiomatizing mathematics
proper differs from the usual conception of axiomatics insofar as the axioms are not
arbitrary, but must be correct mathematical propositions, and moreover, evident without
proof.
Roger Penrose [26, Chap. 3] also seems to endorse something in the neighborhood
of the foundationalist model, with his central notion of “unassailable knowledge”.
He admits that even ideal subjects may be subject to error, but he insists that errors
are “correctable”, at least for anything we might call mathematics.
Although, as noted, the foundationalist model has advocates today, it is under
serious challenge, partly due to the actual development of mathematics (see Shapiro
[27]). Nowadays, the standard way to establish new theorems about, say, the natural
numbers, is not to derive them from self-evident truths about natural numbers—
such as the first-order Peano postulates—but rather to embed the natural numbers
in a richer structure, and take advantage of the more extensive expressive and
deductive resources. The spectacular resolution of Fermat’s last theorem, via the
theory of elliptic curves and modular forms, is not atypical in this respect.
I submit that as the embedding structures get more and more complex and
powerful—as our ability to postulate gets more and more bold—we eventually lose
our “unassailable” confidence that the background theories are themselves based on
self-evident axioms (if we ever had such confidence). Even the consistency of the
powerful theories can come into question. This seems to compromise the ideal of
proceeding by self-evidently valid steps.
4
Boolos [28] is a delightful challenge to some of the more counter-intuitive consequences of ZFC,
suggesting that we need not believe everything the theory tells us. If anything even remotely close
to the conclusions of that paper is correct, then, surely, the axioms of ZFC fall short of the ideal of
self-evident certainty.
concept of set. The history of our subject shows otherwise, assuming of course
that at least some of the original skeptics were sufficiently rational. As Friedrich
Waismann [29] once put it, “Mathematicians at first distrusting the new ideas . . .
then got used to them”. The axioms of set theory are accepted—with hindsight—in
large part because of the fruit that the theory produced in organizing and extending
mathematical knowledge. I’d go so far as to say that the axioms are evident only
with hindsight.
Of course, this is not the place to settle broad questions concerning mathematical
knowledge, but the general issues are centrally relevant to the matter at hand,
whether it makes sense to speak of a proof of Church’s thesis, and similar theses. I
see no reason to withhold the honorific of “proof” or even “rigorous proof” from, say,
the insightful development in Sieg [18, 19]. As noted, it is no more than an exercise
to formalize the deductive aspects of the text. And the undischarged premises—
say the principles of determinateness and finitude—are sufficiently evident, at least
at this point in history. What makes them evident, however, is not that they are
self-evident, the sort of thing that any rational thinker should accept upon properly
grasping the concept of computability. As with set theory, the early history of
Church’s thesis makes this hard to maintain. What makes the premises compelling
is the result of seventy-five years of sharpening, developing, and, most important,
using recursive function theory, Turing computability, and the like in our theorizing.
This is not to say that the proof begs any questions, although some of the
early critics of Church’s thesis would certainly see it that way. So would any
contemporary thinker who doubts the truth of Church’s thesis. That is the way
of deductive reasoning. If one wants to reject the conclusion of a valid argument,
one will find a premise or premises to reject, possibly claiming that the question
is begged. In general, and in some sense yet to be specified, when it comes to
deduction, one does not get out any more than one puts in.
I presume that contemporary defenders of the original claim that Church’s thesis
is not susceptible of proof would also claim that the question is begged in the
Sieg text. Otherwise, they would have to concede defeat (assuming that Sieg’s
argumentation is valid). The matter of whether a given argument begs the question
is itself context-sensitive, and is often indeterminate. It depends on what various
interlocutors are willing to take for granted.
I admit that, in the present intellectual climate, it is unlikely that there is anyone
who was convinced of the truth of Church’s thesis through, say, Sieg’s analysis and
argumentation—whatever its status. That is, I find it unlikely that there are theorists
who were, at first, neutral or skeptical of Church’s thesis, and were won over by
the Sieg texts. But I take this to be more a statement about the present intellectual
climate, and its history, than of the status of the argumentation. It is reasonably clear,
nowadays, what an author can take her interlocutors to grant concerning the notions
of computation, algorithm, and the like.
I submit, however, that Sieg’s discourse and, for that matter, Turing’s are no
more question-begging than any other deductive argumentation. The present theme
is more to highlight the holistic elements that go into the choice of premises, both
in deductions generally and in discourses that are rightly called “proofs”, at least in
certain intellectual contexts.
Acknowledgments This paper is a sequel to some early sections in Shapiro [30] and Shapiro
[31]. Thanks to the audience at the Conference on Church’s thesis, Trends in Logic conference, in
Krakow, Poland, in 2011, and the Conference on Turing, held in Zurich, in 2012.
References
1. A. Church, An unsolvable problem of elementary number theory. Am. J. Math. 58, 345–363
(1936); reprinted in [7], pp. 89–107
2. J. Copeland, The Church-Turing thesis. Stanf. Encycl. Philos. (1997) http://plato.stanford.edu/
entries/church-turing/
3. E. Post, Absolutely unsolvable problems and relatively undecidable propositions, in [7], pp.
338–433
4. S. Kleene, Introduction to Metamathematics (North Holland, Amsterdam, 1952)
5. H. Wang, A Logical Journey: From Gödel to Philosophy (The MIT Press, Cambridge, MA,
1996)
6. H. Wang, From Mathematics to Philosophy (Routledge and Kegan Paul, London, 1974)
7. M. Davis, The Undecidable (The Raven Press, Hewlett, NY, 1965)
8. K. Gödel, Remarks before the Princeton bicentennial conference on problems in mathematics
(1946); [7], pp. 84–88
9. A. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. 42, 230–265 (1936); reprinted in [7], pp. 116–153
10. S. Kleene, Origins of recursive function theory. Ann. Hist. Comput. 3(1), 52–67 (1981)
11. S. Kleene, Reflections on Church’s thesis. Notre Dame J. Formal Log. 28, 490–498 (1987)
12. M. Davis, Why Gödel didn’t have Church’s thesis. Inf. Control 54, 3–24
(1982)
13. S. Shapiro, Review of Kleene (1981), Davis (1982), and Kleene (1987). J. Symb. Log. 55,
348–350 (1990)
14. K. Gödel, On undecidable propositions of formal mathematical systems (1934); [7], pp. 39–74
15. R. Gandy, The confluence of ideas in 1936, in The Universal Turing Machine, ed. by R. Herken
(Oxford University Press, New York, 1988), pp. 55–111
16. E. Mendelson, Second thoughts about Church’s thesis and mathematical proofs. J. Philos. 87,
225–233 (1990)
17. W. Sieg, Mechanical procedures and mathematical experience, in Mathematics and Mind, ed.
by A. George (Oxford University Press, Oxford, 1994), pp. 71–140
18. W. Sieg, Calculations by man and machine: conceptual analysis, in Reflections on the
Foundations of Mathematics: Essays in Honor of Solomon Feferman, ed. by W. Sieg, R.
Sommer, C. Talcott (Association for Symbolic Logic, A. K. Peters, Ltd., Natick, MA, 2002),
pp. 390–409
19. W. Sieg, Calculations by man and machine: mathematical presentation, in In the Scope of
Logic, Methodology and Philosophy of Science 1, ed. by P. Gärdenfors, J Woleński, K. Kijania-
Placek (Kluwer Academic, Dordrecht, 2002), pp. 247–262
20. J. Folina, Church’s thesis: prelude to a proof. Philos. Math. 6 (3), 302–323 (1998)
21. R. Black, Proving Church’s thesis. Philos. Math. (III) 8, 244–258 (2000)
22. P. Maddy, Naturalism in Mathematics (Oxford University Press, Oxford, 1997)
23. Y. Moschovakis, Notes on Set Theory (Springer, New York, 1994)
24. I. Lakatos, Mathematics, Science and Epistemology, ed. by J. Worrall, G. Currie (Cambridge
University Press, Cambridge, 1978)
25. K. Gödel, Some basic theorems on the foundations of mathematics and their implications, in
Collected Works 3 (Oxford University Press, Oxford, 1951, 1995), pp. 304–323
26. R. Penrose, Shadows of the Mind: A Search for the Missing Science of Consciousness (Oxford
University Press, Oxford, 1994)
27. S. Shapiro, We hold these truths to be self evident: but what do we mean by that? Rev. Symb.
Log. 2, 175–207 (2009)
28. G. Boolos, Must we believe in set theory? in Logic, Logic, and Logic, ed. by G. Boolos
(Harvard University Press, Cambridge, MA, 1998), pp. 120–132
29. F. Waismann, Lectures on the Philosophy of Mathematics, edited and with an Introduction by
W. Grassl (Rodopi, Amsterdam, 1982)
30. S. Shapiro, Computability, proof, and open-texture, in Church’s Thesis After 70 Years, ed. by
A. Olszewski, J. Woleński, R. Janusz (Ontos, Frankfurt, 2006), pp. 420–455
31. S. Shapiro, The open-texture of computability, in Computability: Gödel, Turing, Church, and
Beyond, ed. by J. Copeland, C. Posy, O. Shagrir (The MIT Press, Cambridge, MA, 2013), pp.
153–181
Why Turing’s Thesis Is Not a Thesis
Robert I. Soare
Abstract In 1936 Alan Turing showed that any effectively calculable function is
computable by a Turing machine. Scholars at the time, such as Kurt Gödel and
Alonzo Church, regarded this as a convincing demonstration of this claim, not as a
mere hypothesis in need of continual reexamination and justification. In 1988 Robin
Gandy said that Turing’s analysis “proves a theorem.” However, Stephen C. Kleene
introduced the term “thesis” in 1943 and in his book in 1952. Since then it has been
known as “Turing’s Thesis.” Here we discuss whether it is a thesis, a definition, or
a theorem. This is important for determining what Turing actually accomplished.
1 Introduction
Charles Sanders Peirce in Ethics of Terminology [39, p. 129] wrote that science
advances upon precision of thought, and precision of thought depends upon
precision of terminology. He described the importance of language for science this
way.
the woof and warp of all thought and all research is symbols, and the life of thought and
science is the life inherent in symbols; so that it is wrong to say that a good language is
important to good thought, merely; for it is of the essence of it.
Next would come the consideration of the increasing value of precision of thought as it
advances.
thirdly, the progress of science cannot go far except by collaboration; or, to speak more
accurately, no mind can take one step without the aid of other minds.
with the dictionary and scientific use of the term at that time, but it became firmly
established anyway.
6. Church’s Thesis (CT), so named by Kleene [30], is Church’s assertion [2] that
a function is effectively calculable iff it is Herbrand-Gödel recursive. This is
extensionally equivalent to Turing’s Thesis 1.1 because the recursive functions
coincide formally with the Turing computable ones. However, because of our
precise use of terms and concepts, we wish to make an intensional distinction
between CT and TT because Church’s demonstration of CT was less convincing
than Turing’s demonstration of TT. Therefore, in this paper we do not identify
the two, and we do not use the term Church–Turing Thesis (CTT) introduced by
Kleene [30].
2 What is a Thesis?
The English term1 “thesis” comes from the Greek word θέσις, meaning “something
put forth.” In logic and rhetoric it refers to a “proposition laid down or stated,
especially as a theme to be discussed and proved, or to be maintained against
attack.”2 It can be a hypothesis presented without proof, or it can be an assertion
put forward with the intention of defending and debating it. For example, a PhD
thesis is a dissertation prepared by the candidate and defended by him before the
faculty in order to earn a PhD degree.
In music, a thesis is a downbeat anticipating an upbeat. In Greek philosophy
dialectic is a form of reasoning in which propositions (theses) are put forward,
followed by counter-propositions (antitheses), and a debate to resolve them. In
general, a thesis is something laid down which expects a response, often a counter
response.
The Harvard online dictionary says that a thesis is “not a topic; nor is it a fact;
nor is it an opinion.” A theorem such as the Gödel Completeness Theorem is not a
thesis. It is a fact with a proof in a formal axiomatic system, ZFC set theory.
However, we use the term “proof” in a more general sense in our everyday world
to evaluate what is true and what is not. For example, in a trial, a jury must decide
whether the prosecution has proved the defendant guilty “beyond a reasonable
doubt.” This requires the evaluation of evidence which cannot be strictly formulated
with well-formed formulas, but whose truth is often just as important in our daily
lives as that of formal mathematical statements. We develop a sense of judgment for
a demonstration and we depend upon evidence and logic to verify its validity. We
often refer informally to that verification as a “proof” of the assertion.
The conclusion is that a theorem or fact is a proposition which is laid down
and which may possibly need to be demonstrated, but about which there will be no
1
See the Oxford English Dictionary, the Websters International Dictionary, and Wikipedia.
2
Oxford English Dictionary.
debate. It will be accepted in the future without further question. A thesis is a weaker
proposition which invites debate, discussion, and possibly repeated verification.
Attaching the term “thesis” to such a proposition invites continual reexamination. It
signals to the reader that the proposition may not be completely valid, but rather
that it should continually be examined more critically.
This is not a question of mere semantics, but about what Turing actually
achieved. If we use the term “thesis” in connection with Turing’s work, then we
are continually suggesting some doubt about whether he really gave an authentic
characterization of the intuitively calculable functions. The central question of this
paper is to consider whether Turing proved his assertion beyond any reasonable
doubt or whether it is merely a thesis, in need of continual verification.
The concept of an effectively calculable function emerged during the 1920s and
1930s until the work of Turing [65] gave it a definitive meaning. Kleene wrote in
[33, p. 46] that the objective was to find a decision procedure for validity “of a given
logical expression by a finite number of operations” as stated in [25, pp. 72–73].
Hilbert characterized this as the fundamental problem of mathematical logic. Davis
in [8, p. 108] wrote, “This was because it seemed clear to Hilbert that with the
solution of this problem, the Entscheidungsproblem, it should be possible, at least
in principle, to settle all mathematical questions in a purely mechanical manner.”
Von Neumann doubted that such a procedure existed but had no idea how to prove
it when he wrote the following in [70, pp. 11–12].
. . . it appears that there is no way of finding the general criterion for deciding whether or
not a well-formed formula is provable. (We cannot at the moment establish this. Indeed, we
have no clue as to how such a proof of undecidability would go.) . . . the undecidability is
even the conditio sine qua non for the contemporary practice of mathematics, using as it
does heuristic methods to make any sense.
The very day on which the undecidability does not obtain anymore, mathematics as we
now understand it would cease to exist; it would be replaced by an absolutely mechanical
prescription (eine absolut mechanische Vorschrift), by means of which anyone could decide
the provability or unprovability of any sentence.
Thus, we have to take the position: it is generally undecidable, whether a given well-formed
formula is provable or not.
G.H. Hardy made similar comments quoted in [38, p. 85]. However, neither
Hardy, von Neumann, nor Gödel had a firm definition of the informal concept of
an effectively calculable function.
At the beginning of 1936 there was a stalemate in Princeton. Church [2] and Kleene
[27] believed in Church’s Thesis, but Gödel did not accept it, even though it was
based on his own definition [14] of recursive functions. Gödel was looking not
merely for a mathematical definition such as recursive functions or Turing machines,
but also for a convincing demonstration that these captured the intuitive concept of
effectively calculable.
In the Nachlass printed in [21, p. 166] Gödel wrote,
When I first published my paper about undecidable propositions the result could not be
pronounced in this generality, because for the notions of mechanical procedure and of
formal system no mathematically satisfactory definition had been given at that time. . . . The
essential point is to define what a procedure is.
Gödel believed that Turing had done so but he was not convinced by Church’s
argument.
Turing’s results in 1936 are often presented in the order Turing wrote them: first the
Turing machine, and second the definition of an effectively calculable function at the
end of his paper. However, just as remarkable is Turing’s definition of a function
calculated by an idealized human being in Sect. 9. Turing [65, p. 249] wrote as
follows:
No attempt has yet been made to show that the computable numbers include all numbers
which would naturally be regarded as computable. All arguments which can be given are
bound to be, fundamentally, appeals to intuitions, and for this reason rather unsatisfactory
mathematically. The real question at issue is ‘What are the possible processes which can be
carried out in computing a number?’ The arguments which I shall use are of three kinds.
(a) A direct appeal to intuition. (b) A proof of the equivalence of two definitions (in case
the new definition has a greater intuitive appeal). (c) Giving examples of large classes of
numbers which are computable.
Turing devoted his attention to (a) rather than to (b) or (c). We shall not repeat
Turing’s argument for (a), which was clearly presented and which has also been
given in Kleene [30], Sect. 70. The key point is that Turing broke up the steps of a
procedure into the smallest pieces which could not be further subdivided. Turing’s
argument was convincing to scholars in 1936 and to most scholars today. When one
goes through Turing’s analysis of a procedure one is left with something very close
to a Turing machine which is designed to carry out these atomic acts.
Gödel also wrote,
Turing’s work gives an analysis of the concept of mechanical procedure, . . . . This concept
is shown to be equivalent with that of a Turing machine.
In a number of books and articles for the Turing Centennial, and in earlier papers,
several prominent authors have made comments about the issues which we address
in our present paper. We now address some of these authors one by one. Each
provides one more piece in the spectrum of our entire analysis.
7.1 Gandy
Robin Gandy [11, 12] was one of the first to challenge Kleene’s claim that Turing’s
Thesis could not be proved. Indeed Gandy claimed that Turing [65] had already
proved it. Gandy analyzed Turing’s argument in detail and wrote [12, pp. 82–84]:
Turing’s analysis does much more than provide an argument for Turing’s Thesis, it proves
a theorem.
Turing’s analysis makes no reference whatsoever to calculating machines. Turing
machines appear as a result, a codification, of his analysis of calculations by humans.
7.2 Sieg
Wilfried Sieg, in [47, 48, 52] and elsewhere, analyzed Turing’s Thesis. He gave an
axiomatization for Turing’s computorable functions and proved that any function
satisfying the axioms is Turing computable. Sieg defined again [52, p. 189] his
notion of a computor, a human computing agent who proceeds mechanically. Sieg
wrote, “No appeal to a thesis is needed; rather, that appeal has been replaced by the
task of recognizing the correctness of the axioms for an intended notion.”
The footnote states, “Soare in his articles [55, 56] has justifiably made consid-
erable efforts to reconfigure the subject so as to emphasize its roots in computation
rather than recursion, . . . ”
Feferman continues:
In his influential book, Introduction to Metamathematics, Kleene [1952] baptized the
statement that every effectively calculable function is general recursive as Church’s Thesis.
He went on to baptize as Turing’s Thesis the statement that “every function which would
naturally be regarded as computable is computable . . . by one of his machines . . .”
Feferman’s account of the history closely agrees with ours given above.
Turing’s Thesis for concrete structures is very similar to the analysis we have
given, and this has been considered by Gandy, Sieg, Dershowitz and Gurevich, and
others. Feferman then moves to abstract structures which we omit here in order to
give a more convincing argument for the concrete case as in TT Thesis 1.1.
Feferman [10, Sect. 2] agrees with our analysis above of the origin of the terms
introduced by Kleene [30] and the ambiguity in Kleene’s naming of CTT. Feferman
writes at the end of Sect. 2:
As described in sec. 1 above, it was Kleene [1952] p. 300 et seq, who led one to talk of
Church’s Thesis, Turing’s Thesis, and then, ambiguously, of the Church–Turing Thesis for
the characterization through these equivalences of the effectively calculable functions. In
another influential text, Rogers [1967] pp. 18ff, took the argument by confluence as one of
the basic pins for CT and used that to justify informal proofs by Church’s Thesis.
Kripke’s main point [38] is to suggest another way of proving the much stronger
form of Turing’s Thesis 1.2 even for machine computable functions using the Gödel
Completeness Theorem. Kripke begins by stating what he calls “Hilbert’s Thesis”:
namely, that the steps of any mathematical argument can be given in a language based on
first order logic (with identity). The present argument can be regarded as either reducing
Church’s thesis to Hilbert’s thesis, or alternatively as simply pointing out a theorem on all
computations whose steps can be formalized in a first-order language.
Kripke concluded that his paper does not give an outright proof of Turing’s
Thesis from the Gödel Completeness Theorem but reduces one hypothesis to
another and directly relates the former to the latter. The only open part of this
intended proof is to demonstrate that any machine computation can be represented in
first order logic. In view of the questions surrounding a characterization of what is an
algorithm in Sect. 9 and elsewhere, this seems like a very interesting but formidable
program.
Of course, if successful, our present program would be a consequence if we agree
that Turing’s definition of a computorable function can be represented in first order
logic. We are proposing here, along with Gandy and Sieg, that Turing’s is clearly the
correct representation of the informal notion of effectively calculable, and therefore
Turing’s Thesis 1.1 has a proof parallel to (rather than as a consequence of) Gödel’s
Completeness Theorem. Nevertheless, we agree with Kripke that the connection is
very important.
Turing’s last published paper discussed puzzles. Turing wrote of his central assertion
(about a function being effectively calculable iff it is computable by a Turing
machine) that this assertion lies somewhere between a theorem and a definition.
In so far as we know a priori what is a puzzle and what is not, the statement is a theorem.
In so far as we do not know what puzzles are, the statement is a definition that tells us
something about what they are.
In any case, most scholars agree that it is not a thesis. It is not something laid
down for continual verification or debate.
Acknowledgements The present paper was partially supported by the Isaac Newton Institute of
the University of Cambridge during 2012 and partially by grant # 204186 Computability Theory
and Applications to Robert Soare funded by the Simons Foundation Program for Mathematics and
the Physical Sciences.
References
30. S.C. Kleene, Introduction to Metamathematics (Van Nostrand, New York, 1952). Ninth reprint
(Walters-Noordhoff Publishing Co./North-Holland, Groningen/Amsterdam, 1988)
31. S.C. Kleene, Mathematical Logic (Wiley, New York/London/Sydney, 1967)
32. S.C. Kleene, Origins of recursive function theory. Ann. Hist. Comput. 3, 52–67 (1981)
33. S.C. Kleene, The theory of recursive functions, approaching its centennial. Bull. Am. Math.
Soc. (n.s.) 5, 43–61 (1981)
34. S.C. Kleene, Algorithms in various contexts, in Proceedings Symposium on Algorithms
in Modern Mathematics and Computer Science (dedicated to Al-Khowarizimi) (Urgench,
Khorezm Region, Uzbek SSR, 1979) (Springer, Berlin/Heidelberg/New York, 1981)
35. S.C. Kleene, Reflections on Church’s thesis. Notre Dame J. Formal Log. 28, 490–498 (1987)
36. S.C. Kleene, Turing’s analysis of computability, and major applications of it, in [24, pp. 17–54]
37. S.C. Kleene, E.L. Post, The upper semi-lattice of degrees of recursive unsolvability. Ann. Math.
59, 379–407 (1954)
38. S. Kripke, The Church–Turing “Thesis” as a special corollary of Gödel’s completeness
theorem, in Computability: Gödel, Church, Turing, and Beyond, ed. by J. Copeland, C. Posy,
O. Shagrir (MIT Press, Cambridge, 2013)
39. C.S. Peirce, Book 2. Speculative grammar, in Collected Papers of Charles Sanders Peirce, ed.
by C. Hartshorne, P. Weiss. Elements of Logic, vol. 2 (The Belknap Press of Harvard University
Press, Cambridge/London, 1960)
40. E.L. Post, Finite combinatory processes—formulation I. J. Symb. Log. 1, 103–105 (1936).
Reprinted in [8, pp. 288–291]
41. E.L. Post, Absolutely Unsolvable Problems and Relatively Undecidable Propositions: Account
of an Anticipation (Submitted for publication in 1941). Printed in [8, pp. 340–433]
42. E.L. Post, Formal reductions of the general combinatorial decision problem. Am. J. Math. 65,
197–215 (1943)
43. E.L. Post, Recursively enumerable sets of positive integers and their decision problems. Bull.
Am. Math. Soc. 50, 284–316 (1944)
44. E.L. Post, Degrees of recursive unsolvability: preliminary report (abstract). Bull. Am. Math.
Soc. 54, 641–642 (1948)
45. H. Rogers Jr., Theory of Recursive Functions and Effective Computability (McGraw-Hill,
New York, 1967)
46. G.E. Sacks, Mathematical Logic in the 20th Century (Singapore University Press, Singapore,
2003). Series on 20th Century Mathematics, vol. 6 (World Scientific Publishing Co.,
Singapore/New Jersey/London/Hong Kong, 2003)
47. W. Sieg, Mechanical procedures and mathematical experience, in Mathematics and Mind, ed.
by A. George (Oxford University Press, Oxford, 1994), pp. 71–117
48. W. Sieg, Step by recursive step: Church’s analysis of effective calculability. Bull. Symb. Log.
3, 154–180 (1997)
49. W. Sieg, Gödel on computability. Philos. Math. 14, 189–207 (2006)
50. W. Sieg, Church without dogma—axioms for computability, in New Computational Paradigms,
ed. by B. Lowe, A. Sorbi, B. Cooper (Springer, New York, 2008), pp. 139–152
51. W. Sieg, On computability, in Handbook of the Philosophy of Mathematics, ed. by A.D. Irvine
(Elsevier, Amsterdam, 2009), pp. 535–630
52. W. Sieg, Gödel’s philosophical challenge [to Turing], in Computability: Gödel, Church, Turing,
and Beyond, ed. by J. Copeland, C. Posy, O. Shagrir (MIT Press, Cambridge, 2013), pp.
183–202
53. W. Sieg, J. Byrnes, K-graph machines: generalizing Turing’s machines and arguments. Preprint
(1995)
54. R.I. Soare, Recursively Enumerable Sets and Degrees: A Study of Computable Functions and
Computably Generated Sets (Springer, Heidelberg, 1987)
55. R.I. Soare, Computability and recursion. Bull. Symb. Log. 2, 284–321 (1996)
56. R.I. Soare, The history and concept of computability, in Handbook of Computability Theory,
ed. by E.R. Griffor (North-Holland, Amsterdam, 1999), pp. 3–36
57. R.I. Soare, Computability and incomputability, in Computation and Logic in the Real World,
ed. by S.B. Cooper, B. Löwe, A. Sorbi. Proceedings of the Third Conference on Computability
in Europe, CiE 2007, Siena, Italy, June 18–23, 2007. Lecture Notes in Computer Science, vol.
4497 (Springer, Berlin/Heidelberg, 2007)
58. R.I. Soare, Turing oracle machines, online computing, and three displacements in
computability theory. Ann. Pure Appl. Log. 160, 368–399 (2009)
59. R.I. Soare, An interview with Robert Soare: reflections on Alan Turing, in Crossroads of the
ACM, vol. 18, no. 3, pp. 15–17, (2012)
60. R.I. Soare, Formalism and intuition in computability theory. Philos. Trans. R. Soc. A 370,
3277–3304 (2012)
61. R.I. Soare, Turing and the art of classical computability, in Alan Turing—His Work and Impact,
ed. by B. Cooper, J. Leeuwen (Elsevier, New York, 2013)
62. R.I. Soare, Interactive computing and Turing-Post relativized computability, in Computability:
Gödel, Church, Turing, and Beyond, ed. by J. Copeland, C. Posy, O. Shagrir (MIT Press,
Cambridge, 2013)
63. R.I. Soare, Turing and the discovery of computability, in Turing’s Legacy: Developments from
Turing’s Ideas in Logic, ed. by R. Downey. Lecture Notes in Logic (Association for Symbolic
Logic/Cambridge University Press, Cambridge, 2013)
64. R.I. Soare, The Art of Turing Computability: Theory and Applications. Computability in
Europe Series (Springer, Heidelberg, 2014)
65. A.M. Turing, On computable numbers, with an application to the Entscheidungsproblem. Proc.
Lond. Math. Soc. Ser. 2 42(Parts 3 and 4), 230–265 (1936). Reprinted in [8], pp. 116–154
66. A.M. Turing, A correction. Proc. Lond. Math. Soc. 43, 544–546 (1937)
67. A.M. Turing, Computability and λ-definability. J. Symb. Log. 2, 153–163 (1937)
68. A.M. Turing, Systems of logic based on ordinals. Proc. Lond. Math. Soc. 45, 161–228 (1939).
Reprinted in [8, pp. 154–222]
69. J. van Heijenoort (ed.), From Frege to Gödel, A Sourcebook in Mathematical Logic, 1879–1931
(Harvard University Press, Cambridge, 1967)
70. J. von Neumann, Zur Hilbertschen Beweistheorie. Math. Z. 26, 1–46 (1927)
Incomputability Emergent, and Higher Type
Computation
S. Barry Cooper
Some have asked why neuroscience has not yet achieved results as spectacular as
those seen in molecular biology over the past four decades. Some have even asked
what is the neuroscientific equivalent of the discovery of DNA structure, and whether
or not a corresponding neuroscientific fact has been established. There is no such
single correspondence . . . the equivalent, at the level of mind-producing brain,
has to be a large-scale outline of circuit and system designs, involving descriptions
at both microstructural and macrostructural levels.—Damasio [2, p. 260]
1 Introduction
When the Earl of Northampton gave rise (so the Oxford English Dictionary tells
us) to the first recorded use of the word ‘incomputable’ back in 1606, the word
simply referred to large finite sets being too big to keep track of. With today’s
radically extended notion of computability, ‘incomputable’ is something more far
reaching. For the Earl of Northampton it meant that his algorithm for counting
took more time than he could draw on. It may have meant having an army too
big to submit to a roll call. For us today, it would mean the counting algorithm
eludes us—more like a failure to carry out a roll call due to many of the individual
foot-soldiers refusing to answer to their names. Worse than incomputability would
be randomness. The undisciplined or rebellious soldiers may mill chaotically,
impossible to even question in an orderly fashion.
In this brief non-technical introduction, we consider some very basic questions,
such as:
• Where does incomputability come from? Does it arise from a mere mathematical
trick, or from an informational process of common encounter?
• What is its connection with randomness and higher type computation?
• What do such notions have to do with real information, and how do they relate
to the classical Turing model of computation, its derivatives such as cellular
automata, and its embodiments?
• How does the mathematics of higher type computation differ from that of the
classical model? How do these differences play out in the real world, and what
are the practical strategies for dealing with them?
• And—most importantly—how can the mathematics and abstract models help us
deal with the computational complexities arising in science and the humanities?
Or at least validate and help us understand what we do already?
The aim is to map out some directions in our current thinking about such
questions, and to review some of the ways in which the mathematics of computation
can rescue us from current confusions, across a whole spectrum of disciplines. We
are on the threshold of a new understanding of the causal character of reality, no less
than a new causal paradigm replacing that passed down to us from Isaac Newton’s
time via Laplace’s predictive demon. And it is a revolution in thinking about the
world which is as much in need of appropriate mathematical underpinnings as was
that of 350 years ago. 2004 saw the publication by Cambridge University Press
of a collection of 30 contributions from leading thinkers on the subject of Science
and Ultimate Reality: Quantum Theory, Cosmology and Complexity (edited by John
Barrow, Paul Davies and Charles Harper, Jr.). As George Ellis [23] describes (on p.
607) in his contribution on True complexity and its associated ontology:
True complexity involves vast quantities of stored information and hierarchically organized
structures that process information in a purposeful manner, particularly through
implementation of goal-seeking feedback loops. Through this structure they appear purposeful
in their behavior (“teleonomic”). This is what we must look at when we start to extend
physical thought to the boundaries, and particularly when we try to draw philosophical
conclusions—for example, as regards the nature of existence—from our understanding of
the way physics underlies reality. Given this complex structuring, one can ask, “What is
real?”, that is, “What actually exists?”, and “What kinds of causality can occur in these
structures?”.
trick used in the definition of the universal Turing machine, like Kurt Gödel’s coding
of the elements of first-order Peano arithmetic, enables one to formalise properties
of the universal machine taking us to a semantical level not within the purview of the
system; and we examine the logical character of this semantical content, revealing
it to be built on ingredients more widely occurring, in less abstractly mathematical
contexts; in doing this, we need to review the basics of type theory, as a framework
within which to bring together mathematics and embodied logical parallels; finally,
returning to the quotation from Nassim Taleb’s The Black Swan, which introduced
this article, we clarify the relationship between the deceptively familiar notion of
randomness, and the less problematic notion of incomputability which features in
this article.
Section 4 is concerned with the theoretical bridge between the mathematics
we have discussed, and embodied computation and emergent phenomena; we
first examine such mathematically formulated examples as the Mandelbrot set
and Turing’s computational analogues of morphogenic phenomena, both of which
have computer bases and simulations providing persuasive links to more generally
encountered emergence; and the opportunity is taken to connect the parallel
computability theoretic and definitional ingredients of these examples to historically
important intuitions about the wider context; at this point, we take the opportunity
to introduce the fundamental notion of definability, and the related language-
independent one of invariance. Finally, we put definability in the context of existing
proposals from mathematics for higher type computation, explaining how these
theoretical models come with rather different characteristics to those of the classical
lower type one, which fits well with what we experience in the real world.
During Sect. 5, we take our conceptual and technical toolbox into a range of
informative and varied contexts; the aim here is to both exercise our theoretical
framework, and to describe both problems and potential for clarification; and to
prepare the ground for a return to the mathematics in the final Sect. 6, with the
aim of exercising a more informative control over our examples. The main example
chosen in Sect. 5—from artificial intelligence and brain science—is selected for its
challenges, familiarity and rich literature; in particular, higher type computation is
widely identified, via informal discussion, with characteristics of human thought.
The overview of Turing definability in Sect. 6 is no more than a first introduction,
with references to further reading for the more engaged reader.
particular task, the programming was all in the hardware. And that meant you
had to laboriously manufacture a new machine for each kind of computational
chore. Before the modern era, calculating machines exhibiting varying degrees of
programmability were developed (though not always built), including Babbage’s
Analytical Engine, famously programmed in the 1840s by Ada Lovelace in her notes
to a memoir by the Italian mathematician Luigi Menabrea. But the program was still
applied via what were essentially hardware interventions.
Alan Turing’s key innovation was the Universal Turing Machine, embodied in the
first stored program computers during the late 1940s. The universal machine was
not only programmable to compute anything computable by a Turing machine—
the property of being what we now call Turing complete—but it could accept any
program as data, then decode, store and implement it, and even modify the program
via other stored programs. Not even the program need be embodied as punched
cards or similar: the entire computing activity could be governed by abstract logic
structure. Essentially, the machine had access to a ‘dictionary’ which enabled it
to decode programs from data. This feature was what was programmed into the
universal machine. A machine was not universal by accident. It could be realised,
as in early US computers, via the von Neumann computer architecture. It was what
we would nowadays identify as the computer system, achieved by a combination
of hardware and software features, but modifiable—and installed, as far as the user
was concerned, via programming.
And logically, the hardware was trivial, and a given Turing machine was its
program. The program could be coded as a number—called the index of the
machine—and fed as data into the Universal Machine. Of course, something very
strange was happening. It has probably occurred to you that your laptop is physically
quite a complicated object: hard to imagine in what sense you could take it and feed
it into your desktop computer! A full mathematical description of it would need to
reflect its physicality, and would be far from a simple piece of data. But logically,
as a computer, it is a universal Turing machine, and has an index, a mere number.
We say that the logical view has reduced the type of the information embodied
in the computer, from a type-2 set of reals, say, to a type-0 index. The physical
complexities have been packaged in a logical structure, digitally coded. But in the
next section we will see that suppression of one level of complexity comes at the cost
of the revelation of new higher-order information, based on the enabled ease of
observation of collated computational activities.
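As a toy illustration of this type-reduction (a sketch of ours, not Cooper’s; the
8-bit packing scheme and all names are invented for the purpose), a finite program,
viewed as a list of opcodes, collapses into a single natural number and can be
recovered from it without loss:

```python
# Hypothetical illustration: pack a finite program (a list of small
# integer opcodes) into one natural number and unpack it again. The
# scheme is arbitrary; the point is only that a whole machine
# description becomes type-0 data.

def encode(program: list[int]) -> int:
    """Pack opcodes (each in 0..254) into one number, 8 bits apiece."""
    index = 0
    for op in reversed(program):
        assert 0 <= op < 255
        index = (index << 8) | (op + 1)  # store op+1 so zero opcodes survive
    return index

def decode(index: int) -> list[int]:
    """Recover the opcode list from its index."""
    program = []
    while index:
        program.append((index & 0xFF) - 1)
        index >>= 8
    return program

assert decode(encode([3, 141, 59])) == [3, 141, 59]
```

A universal machine can then accept such an index as ordinary input, alongside the
input to the encoded program itself.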
The indexing trick arises from being able to describe the functionality of the
machine—what the machine does—via a number. There is a hugely important
assumption underlying such a reduction. Looking at another ‘computing’ object—
say a brain, or a stock exchange—how can we be sure we can reduce its functionality
to a simple code of its logical structure? The assumption that one can do this in
general is a massive leap of faith. At the level of computers we can currently build,
it follows Turing’s logic, governed by what has come to be called the Church-
Turing Thesis. The extension of the Church-Turing Thesis to organisms in physics or
biology—or economics—is problematic. It is about modeling. It is about comparing
structures, mathematical or real, which may elude computable characterization or
comparison.
This assumption has given rise to a powerful computing paradigm, aspects of
which can be found in many forms. One can think of it as a formal counterpart of the
post-Newtonian determinacy familiar to us via Pierre-Simon Laplace’s predictive
demon. If the functionalist view is as described by Hilary Putnam [5], namely ‘the
thesis that the computer is the right model for the mind’, then one can trace it
back to Alan Turing and the beginnings of the computer age. For example, he is
quoted as saying to his assistant Donald Bayley while working at Hanslope Park in
1944 that he was ‘building a brain’. But the aspect that Putnam develops, with the
full sophistication of the philosopher of mind, is that of mental computation being
realizable on different hardware (see his Minds and Machines [5] from 1960). One
finds echoes of this outlook in computer science in the fruitful notion of a virtual
machine. The practical emergence of the idea of virtualizing hardware is traceable
back to IBM researchers of the mid-1960s, and is common today across a whole
range of software, including Java and Unix.
However, it is the success of the type-reduction which rebounds on us. At the
beginning of the last century it was a disregard for the typing of information which
led to the set-theoretical paradoxes. A few decades later, it was Turing’s deliberate
type-reduction, which—as well as anticipating the stored-program computer—
‘computed’ a type-1 object beyond the reach of his computational model.
So what can possibly be incomputable about a computer—an instantiation of a
universal Turing machine U, say? Let us define the set H to be the set of all pairs
(n, s) such that the machine U with input n halts at step s of the computation. H is
clearly computable—to decide if (n, s) is in H, all one has to do is to give U input n,
and follow the computation for at most s steps to tell if the machine halts at step s.
But say one asks: Does there exist a step s such that (n, s) is in H? The new set can
be described more formally as the set H* of numbers n such that (∃s)[(n, s) ∈ H], that
is, the computable relation (n, s) ∈ H with an existential quantifier added. And for
arbitrary n we cannot decide (∃s)[(n, s) ∈ H] computably, since it is quite possible
that the computation never halts, and the only way of being sure of that is to
complete an infinite search verifying that (n, s) ∉ H for every s. This incomputable
set of numbers H* is algorithmically connected to H:
If we imagine H arrayed in the usual way in 2-dimensional Euclidean space, H* is
just the shadow of H under sunlight from directly above the array. We say H* is the
projection of H onto the n-axis. It can be shown formally that this application of
projection, using an existential quantifier, does give an incomputable set.
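To make the distinction concrete, here is a minimal sketch, with a toy ‘machine’ model of our own standing in for a real universal interpreter: membership in H needs only a bounded simulation, while membership in H* admits only a one-sided confirmation.

```python
def machine(n: int):
    """Toy stand-in (ours) for a universal interpreter: a generator
    that yields once per step and halts by returning. Machine 0
    loops forever; machine n > 0 halts at step n + 1."""
    if n == 0:
        while True:
            yield
    for _ in range(n):
        yield

def in_H(n: int, s: int) -> bool:
    """Decide (n, s) in H: does machine n halt at step s? Computable,
    since a bounded simulation of at most s steps always answers."""
    it = machine(n)
    for step in range(1, s + 1):
        try:
            next(it)
        except StopIteration:
            return step == s   # halted at exactly this step
    return False               # still running after s steps

def in_H_star(n: int) -> bool:
    """Semi-decide n in H*: search for some s with (n, s) in H.
    If machine n never halts, the loop never ends; this is the
    'infinite search' that puts H* beyond decidability."""
    s = 1
    while True:
        if in_H(n, s):
            return True
        s += 1

print(in_H(3, 2), in_H(3, 4))  # False True: machine 3 halts at step 4
print(in_H_star(3))            # True; in_H_star(0) would loop forever
```

The existential quantifier, the projection of the text, is what turns a two-sided decision procedure into a one-sided search.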
Notice that the route by which we get incomputability is by observing the behaviour
of U over its global domain of computation. And remember too that global
relations over computationally modeled environments are commonly important to
This classical typing was that used by Russell (see his 1903 The Principles of
Mathematics) as a more careful way of observing and structuring set theoretical
information, to avoid the paradoxes which had emerged out of the earlier work
of Cantor and Frege. The geometry of typed information—numbers, reals, 3-
dimensional objects as curves and surfaces, relations between shapes, and so
on—gives us a sense of the way in which the mathematics of such a hierarchy
can play an embodied role in the real world. Clearly ‘big information’ occurs in
many contexts, not least in the economic or neuro-scientific domains. We also get
a sense of how this embodiment has the potential to take reality out of the classical
computational domain.
The notion of randomness is sometimes used to describe contexts lacking
predictability, as an apparently more intuitive notion than that of incomputability.
For instance, Nassim Taleb does not make a distinction between ‘incomputable’ and
‘random’ in his best-selling book [1] The Black Swan. In fact, he does not mention
incomputability.
Unfortunately, randomness turns out to be a much more complicated, and less
robust, notion than common sense dictates (see Downey and Hirschfeldt [7]). There
is no such thing as randomness as an absolute property, which raises questions
But here he is again (p. 89 of Is the Causal Structure of the Physical Itself
Something Physical? in Realism with a Human Face, [9]), modifying his earlier
view in the broader context:
. . . if the physical universe itself is an automaton . . . , then it is unintelligible how any
particular structure can be singled out as “the” causal structure of the universe. Of course,
the universe fulfills structural descriptions – in some way or other it fulfills every structural
description that does not call for too high a cardinality on the part of the system being
modeled; . . . the problem is how any one structure can be singled out as “the” structure of
the system.
Notice that although the Mandelbrot set is a type-2 object, our type reduction
via a computable sampling of M produces a good enough approximation for us to
mentally ‘compute’ with a good sense of the type-2 richness of emergent geometry
of M. Effectively, the human brain can carry out a good simulation of type-2
computation with special input M. But such reductions are ad hoc. Our brains extract
no such useful information to compute on from H*. There is informative geometry
delivered by the definability of M, but not of H*. The practical and nonuniform
subtleties of reducing ‘big data’ to computable information are illustrated by Turing’s
use of Bayesian sampling in his decoding work at Bletchley Park in the early 1940s
(see Mardia and Cooper [10]).
Corresponding to the parallel definitions of M and H* we have parallel computa-
tional characteristics. Neither is computable (for M, at least as far as we currently
know), but both H* and the complement of M can be computably enumerated. The
everyday correspondence is with simulation, once achieved in analog style via
wind-tunnels, etc., now very often
digitally. The difference between being able to compute a phenomenon, and being
able to simulate every detail of it, is that a simulation is never complete, despite the
computable unfolding of information.
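The computable enumeration of the complement of M can be sketched with the standard escape-time test; the sampling grid and iteration bounds below are our illustrative choices, not anything from the text.

```python
def escapes(c: complex, max_iter: int) -> bool:
    """One-sided test. True confirms c lies outside the Mandelbrot
    set M: the orbit of 0 under z -> z*z + c has left the disc
    |z| <= 2, so it diverges. False is merely 'no verdict yet':
    c may lie in M, or may escape only after more iterations."""
    z = 0j
    for _ in range(max_iter):
        z = z * z + c
        if abs(z) > 2:
            return True
    return False

def enumerate_complement(step: float = 0.25, rounds: int = 3):
    """Dovetail sample points against growing iteration bounds,
    emitting each point once its escape is confirmed. Like the
    enumeration of H*, this confirms membership in the complement
    but can never refute it in finitely many steps."""
    seen = set()
    for r in range(1, rounds + 1):
        bound = 50 * r
        k = int(4 / step)
        for i in range(k + 1):
            for j in range(k + 1):
                c = complex(-2 + i * step, -2 + j * step)
                if c not in seen and escapes(c, bound):
                    seen.add(c)
                    yield c

print(sum(1 for _ in enumerate_complement()), "confirmed escapers")
```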
The Mandelbrot set provides an illuminating link between the pure abstraction of
the halting problem, and strikingly embodied examples of emergence in nature. It
helps us gather together physically interesting examples of emergence and abstract
entities with clear mathematical attributes, to the illumination of both.
Turing’s seminal work on morphogenesis in early 1950s Manchester is a
particularly striking illustration of the power of mathematical descriptions to capture
what seem to be surprising or mysterious developments of form in the real
world. The validity of his proposal in the 1952 paper on The Chemical Basis of
Morphogenesis [11] of an underlying reaction–diffusion basis for a wide range of
emergent phenomena has been verified and suitably modified in many contexts. And
this, his only paper on the topic before his death, has become his most cited
paper, ahead of the famous computable numbers and Turing test articles.
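To convey the flavour of the mechanism, here is a generic Gray-Scott-style reaction-diffusion sketch; it is not Turing's own system from the 1952 paper, and all grid sizes and rate constants are illustrative.

```python
import numpy as np

# Two species on a periodic 1-D grid. The key Turing ingredient is
# that they diffuse at different rates (Du > Dv), which can
# destabilise a homogeneous state into stable spatial pattern.
N, Du, Dv, F, k, dt = 200, 0.16, 0.08, 0.035, 0.060, 1.0

u = np.ones(N)                     # 'substrate' concentration
v = np.zeros(N)                    # 'activator' concentration
v[N // 2 - 5 : N // 2 + 5] = 0.5   # small local perturbation

def laplacian(a):
    """Discrete diffusion term on a periodic 1-D grid."""
    return np.roll(a, 1) + np.roll(a, -1) - 2 * a

for _ in range(10_000):
    uvv = u * v * v
    u += dt * (Du * laplacian(u) - uvv + F * (1 - u))
    v += dt * (Dv * laplacian(v) + uvv - (F + k) * v)

# v now shows spatially periodic structure: 'form' emerging from a
# homogeneous state plus purely local chemical rules.
print(np.round(v[::10], 3))
```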
Not only did Turing creatively extract the model, but he also provided the dif-
ferential equations governing its development in various contexts, and in some
cases—such as that of the emergence of dappling on a cow’s hide—used the
of a theory, in which important aspects are not uniquely defined—a bit like the state
of an atomic particle before a measurement is taken.
The second is that there is a language-independent version of definability, useful
in an ontological context, since the real universe does not use language to ‘define’
its structure and laws. If a relation on a structure is left entirely undisturbed by any
automorphism of the structure, we say it is invariant. The invariance is an expression
of the ontological uniqueness of the relation on the structure. If the logic adopted to
describe the structure is appropriate, then all definable relations on the structure are
invariant.
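A tiny finite example makes ‘invariant’ concrete; the graph below is our toy structure, not anything from the text.

```python
from itertools import permutations

# A toy structure: a directed graph on {0, 1, 2, 3} consisting of
# the two 2-cycles 0 <-> 1 and 2 <-> 3.
nodes = [0, 1, 2, 3]
edges = {(0, 1), (1, 0), (2, 3), (3, 2)}

def automorphisms():
    """All permutations of the nodes preserving the edge relation."""
    for p in permutations(nodes):
        if {(p[a], p[b]) for (a, b) in edges} == edges:
            yield p

def invariant(rel: set) -> bool:
    """A relation is invariant if every automorphism fixes it setwise."""
    return all({(p[a], p[b]) for (a, b) in rel} == rel
               for p in automorphisms())

# 'Being in some 2-cycle' is invariant under every automorphism...
print(invariant({(0, 1), (1, 0), (2, 3), (3, 2)}))  # True
# ...but singling out one particular 2-cycle is not: the automorphism
# exchanging {0, 1} with {2, 3} moves it.
print(invariant({(0, 1), (1, 0)}))                  # False
```

The second relation fails to be invariant precisely because nothing in the structure itself distinguishes the pair {0, 1} from the pair {2, 3}; this is the language-independent sense in which a structure can, or cannot, pin a relation down.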
Drawing things together: The higher order properties of a structure—those which
are so important for us to understand in the real world—are the emergent relations
of the structure. These large-scale relations are commonly based on reasonably
well-understood local structure. Mathematically, we have formalised this in terms
of definability—or, equivalently, as invariance under automorphisms over basic computational
structure.
And, to add a final element of the jigsaw: Such definability can be viewed as
computation over higher type data. If a structure uniquely pins down a relation,
in the sense that there is no alternative presentation of the structure that changes
the relation in our observational purview, then we have to conclude that, in some
sense, the complexity of the structure has computed that relation. There is, of
course, a rich theory of higher type computation (starting with Stephen Kleene,
Robin Gandy, Georg Kreisel, Gerald Sacks etc.), which makes explicit how we
see its computational infrastructure. John Longley’s 2005 survey paper [13] is an
excellent guide to the field, and forthcoming is his book with Dag Normann [14] on
Computability at Higher Types.
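One concrete intuition from that theory, rendered as a sketch of our own: a computable type-2 functional can consult only finitely much of its infinite argument before answering, which is precisely a continuity constraint.

```python
from typing import Callable

# A type-1 object: an infinite binary sequence, given as a function.
Seq = Callable[[int], int]

class Oracle:
    """Wrap a type-1 argument and record which values are queried."""
    def __init__(self, f: Seq):
        self.f, self.queried = f, set()
    def __call__(self, n: int) -> int:
        self.queried.add(n)
        return self.f(n)

def F(x: Seq) -> int:
    """A computable type-2 functional: the sum of the first ten
    digits. It consults only finitely much of x, the continuity
    property every computable functional must have."""
    return sum(x(i) for i in range(10))

alternating: Seq = lambda n: n % 2
oracle = Oracle(alternating)
print(F(oracle), sorted(oracle.queried))  # 5 [0, 1, ..., 9]

# By contrast, 'is x the all-zero sequence?' would need to inspect
# all of x, so it is not computable at type 2.
```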
We cannot expect computation over higher type information to have the reliability
or precision of the classical model. And it is at the level of the human brain and
its hosting of complex mentality that we can expect this to be especially apparent.
Here is Alan Turing, the advocate of computer intelligence, in a talk to the London
Mathematical Society, on February 20, 1947 (quoted in Hodges [15], p. 361):
. . . if a machine is expected to be infallible, it cannot also be intelligent. There are several
theorems which say almost exactly that.
And here he is again, in the final paragraph of his popular article Solvable and
Unsolvable Problems in Penguin Science News 31, 1954, p. 23 (p. 322 of the
Impact book [3]), suggesting that human ‘common sense’ is something additional
to algorithmic thinking:
The results which have been described in this article are mainly of a negative character,
setting certain bounds to what we can hope to achieve purely by reasoning. These, and
some other results of mathematical logic, may be regarded as going some way towards a
demonstration, within mathematics itself, of the inadequacy of ‘reason’ unsupported by
common sense.
One might also look for emergence, in the form of surprising emergent
outcomes. As mentioned previously, mathematical creativity was something Henri
Poincaré was keenly interested in, and Jacques Hadamard in his 1945 book [16] on
The Psychology of Invention in the Mathematical Field (Princeton University Press)
refers to Poincaré’s celebrated 1908 lecture at the Société de Psychologie in Paris,
where he recounts getting stuck solving a problem related to elliptic functions:
At first Poincaré attacked [a problem] vainly for a fortnight, attempting to prove there could
not be any such function . . . [quoting Poincaré]:
‘Having reached Coutances, we entered an omnibus to go some place or other. At the
moment when I put my foot on the step, the idea came to me, without anything in my
former thoughts seeming to have paved the way for it . . . I did not verify the idea . . . I
went on with a conversation already commenced, but I felt a perfect certainty.
On my return to Caen, for conscience sake, I verified the result at my leisure.’
What is striking about this account is not so much the surprising emergence, as
the sense we have of there being some definitional ownership taken of the proof,
enabling Poincaré to carry home the proof as if packaged in the proof shop. Of
course, Poincaré would have known nothing of emergence in 1908 (even though the
concept is traceable back to John Stuart Mill’s 1843 System of Logic), and thought
chance might be playing a role, an idea echoed by Hadamard in his book.
At the neuroscientific level, we have a huge body of commentary on the ‘left
brain—right brain’ dichotomy, and the evidence of them hosting different kinds of
thinking—according to our perspective, different types of computational activity.
In his 2009 book [17] The Master and his Emissary: The Divided Brain and the
Making of the Western World, Iain McGilchrist describes how:
The world of the left hemisphere, dependent on denotative language and abstraction,
yields clarity and power to manipulate things that are known, fixed, static, isolated,
decontextualised, explicit, disembodied, general in nature, but ultimately lifeless. The right
hemisphere by contrast, yields a world of individual, changing, evolving, interconnected,
implicit, incarnate, living beings within the context of the lived world, but in the nature of
things never fully graspable, always imperfectly known—and to this world it exists in a
relationship of care. The knowledge that is mediated by the left hemisphere is knowledge
within a closed system. It has the advantage of perfection, but such perfection is bought
ultimately at the price of emptiness, of self-reference. It can mediate knowledge only in
terms of a mechanical rearrangement of other things already known. It can never really
‘break out’ to know anything new, because its knowledge is of its own representations only.
Where the thing itself is present to the right hemisphere, it is only ‘re-presented’ by the left
hemisphere, now become an idea of a thing. Where the right hemisphere is conscious of the
Other, whatever it may be, the left hemisphere’s consciousness is of itself.
This description of the functionality of the brain fits surprisingly well with our
developing awareness of the differences between type-1 and type-2 computation,
shadowing the differences between the algorithmic, as embodied in today’s
computer, and the more open, creative thinking that the human brain appears capable of.
Kim asks various questions, which make it clear we are in need of an updated
model, and one which delivers a coherent overview of the interactions of these
domains which are so basic to our everyday lives. The sort of questions one is
driven to face up to include: How can mentality have a computational role in
a world that is fundamentally physical? A highly reductive physicalist view, in
which the effective role of such phenomena as consciousness becomes illusory,
has distinguished protagonists. For instance, in Turing’s ‘Strange Inversion of
Reasoning’ (Alan Turing: His Work and Impact, [3], pp. 569–573), Daniel Dennett
finds a gradualist Darwinian model, and the parallel thinking of Turing concerning
intelligent machines, a useful antidote to those who ‘think that a Cartesian res
cogitans, a thinking thing, cannot be constructed out of Turing’s building blocks’.
And what about overdetermination—the problem of phenomena having both mental
and physical causes? In a book whose title points to the quest for a convincing
Physicalism, or Something Near Enough (Princeton, 2005), Jaegwon Kim [19]
nicely expresses the problem we have physically connecting mentality in a way
that gives it the level of autonomy we observe:
. . . the problem of mental causation is solvable only if mentality is physically reducible;
however, phenomenal consciousness resists physical reduction, putting its causal efficacy
in peril.
The intuition is that the model may support an emergent form of incomputability,
the interactivity supporting a more clearly embodied example than that derivable via
the Turing machine model. Is that all?
Such functionality does not impress Steven Pinker (for instance) for whom
‘neural networks alone cannot do the job’. Pinker—echoing Floridi’s [20] ‘levels
of abstraction’ in the mental context—identifies in How the Mind Works [21] a
‘kind of mental fecundity called recursion’, with an example:
We humans can take an entire proposition and give it a role in some larger proposition. Then
we can take the larger proposition and embed it in a still-larger one. Not only did the baby
eat the slug, but the father saw the baby eat the slug, and I wonder whether the father saw
the baby eat the slug, the father knows that I wonder whether he saw the baby eat the slug,
and I can guess that the father knows that I wonder whether he saw the baby eat the slug,
and so on.
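Structurally, the ‘fecundity’ Pinker describes is just a recursive datatype: propositions that embed propositions without bound. A toy rendering, ours and purely illustrative:

```python
from dataclasses import dataclass
from typing import Union

@dataclass
class Atom:
    text: str

@dataclass
class Embed:
    attitude: str    # 'saw that', 'knows that', 'wonder whether', ...
    subject: str
    inner: 'Prop'

# A proposition is atomic or an attitude applied to a proposition;
# the embedding can be iterated indefinitely.
Prop = Union[Atom, Embed]

def render(p: Prop) -> str:
    if isinstance(p, Atom):
        return p.text
    return f"{p.subject} {p.attitude} {render(p.inner)}"

p: Prop = Atom("the baby ate the slug")
p = Embed("saw that", "the father", p)
p = Embed("wonder whether", "I", p)
p = Embed("knows that", "the father", p)
print(render(p))
# -> the father knows that I wonder whether the father saw that
#    the baby ate the slug
```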
The physicalism is still conceptually challenging, but contains the sort of ingre-
dients which echo the higher type computation we identify within the messiness
and strange effectiveness of right-hemisphere functionality. At the same time, one
detects a complexity of interaction involving representation, sampling and looping
between types/levels of abstraction, that is characteristic of what George Ellis [23]
calls “true complexity”.
This is a complexity of scenario reinforced by experience of trying to build
intelligent machines. The lesson of the neural net insufficiency is that the basic
logical structure contains the seeds of computation beyond the Turing barrier, but
that the embodiment needs to be capable of hosting mechanisms for type reduction,
employing the sort of representational facility Damasio observes in the brain,
and enabling the kind of recursions that Pinker demands. In artificial intelligence
the purely logical approach has encountered insurmountable difficulties, while
some have been driven to an ad hoc empirical approach, following the recipe for
machine-human interaction and learning anticipated by Turing. Once again, we see
a community showing the effects of paradigmatic confusion. From Rodney Brooks in
Nature in 2001 we have:
. . . neither AI nor Alife has produced artifacts that could be confused with a living organism
for more than an instant.
References
1. N. Taleb, The Black Swan: The Impact of the Highly Improbable (Allen Lane, London, 2007)
2. A. Damasio, Descartes’ Error: Emotion, Reason and the Human Brain (G.P. Putnam’s Sons,
New York, 1994)
3. S.B. Cooper, J. van Leeuwen (eds.), Alan Turing: His Work and Impact (Elsevier, Amsterdam,
2013)
4. S.B. Cooper, Computability Theory (Chapman & Hall/CRC, London, 2004)
5. H. Putnam, Minds and machines, reprinted in H. Putnam, Mind, Language and Reality: Philo-
sophical Papers, vol. 2 (Cambridge University Press, Cambridge, 1975)
6. K. Gödel, Russell’s mathematical logic, in The Philosophy of Bertrand Russell, ed. by P.A.
Schilpp (Northwestern University, Evanston and Chicago, 1944), pp. 123–153
7. R.G. Downey, D.R. Hirschfeldt, Algorithmic Randomness and Complexity (Springer,
New York, 2010)
8. C.S. Calude, K. Svozil, Quantum randomness and value indefiniteness. Adv. Sci. Lett. 1(2),
165–168 (2008)
9. H. Putnam, Realism with a Human Face (Harvard University Press, Cambridge, 1990)
10. K.V. Mardia, S.B. Cooper, Alan Turing and enigmatic statistics. Bull. Brasilian Section Int.
Soc. Bayesian Anal. 5(2), 2–7 (2012)
11. A.M. Turing, The chemical basis of morphogenesis. Philos. Trans. R. Soc. Lond. Ser. B 237,
37–72 (1952)
12. A. Tarski, On Definable Sets of Real Numbers; translation of Tarski (1931) by J.H. Woodger.
In A. Tarski, Logic, Semantics, Metamathematics, 2nd edn, ed. by J. Corcoran (Hackett,
Indianapolis, 1983), pp. 110–142
13. J. Longley, Notions of computability at higher types I, in Logic Colloquium 2000, ed. by R.
Cori, A. Razborov, S. Todorcevic, C. Wood (A K Peters, Natick, 2005)
14. J. Longley, D. Normann, Computability at Higher Types. Theory and Applications of Com-
putability (Springer, Heidelberg/New York, forthcoming)
15. A. Hodges, Alan Turing: The Enigma (Vintage, New York, 1992); reprinted in Centennial
edition, Princeton University Press, 2012
16. J. Hadamard, The Psychology of Invention in the Mathematical Field (Princeton University
Press, Princeton, 1945)
17. I. McGilchrist, The Master and His Emissary: The Divided Brain and the Making of the
Western World (Yale University Press, New Haven, 2009)
18. J. Kim, Mind in a Physical World (MIT Press, Cambridge, 1998)
19. J. Kim, Physicalism, or Something Near Enough (Princeton University Press, Princeton, 2005)
20. L. Floridi, The Philosophy of Information (Oxford University Press, Oxford, 2011)
21. S. Pinker, How the Mind Works (W.W. Norton, New York, 1997)
22. A. Damasio, The Feeling of What Happens: Body and Emotion in the Making of Consciousness
(Harcourt Brace, New York, 1999)
23. G.F.R. Ellis, True complexity and its associated ontology, in Science and Ultimate Reality:
Quantum Theory, Cosmology and Complexity, ed. by J.D. Barrow, P.C.W. Davies, C.L. Harper,
Jr. (Cambridge University Press, Cambridge, 2004)
24. A.M. Turing, Systems of logic based on ordinals. Proc. London Math. Soc. 45(2), 161–228
(1939); reprinted in Cooper and van Leeuwen, pp. 151–197
25. H. Rogers, Jr., Some problems of definability in recursive function theory, in Sets, Models
and Recursion Theory (Proceedings of Summer School Mathematical Logic and Tenth Logic
Colloquium, Leicester, 1965) (North-Holland, Amsterdam, 1967), pp. 183–201
26. S.B. Cooper, P. Odifreddi, Incomputability in nature, in Computability and Models: Perspec-
tives East and West, ed. by S.B. Cooper, S.S. Goncharov (Plenum, New York, 2003), pp.
137–160
27. H. Putnam, Why functionalism didn’t work, in Representation and Reality (MIT Press,
Cambridge, 1988), pp. 73–89