1976 Savage
ON REREADING R. A. FISHER
By LEONARD J. SAVAGE
Yale University
Fisher's contributions to statistics are surveyed. His background, skills, temperament, and style of thought and writing are sketched. His mathematical and methodological contributions are outlined. More attention is given to the technical concepts he introduced or emphasized, such as consistency, sufficiency, efficiency, information, and maximum likelihood. Still more attention is given to his conception and concepts of probability and inference, including likelihood, the fiducial argument, and hypothesis testing. Fisher is at once very near to and very far from modern statistical thought generally.
1. Introduction.
1.1. Why this essay? Of course an R. A. Fisher Memorial Lecture need not be about R. A. Fisher himself, but the invitation to lecture in his honor set me so to thinking of Fisher's influence on my own statistical education that I could not tear myself away from the project of a somewhat personal review of his work.
My statistical mentors, Milton Friedman and W. Allen Wallis, held that Fisher's Statistical Methods for Research Workers (RW, 1925) was the serious
man's introduction to statistics. They shared that idea with their own admired teacher, Harold Hotelling. They and some others, though of course not all, gave the same advice: "To become a statistician, practice statistics and mull Fisher over with patience, respect, and skepticism."
topic. The only deletion of any length or substance (in § 4.2) is footnoted. Most alterations of the text are for clarity or to bring in relevant ideas expressed elsewhere by Savage. A few amend or go beyond his intentions, but only on relatively objective matters where the evidence seems clear. The editor has tried to keep his personal views on controversial issues to the footnotes. Complete, unambiguous documentation would, however, have required excessive fussiness and footnoting. Those concerned with the nature and purpose of the editorial alterations will usually be able to deduce them, more easily than this description may suggest. Those who want all evidence on Savage's thought would need to consult the original materials in any case.
Material prepared by Savage in addition to the manuscript includes about 200 index cards, some with more than one entry; about 50 handwritten pages, which the editor has had typed, of sometimes fascinating "random notes" on many works by Fisher and a few by others; Savage's personal copies, which he sometimes marked, of most of these works and quite a few more; and about 25 other pages of notes, mostly references and lists of topics. A tape of the original lecture was available and has been transcribed. All these materials were useful in the editing, especially for filling in references, but they by no means resolved all problems. They and Savage's other scientific papers, including correspondence, are available, excellently indexed, in archives at Sterling Memorial Library, Yale University.
The editor is grateful for help, especially with references not by Fisher, to the discussants, most of whom sent separate comments for editorial use; and in addition to the following, with apologies to anyone inadvertently omitted: F. J. Anscombe, G. A. Barnard, M. S. Bartlett, J. H. Bennett, R. J. Buehler, H. Chernoff, W. G. Cochran, A. P. Dempster, A. W. F. Edwards, D. J. Finney, J. Gurland, J. A. Hartigan, B. M. Hill, D. G. Kendall, M. G. Kendall, W. H. Kruskal, E. E. Leamer, L. Le Cam, E. L. Lehmann, F. Mosteller, E. S. Pearson, G.-C. Rota, I. R. Savage, H. Scheffé, E. L. Scott, and D. L. Wallace. That so many should have responded so generously is a tribute to Savage in itself. Of course this implicates them in no way.
Citations without dates, in the footnotes, are to these responses. Many interesting reactions
could not be used, however. Eisenhart and Pearson, in particular, each wrote many pages of
commentary of great interest, historical as well as substantive. All the responses are available
in the archives.
Savage would obviously have revised his paper, especially the latter portions, considerably for style and some for substance. After circulating it for reaction, he would presumably have revised it further. In particular, offense would have been eliminated where he did not intend it but some now find it. An editor cannot know how he would have made any of these kinds of revisions, however, and any attempt risks distorting his meaning. Richard Savage and I therefore decided in the end to let the text stand even where it clearly presents a problem unless a resolution was also clear. We trust readers will make appropriate allowance for the unfinished state of the manuscript. Jimmie Savage once wrote (1954 vii):

One who so airs his opinions has serious misgivings that (as may be judged from other prefaces) he often tries to communicate along with his book. First, he longs to know, for reasons that are not altogether noble, whether he is really making a valuable contribution. ...

Again, what he has written is far from perfect, even to his biased eye. He has stopped revising and called the book finished, because one must sooner or later. Finally he fears that he himself, and still more such public as he has, will forget that the book is tentative, that an author's most recent word need not be his last word.
3 Where many citations are possible, Savage may have intended to be selective. In citing Fisher, rather than impose my own selection, I have tended to be overinclusive. In citing others than Fisher, on statistical topics generally, I have aimed to give a few helpful references, but not to be definitive. To minimize interference with smooth reading, I have used the most compact feasible style of citation. In particular, the same name applies to a string of dates until another name appears.
2.2. Background and skills. Of course Fisher was not specifically trained to be a statistician. Only after Fisher was a great statistician, and largely because of the vision of statistics to which his activities gave rise, was statistical training inaugurated in a few universities.⁴

Fisher was a Cambridge-trained mathematician (see references in § 1.2), and despite what sometimes seems scorn for mathematicians, he was a very good one (Neyman 1951; 1961 147; 1967) with an extraordinary command of special functions (1915, 1921a, 1922c, 1925b, 1925c, 1928a, 1931b), combinatorics (1942b, 1942c, 1945a, 1950a, DOE, ST), and truly geometric n-dimensional geometry (1913, 1915, 1929b, 1940a; see also 1922a, 1922b, 1924a, 1928b, 1929a, 1930a, 1939b). Indeed, my recent reading reveals Fisher as much more of a mathematician than I had previously recognized. I had been misled by his own attitude toward mathematicians, especially by his lack of comprehension of, and contempt for, modern abstract tendencies in mathematics (1958a; see also § 2.1; yet see 1942b esp. 340a; 1945a; DOE § 45.1). Seeing Fisher ignorant of those parts of mathematics in which I was best trained, I long suspected that his mastery of other parts had been exaggerated, but it now seems to me that statistics has never been served by a mathematician stronger in certain directions than Fisher was. No complete statistician is merely a mathematician, and Fisher, like other statisticians of his time, was a skilled and energetic desk calculator (RW examples; ST), tabulator (ST; see Index at "Tables" in RW and CMS or CP), and grapher (RW Ch. 2; 1922a § 10; 1924c; 1928d). He early became a widely experienced and resourceful applied statistician, mainly in the fields of agronomy and laboratory biology (see his bibliography; the examples in RW and DOE; practical suggestions in 1926a; in RW Ch. 2; and in DOE § 10 par. 2, § 12, § 25, § 29, end of § 60).
In addition to Fisher's illustrious career as a statistician he had one almost as illustrious as a population geneticist, so that quite apart from his work in statistics he was a famous, creative, and controversial geneticist (see references in § 1.2). Even today, I occasionally meet geneticists who ask me whether it is true that the great geneticist R. A. Fisher was also an important statistician. Fisher held two chairs in genetics, first at University College, London, and then at Cambridge, but was never a professor of statistics.
2.3. Temperament. Fisher burned even more than the rest of us, it seems to me, to be original, right, important, famous, and respected. And in enormous
4 Advanced training in theoretical statistics and its application has been available at University College, London since the 1890's (Pearson 1974), but Savage's statement is surely correct in spirit, and technically as well if "training" means at the doctoral level and "a few" means more than one or two.
measure, he achieved all of that, though never enough to bring him peace. Not unnaturally, he was often involved in quarrels, and [though he sometimes disagreed politely (1929f; 1929g; 1930c 204a; 1932 260-1; 1933a; 1936c; 1941b), he] sometimes published insults that only a saint could entirely forgive (1922a 329; 1922b 86a; 1935f; 1937b 302a-318; 1939a 173a; 1941c 143; 1960 2, 4; SI 3, 76-7, 88, 91, 96, 100-2, 120, 141, 162-3). It is not evident that Fisher always struck the first blow in these quarrels (K. Pearson, presumably, in Soper et al. 1917 353; Bowley 1935 55-7; see also E. S. Pearson 1968; Eisenhart 1974), though their actual roots would be difficult if not impossible to trace (1922b 91; 1923b), nor did he always emerge the undisputed champion in bad manners⁵ (K. Pearson 1936; Neyman 1951). On one occasion, Fisher (1954) struck out blindly against a young lady who had been anything but offensive or incompetent. His conclusion was that had the lady known what she was about she would have solved a certain problem in a certain fashion; he was right about that but failed to notice that she had solved it in just that fashion. Of course, Fisher was by no means without friends and admirers too. [Indeed, we are all his admirers. (Yet he has few articulate partisans in controversies on the foundations of statistical inference, the closest, perhaps, being Barnard (e.g. 1963) and Rao (e.g. 1961).)]
The main point for us in Fisher's touchiness and involvement in quarrels is their impediment to communication (van Dantzig 1957; Yates 1962; Yates and Mather 1963; see also § 2.4). Those great statisticians who had the most to gain from understanding him, whether to some extent through their own tactlessness or otherwise, received the greatest psychological provocation to close their minds to him. Also, it is hard for a man so driven and so involved in polemic as Fisher was to recognize in himself and announce a frank change of opinion except when he is the first to see the need for it (1922a 326). For example, when Fisher says, "It has been proposed that ..." (SI 172), and then proceeds to smash that proposal to smithereens, would it occur to you that the proposer was Fisher himself (1935c 395)? Yet specific, technical mistakes he can admit openly (1940b 423) and even gracefully (1930c 205), and he often mentions weaknesses of his earlier attempts which he later improved on (1922a 308a; 1922b 86a; 1925a 699a; 1930b 527a; 1930c 204a; SI 54, 56, 142).
I am surely not alone in having suspected that some of Fisher's major views were adopted simply to avoid agreeing with his opponents (Neyman 1961 148-9). One of the most valuable lessons of my rereading is the conclusion that while conflict may sometimes have somewhat distorted Fisher's presentation of his views⁶ (Yates 1962 1152), the views themselves display a steady and coherent development (Barnard 1963 164; Fisher 1920, 1922a, 1924a, 1925a, 1935b;

5 There is good reason to think that Savage would have modified such bad manners of his own, but there is no way to eliminate them editorially without danger of distorting his intentions.

6 Fisher says (SI 31) "[Chrystal's] case as well as Venn's illustrates the truth that the best causes tend to attract to their support the worst arguments, which seems to be equally true in the intellectual and in the moral sense."
1926a, 1929d, DOE; Author's Notes in CMS on these and 1922b, 1922c). Ideas that I had consistently tuned out until the present reading are to be found in some of his earliest papers. (See individual topics below for references; see also 1928b, RW § 57.)

As in the works of other mathematicians, research for the fun of it is abundant and beautiful in Fisher's writings, though he usually apologizes for it⁷ (1929c; 1942a 305a; 1953a; DOE § 35).
2.4. Predecessors and contemporaries. Fisher had a broad general culture and was well read in the statistical literature of his past and of his youth (GT; DOE xv; SI v; 1950b; 1958c; see also the rest of this subsection and the next). To begin with the oldest, the famous essay by Thomas Bayes (1763) seems to have been more stimulating to Fisher than to many who, like myself, are called Bayesians. Recognition of this was slow to come to me because of Fisher's rejection of Bayes' rule and other "conventional" prior distributions (1921a 17; 1922a 311, 324-6; 1930b 528-31; 1932 257-9; 1934a 285-7; RW § 5; DOE § 3; SI Ch. 2), and because he certainly was not a Bayesian in any ordinary sense. His admiration for Bayes is to be inferred more from Fisher's attitude to inductive inference, which he sometimes explicitly links to Bayes (1930b 531; 1934a 285-6; 1936a 245-7; 1960 2-3, DOE § 3) and which will be discussed later, especially in Section 4.4, than by certain explicit words of praise (RW § 5; DOE § 3; SI 8-17; 1934a 285-6; 1936a 245-7. He urged the 1958 reprinting of Bayes (1763); see p. 295.).
Intellectually, Fisher was a grandson of Francis Galton, whom he greatly
admired (1948 218; SI 1-2; yet DOE § 19 points out a serious error made by
Galton), and a son of Karl Pearson, who was always before Fisher’s eyes as an
inspiration and a challenge (RW § 5; SI 2-4, 141; 1933b 893-4; see also § 2.1
and § 2.6 and Eisenhart 1974), so that Freud too might have called Pearson a
father to Fisher.
Fisher always refers to "Student," William Sealy Gosset, with respect (1915 507-8; 1922a 315; 1922c 608; 1923a 655; 1924b 807-8; 1936a 252; 1938 16; 1939d; RW § 5; DOE § 33; SI 4, 80-1; he disagrees politely in 1929f, less politely in 1936c), and their mutual admiration and enduring friendship is reflected in the collection of letters from "Student" to Fisher (Gosset 1962), which has the benefit of some annotation by Fisher.
Some of Fisher's important ideas about likelihood are anticipated in a long and obscure paper by Edgeworth (1908-9 esp. 506-7, 662, and most especially 82-5). When this was publicly pointed out to Fisher (Bowley 1935) he replied that there was nothing of value in Edgeworth's paper that was not in still older papers that Fisher was glad to acknowledge (1935b 77). Fisher seems to me to have underestimated the pertinent elements of Edgeworth's paper. I doubt that

7 Nevertheless, the difficulty of documenting this assertion indicates that only a tiny fraction of Fisher's work is mathematical research for the fun of it.
Fisher ever read it all closely, either before or after the connection was pointed out, first because it is human to turn away from long and difficult papers [apparently] based on what one takes to be ridiculous premises (in this case, Bayes' rule and inverse probability), then later perhaps because it is hard to seek diligently for the unwelcome. Rao (1961 209-11) stresses that Fisher's contributions to the idea of maximum-likelihood estimation go far beyond those of all of his predecessors.⁸
In science, it is hostility rather than familiarity that breeds contempt, and all of Fisher's castigation of the Neyman-Pearson school (1934a 296; 1935c 393; 1935f; 1939a 173a, 180; 1945b 130; 1955; 1960; SI) shows that he never had sufficient respect for the work of that school to read it attentively, as will be brought out frequently in this essay. And members of that school in referring to Fisher were likely to read their own ideas impatiently into his lines. This too will be documented by implication during this essay. An interesting study on the breakdown in communication between the two sides might be based merely on the discussion following (Neyman 1935); and it might well begin with careful pursuit of the last complete paragraph on page 172 of that discussion. (See also references in § 2.3 and § 2.5.)
2.5. Style. Fisher's writing has a characteristic flavor, at once polished and awkward. It is not pleasant to my taste but is fascinating and can be inspiring. He has a tendency to be aphoristic and cryptic.⁹ Sometimes things are plainly said, when you go back and check, but in such a way as to go unperceived or even undeciphered when first seen.

Mathematics is ruthlessly omitted from Fisher's didactic works, Statistical Methods for Research Workers and The Design of Experiments. In modern mathematical education there is great repugnance to transmitting a mathematical fact without its demonstration. The disciplinary value of this practice is clear, but, especially in the mathematical education of nonmathematicians, it can be abused. Many a high school boy knows more biology, chemistry, and physics than a dozen men could demonstrate in a lifetime. Is it not then appropriate for him also to know more mathematics than he himself can demonstrate? Giving perhaps too affirmative a response (RW x, § 4), Fisher freely pours out mathematical facts in his didactic works without even a bow in the direction of demonstration. I have encountered relatively unmathematical scholars of intelligence and perseverance who are able to learn much from these books, but for most people, time out for some mathematical demonstrations seems indispensable to mastery (Hotelling 1951 45-6).
8 Scraps of other drafts of this paragraph were nearby in Savage's manuscript. My contribution following this paper further describes and assesses Edgeworth's paper.

9 Examples culled from Savage's notes include 1915; 1920 761; 1921b 119, 122; 1922c 598, 600; 1923b § 2; 1924a; 1924b 807, 810; 1928b; 1930b; 1934a 297; 1935b 42, 47; RW § 57; see also van Dantzig (1957), Hotelling (1951 38, 45-6), and for his own view and excuses, Fisher (1922b 86a; 1926a 511; see also § 2.3) and Mahalanobis (1938 265-6).
10 This attitude has not been found expressed in Fisher's writing, and Finney wrote, "I thought he always had considerable affection for Snedecor," yet Savage's statement seems from some comments received to reflect an oral tradition. Though Fisher tabulated F (ST) and used it in exposition (DOE § 23), he personally preferred z because its distribution varies more regularly and is more nearly normal, facilitating interpolation and tabulation (RW § 41, DOE § 23, ST 2). According to Eisenhart, he also considered its scale better for expressing departure from the null hypothesis, and its near normality helpful in combining independent analyses. His incidental remark (1924b 808) that z = log(s₁/s₂) has mode log(σ₁/σ₂) piqued Savage.
Fisher is the undisputed creator (Cochran 1976; Yates and Mather 1963 107-113; see also Hotelling 1951 42-3; Mahalanobis 1938 271; Neyman 1951 407; 1961 146-7; 1967 1458-9) of the modern field that statisticians call the design of experiments, both in the broad sense of keeping statistical considerations in mind in the planning of experiments and in the narrow sense of exploiting combinatorial patterns in the layout of experiments. [His book Design of Experiments is full of wonderful ideas, many already clearly presented or present in (1926a).] I shall mention quite a few of these, discussing one or two a little, but am in danger of leaving out several of your favorite ones by oversight. He preached effectively against the maxim of varying one factor at a time (1926a 511-2; DOE Ch. 6 esp. §§ 37, 39). He taught how to make many comparisons while basing each on relatively homogeneous material by means of partial replication¹¹ (1926a 513; 1929d 209-12; DOE Ch. 7-8). He taught what should be obvious but always demands a second thought from me: if an experiment is laid out to diminish the variance of comparisons, as by using matched pairs (which can be very useful), or by adopting a Knut Vik square (which presumably cannot be made very useful), the variance eliminated from the comparisons shows up in the estimate of this variance (unless care is taken to eliminate it) so that as actual precision is gained perceived precision can be lost (1926a 506-7; 1939d 7; DOE §§ 27, 33, 34). Randomized design, and perhaps even the notion of a rigorously random sample, seems to originate with Fisher (1925a 700-1; 1926a; RW § 48; DOE §§ 5, 6, 9, 10, 20 (which seems to claim priority), 22, 31; Cochran 1975; Neyman 1951), though this technique is so fundamental to modern statistics that to credit Fisher with it sounds like attributing the introduction of the wheel to Mr. So-and-So. Some combinatorial designs are so natural as to be unavoidable and still others, illustrated by the Knut Vik square, were familiar in agronomy when Fisher began work in the field, but he inaugurated the systematic study of combinatorial designs (1934c; DOE § 35), and introduced the main sophisticated categories of them (1926a 513; 1929d; DOE Ch. 7-8). The analysis of variance and the analysis of covariance are his terms and, admitting that everything has antecedents, presumably his inventions (1918 134, 424, 433; 1921b 110-11, 119-22; 1922c 600; 1923c 315-9; 1924b 810-13; RW Ch. 7-8; DOE, see Index). Along with the analysis of variance goes the F-test, or z-test, as Fisher would prefer.
The design of experiments suggests an interesting digression illustrating how two great statisticians may move in entirely different worlds. Wald mentioned in his Statistical Decision Functions (1950) that since choosing what experiment to do is a decision, the theory of design is a topic under the general theory of

11 Savage may have chosen these words to avoid more specific terms and keep the meaning general. As usual, the references are no clue as he did not supply them. "Confounding" or "partial confounding" would also make sense, especially if the previous sentence is taken to cover fractional factorial design. "Fractional replication" would make less sense and was invented by Finney, not Fisher.
that book, and this remark of Wald's was perhaps too ostentatiously repeated in publicity for the book, such as the jacket. To Fisher (1955 70), this claim was incomprehensible because Wald's book does not "discuss elements of design such as replication, control, and randomization."
Fisher must have been the first to have that very broad vision of regression, or the linear model, which is one of the most fertile insights of modern statistics (1921b; 1922c; RW §§ 25-9). In his day, the day of the desk calculator, it was natural to emphasize shortcuts in the calculations associated with regression¹² (RW §§ 26-9), so it is natural that Fisher does not greatly emphasize the study of residuals. Yet, he does sometimes study residuals, and I imagine that he is an originator here too (1921b 122-6; 1922a 322; 1924c 108-10; RW § 28.1 par. 1).
Fisher with Tippett (1928c) opened the field of the asymptotic distribution of extreme values (Gumbel 1958 3). Watson and Galton (1874) are commonly considered the fathers of the study of branching processes, but it was Fisher who brought generating functions to bear on this topic and thereby put it on the mathematical map¹³ (1930c).
Fisher invented and vigorously pursued k-statistics, unbiased estimators of cumulants (1929a; 1930a; 1931a; 1937a). This seems strange for a man who had no use for unbiasedness as a criterion in estimation (1915 520; 1935b 42; SI 140), but I would not hasten to preclude that he had a reason perfectly consistent with his philosophy.¹⁴ [Fisher helped work out the maximum likelihood analysis of the probit model (1935d; ST) along with Bliss (1935; 1938) who originated the name (1934). (The model itself is old (Fechner 1860) and was first used in biological assay by Gaddum (1933) and Bliss (1934; 1935); see Finney (1952 § 14).)]
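The first two nontrivial k-statistics are easy to exhibit concretely: writing m_r for the r-th sample central moment (denominator n), k₂ = n·m₂/(n−1) and k₃ = n²·m₃/((n−1)(n−2)) have expectations equal to the cumulants κ₂ and κ₃. The following sketch is a modern illustration of those standard formulas, not anything in Savage's text:

```python
# Sketch (not from Savage's text): the first two nontrivial k-statistics,
# Fisher's unbiased estimators of the cumulants kappa_2 and kappa_3.

def central_moment(xs, r):
    """r-th sample central moment m_r = mean((x - xbar)**r)."""
    n = len(xs)
    xbar = sum(xs) / n
    return sum((x - xbar) ** r for x in xs) / n

def k2(xs):
    """Unbiased estimator of the second cumulant (the variance)."""
    n = len(xs)
    return n * central_moment(xs, 2) / (n - 1)

def k3(xs):
    """Unbiased estimator of the third cumulant."""
    n = len(xs)
    return n ** 2 * central_moment(xs, 3) / ((n - 1) * (n - 2))

data = [1.0, 2.0, 3.0, 4.0]
print(k2(data))   # the usual n-1 sample variance, 5/3
print(k3(data))   # vanishes for symmetric data
```

Note that both estimators use the sample size n as well as the empirical moments, a point Savage returns to in discussing Fisher consistency.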
From two early controversies, Fisher has emerged completely victorious. There used to be some confusion, and I infer, outright disagreement (described by Yates and Mather 1963 101), about how to count the degrees of freedom in a contingency table. [Fisher's view (1922b; 1923b; 1924a; RW §§ 5, 20) has prevailed over Karl Pearson's.] Likewise, it was difficult to convince Karl Pearson (1900; 1936) and presumably others, that moments might be inefficient

12 This also explains Fisher's attention to grouping (1922a 317-21, 359-63; 1937b 306-14; RW see Index at "grouping" and "Sheppard's adjustment"; DOE end of § 21).
13 One of Savage's 3 × 5 cards indicates that he intended to check this, and he would surely have changed it if he had received, as I did, a letter from D. G. Kendall including the following information. The basic problem and all three cases of the criticality theorem were stated by Bienaymé (1845) in a paper rediscovered by Heyde and Seneta (1972). Watson did use generating functions, but made an error in the supercritical case. In genetics, Haldane (1927) has a careful, accurate statement of all three cases. For further history, see D. G. Kendall (1966, 1975).

14 The reason he usually gives is that using population cumulants and their unbiased estimators greatly simplifies the equations which connect the moment functions of the sampling distributions of moment statistics with the moment functions of the population (1929a 198a, 203, 204, 237; 1930a 15a; 1937a 4-5).
15 Savage left this incomplete and may have changed his mind. Fisher (1925a) comes tantalizingly close, but Savage's notes on § 7 thereof say "Cramér-Rao would be a big help but not available." His 3 × 5 cards on Cramér-Rao agree with this and also say, "I was confusing with Koopman-Pitman," and "Late and little" with a reference to SI 145 ff. Perhaps he merely intended to mention the "rediscovery of a faint version" he mentions in par. 4 from the end of § 3 below. Another card says that Fisher (1934a) "scooped" Pitman-Koopman (on families of distributions admitting sufficient statistics). It is conceivable that Savage intended to refer to this instead, or even to use it to relate Fisher to the Cramér-Rao bound by way of the fact that the Pitman-Koopman families are the only regular ones achieving it. My contribution following this paper sketches another relationship.
1925a 701; RW § 11; see also CMS Index and 1922a esp. 309-10, 316-7, 329-31; RW §§ 1-3, 53-6). In its current meaning, an arbitrary function of the data, not necessarily real-valued, the concept is extremely familiar today and recognized to be invaluable. The term for it is not necessarily the best imaginable, though it by now seems ineradicable.
Estimates are of central importance to Fisher, but I doubt that he attempted any precise definition of the concept. Perhaps we can safely say that an estimate is a statistic, especially a real-valued one, intended to estimate something. Sometimes in the writings of Fisher and other statisticians "estimate" is seen from the context to mean a sequence of estimates, one associated with each sample size (1922a; 1925a; 1935b; RW §§ 3, 53, 55, 56). When this is done, it is in a context in which the asymptotic behavior of the sequence of estimates has the stage. As Fisher came to feel, not only is this ellipsis excessive, but such asymptotic considerations lack a uniformity of control necessary for practical conclusions (SI 144-5). For example, if X̄ₙ is asymptotically virtuous, then the sequence of estimates that are identically zero for n < 10¹⁰ and equal to X̄ₙ elsewhere has the same asymptotic virtues as X̄ₙ and the same practical lack of virtue as using 0 for the estimate regardless of the data.
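Savage's example can be made concrete in a few lines. This is an illustrative sketch, not anything in the original; the threshold 10**10 simply stands in for whatever enormous bound one likes:

```python
# Sketch of the pathological estimator in the text (illustration only).
# For any fixed threshold N, the estimator below inherits the asymptotic
# virtues of the sample mean -- for n >= N it *is* the sample mean --
# yet for every practically attainable sample size it ignores the data.

N = 10 ** 10  # an enormous bound, as in the text

def sample_mean(xs):
    return sum(xs) / len(xs)

def pathological(xs):
    """Identically zero for n < N, the sample mean thereafter."""
    return 0.0 if len(xs) < N else sample_mean(xs)

data = [5.0] * 1000          # a thousand observations, all equal to 5
print(sample_mean(data))     # 5.0
print(pathological(data))    # 0.0 -- asymptotically fine, practically useless
```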
By "estimation," Fisher normally means what is ordinarily called point estimation (see CMS or CP Index at "estimation"; RW § 2, Ch. 9; DOE § 66; SI Ch. 6). In particular, he does not refer to fiducial intervals as estimates (1935b 51). The term "point estimation" made Fisher nervous, because he associated it with estimation without regard for accuracy, which he regarded as ridiculous and seemed to believe that some people advocated (1935b 79; SI 141); this apprehension seems to me symptomatic of Fisher's isolation from other modern theoretical statisticians¹⁶ (§ 2.4).
The idea and terminology of a sufficient statistic or a set of sufficient statistics was introduced by Fisher in its current form (1920 768-9; 1922a 316-7, 331; 1925a 713-4; the latter two include factorization. See also Stigler 1973.). Whether a sufficient statistic deserved the term used to be controversial but Fisher has won hands down. I know of no disagreement that when an experiment admits a given statistic as sufficient then observation of that statistic is tantamount for all purposes to observation of all the data of the experiment.
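The sense in which nothing beyond a sufficient statistic matters can be checked by brute force in a toy case (a modern illustration, not in the text): for n Bernoulli trials the number of successes T is sufficient, and the conditional distribution of the full data given T = t is 1/C(n, t) whatever the success probability p:

```python
# Brute-force check (toy illustration, not from the text) that the success
# count is sufficient for Bernoulli trials: the conditional probability of
# the full data sequence given its sum does not depend on p.
from itertools import product
from math import comb

n = 4
for p in (0.3, 0.7):                      # two different parameter values
    for xs in product((0, 1), repeat=n):  # every possible data sequence
        t = sum(xs)
        prob_xs = p ** t * (1 - p) ** (n - t)              # P(data; p)
        prob_t = comb(n, t) * p ** t * (1 - p) ** (n - t)  # P(T = t; p)
        cond = prob_xs / prob_t                            # P(data | T = t)
        assert abs(cond - 1 / comb(n, t)) < 1e-12          # free of p
print("conditional law given T is 1/C(n, t), independent of p")
```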
Intimately associated with sufficient statistics is the concept of the likelihood
of an experiment depending on a parameter, possibly multidimensional. The
most fruitful, and for Fisher, the usual, definition of the likelihood associated
with an observation is the probability or density of the observation as a function
of the parameter, modulo a multiplicative constant; that is, the likelihood associated with an observation is the class of functions of the parameter proportional
16 Nevertheless, his objection (SI 140-1) to common criteria of point and interval estimation because they lack invariance under single-valued transformations is justified in the sense that such criteria will draw distinctions among estimates which are equally informative in his sense (see below) and common sense.
to the probability of the observation given the parameter (1922a 310, 326-7,
331; 1925a 707; though not a probability, it may measure relative degree of
rational belief: 1930b 532; 1932 259; 1934a 287; SI 66-73, 126-31; see also
§ 4.4). The likelihood of independent observations is the product of the likelihoods of each observation, and for this reason, it is often convenient to work
with the logarithm of the likelihood (SI 71, 148).
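That multiplicativity, and the resulting additivity of the log-likelihood, can be shown in a line or two of code (a sketch under a normal model with unit variance; nothing here is from the text):

```python
# Sketch (not from the text): for independent observations the likelihood
# multiplies, so the log-likelihood adds.  Normal model, sigma = 1.
from math import log, pi, exp, prod

def log_density(x, mu):
    """log of the N(mu, 1) density at x."""
    return -0.5 * log(2 * pi) - 0.5 * (x - mu) ** 2

data = [1.2, 0.7, 1.9]

def loglik(mu):
    return sum(log_density(x, mu) for x in data)

# additivity: sum of logs equals log of the product of densities
assert abs(loglik(1.0) - log(prod(exp(log_density(x, 1.0)) for x in data))) < 1e-9

# the sample mean maximizes this normal log-likelihood
mu_hat = sum(data) / len(data)
assert loglik(mu_hat) >= loglik(0.0) and loglik(mu_hat) >= loglik(2.0)
```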
The likelihood is a minimal sufficient statistic. That is, the likelihood of the data, regarded as a random object (in this case, a random function on parameter space), is sufficient for the data, and the likelihood can be recovered from any sufficient statistic. Fisher seems to have been the discoverer of this important fact, and he was very appreciative of it (1925a 699b; 1934a 287, 289, 294, 306; SI 49-50, 151).
‘Usually consistent estimation is defined’ to mean a sequence of estimates, one
for each sample size, that converges in probability to the parameter beingesti-
mated. ‘Fisher gave a different definition, now usually called Fisher consistency.
(1922a 309, 316; 1924a 444; SI 142, 144. Anentry to current literature is Norden
1972-3.) He tended for some time to treat the usual definition as interchange-
able with his (1924a 444; 1925a 702-3; 1935b 41-2; RW 88 3, 53) but ultimately
rejected it (SI 144).’ A Fisher-consistent estimate is mathematically a functional
defined on distributions that coincides with the parameter to be estimated on the
family of distributions governed by the parameter. Employed as an estimate,
this functional is applied to the empirical distribution of a sample of n independ-
ent drawings from an unknown distribution of the family. Though motivation
can be seen for this definition, it has drawbacks. Many functions commonly
thought to be excellent estimates are not consistent under this definition because
they are not really functions of the empirical distribution. Certain favorite
estimates of Fisher’s such as the k-statistics (for k > 1) of which the ordinary
estimate of the variance is the most important are not, strictly speaking, func-
tions of the empirical distribution but of the empirical distribution and the sample
size. Fisher does not seem to have mentioned this and would undoubtedly regard
it as mathematical nitpicking. On the other hand, I suspect that Fisher would
have seen it as an advantage of this definition that it rendered “inconsistent”
certain examples (J. L. Hodges, Jr., see Le Cam 1953 280) that had been invented
17 Conceivably there are other ways of making Fisher's definition precise, but to improve on
this one would be hard. Von Mises (1964 Ch. 12 and references therein) used the term “statis-
tical functions” and investigated their continuity, differentiability, laws of large numbers, and
asymptotic distributions. Savage planned to “say somewhere why Fisher consistency tends to
promise ordinary consistency.” The reason is, of course, that the empirical distribution con-
verges (in probabilistic senses) to the true distribution, and hence, if a functional is appropriately
smooth, its empirical value will converge to its true value.
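The closing remark of footnote 17 can be sketched computationally: a smooth functional of the empirical distribution, here the mean functional on the N(θ, 1) family, is Fisher consistent and, because the empirical distribution converges to the true one, is ordinarily consistent as well. The parameter value and sample sizes below are arbitrary:

```python
import random

def T(sample):
    """The mean functional applied to the empirical distribution.

    On the N(theta, 1) family, T coincides with the parameter theta,
    so T is Fisher consistent for theta."""
    return sum(sample) / len(sample)

random.seed(0)
theta = 2.5
for n in (10, 1000, 100000):
    est = T([random.gauss(theta, 1) for _ in range(n)])
    print(n, est)
# As n grows, the empirical distribution converges to the true one, and
# the smooth functional T turns Fisher consistency into ordinary
# consistency (convergence in probability to theta).
```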
18 Here in the manuscript Savage had a note to himself saying, “Quote him about the Nile at
least.” Fisher's “problem of the Nile” (1936a 258, 244a; see also CP Index; SI 119, 163) is
equivalent to a previously formulated and partially solved problem of similar regions according
to Neyman (1961 148 footnote).
4. Points of controversy.
55; DOE § 60; see also 1922a 316; 1925a 704; 1938 17) is that statistics losing
a fraction of the information lose that fraction of the work done to gather the
data. This seems basically correct to me, and it is not so intimately bound up
with variance as the measure of the inaccuracy of an estimate as might be thought
from my description so far.
From my own point of view, the Fisher information is typically the reciprocal
of the variance of a normal distribution which is a good approximation, in large
experiments, of the posterior distribution of the parameter (under almost any
prior). This asymptotic variance is an appropriate index of the width of the
posterior distribution for almost any practical loss function.
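For the simplest case, X ~ N(θ, σ²) with σ known, this relation is exact rather than merely asymptotic, and can be sketched as follows (the particular θ, σ, and n are arbitrary):

```python
import math
import random

# For X ~ N(theta, sigma^2), the per-observation Fisher information is
# 1/sigma^2, so the total information in n observations is n/sigma^2.
# Under a flat prior, the posterior of theta is exactly N(xbar, sigma^2/n):
# its variance is the reciprocal of the total Fisher information.
random.seed(1)
theta_true, sigma, n = 0.7, 2.0, 400
data = [random.gauss(theta_true, sigma) for _ in range(n)]
xbar = sum(data) / n

total_information = n / sigma**2          # = 100
posterior_sd = math.sqrt(1 / total_information)   # = sigma / sqrt(n) = 0.1
print(xbar, posterior_sd)
```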
But Fisher insisted that to lose information was tantamount to losing a cor-
responding number of observations even in small samples (1922a 338-9, 350-1;
1925a 669a, 709, 712, 714-22; 1934a 300; SI 152; see also below). At first, he
seemed to expect this to speak for itself, but it met with doubt and even derision
(Bowley 1935), so Fisher eventually developed what he called a two-story argu-
ment to justify his nomenclature and idea. If a large number of small experi-
ments were done and the data from each replaced by some statistic of smaller
information than the original experiment, then the many small experiments
taken together would constitute a large experiment with n times the information
of a component experiment and the n statistics taken together would constitute
a large experiment with a fraction, say α, of that information. This would
indeed represent a waste of (1 − α) × n of the n small experiments (1935b 41,
46-7; SI 157). As an argument for saying that an estimate in a given small
experiment wastes the fraction (1 − α) of the total information in that experi-
ment, I myself regard this more as a ploy than as a real move. (See also SI 159.)
It does give a practical counsel in case one is determined to summarize each of
a large number of small experiments in terms of one or few numbers, but this
is not likely to be an applicable model to somebody who proposes to use the
median, rather than the mean, of a sample of size 5 (of presumably guaranteed
normality). But the argument does at least deserve to be known, and I, for
one, found it a surprise on rereading Fisher.
4.2. Properties of maximum likelihood estimation. Are maximum likelihood
estimates typically Fisher consistent? Fisher said they were (1935b 45-6; SI
148; see also 1922a 328 and references below on efficiency, which implicitly
requires consistency) and with considerable justification. Consider first, as
Fisher did, a multinomial experiment, that is a number of independent trials
each of which ranges over the same finite set of possible outcomes with the same
distribution depending on the unknown, possibly multivariate, parameter θ. As
Fisher emphasized, there is in principle no loss in generality in confining all
discussions of statistics to experiments with finite numbers of outcomes and,
22 I haven't found this point emphasized in Fisher's writing. Sometimes he mentions it (SI 50
and perhaps 143-4) or assumes without comment that considering a finite number of classes is
sufficiently general (1925a 700-1, 718; 1935b 45; SI 142, 145). Eisenhart reports that in a 1951
conversation, Fisher said that he made the point in 1922a, but all Eisenhart found was the second
sentence of § 12. The only reference I found in Savage's notes is 1924a; all I see there is that classes
are used, but they are required for chi square (and similarly in 1928b).
distributions (possibly taking on the value −∞) which would make Fisher
consistency of the maximum likelihood estimate true in very great generality.
Whether there is real use, or only a certain satisfaction of the sense of magic,
in knowing that maximum likelihood estimation can be said to be Fisher con-
sistent, I cannot say.
Fisher of course expected maximum likelihood estimates to be consistent in
the sense of convergence in probability also (see references on consistency in
§ 3 above). Certain kinds of exceptions he would have regarded as mathemat-
ical caviling. Indeed, this might be the case for any exceptions thus far known,
conceivably even Bahadur's (1958). Knowing Fisher, I am not surprised at
my inability to find discussions of counterexamples, nor would I be surprised if
some discussion were turned up somewhere. A mathematically satisfactory
account of consistency in probability of maximum likelihood estimates has had
a painful evolution and may not yet be complete. (See for example Wald 1949,
Perlman 1972. Norden 1972-3 surveys various properties of maximum likeli-
hood estimates, with a few idiosyncratic, neo-Fisherian touches.)
In smooth and civilized repeated trials, and many other kinds of large experi-
ments, maximum likelihood estimation is not only consistent but efficient, that
is, the distribution of the maximum likelihood estimate is approximately normal
around θ with the variance of the approximating distribution being the reciprocal
of the Fisher information. (This does not mean that the variance of the estimate
itself is that small or even finite (Savage 1954 242-3). But that is not the sort
of distinction that I would expect Fisher to make or even countenance.) The
tendency of maximum likelihood estimates to be efficient was appreciated by
Edgeworth (1908-9) and later by Fisher (1922a 331-2, 367; 1922c 598; 1924a
445; 1925a 707, 710-11; 1932 260; 1935b 44-6; RW §§ 3, 46, 55, 58; SI 148).
Neither succeeded in demonstrating the phenomenon with much generality from
a modern mathematical point of view, though Fisher went inestimably further
than Edgeworth. (See also § 2.4 and the end of § 3.)
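Efficiency in this sense can be checked by simulation in the simplest case. The sketch below (Bernoulli trials with arbitrary p and n) compares the sampling variance of the maximum likelihood estimate with the reciprocal of the Fisher information:

```python
import random

# For Bernoulli trials, the MLE p_hat is the sample mean, and the Fisher
# information per trial is I(p) = 1/(p(1-p)); efficiency says the MLE's
# sampling variance should approach 1/(n I(p)) = p(1-p)/n.
random.seed(2)
p, n, reps = 0.3, 100, 10000
estimates = [sum(random.random() < p for _ in range(n)) / n
             for _ in range(reps)]
mean_hat = sum(estimates) / reps
var_hat = sum((e - mean_hat)**2 for e in estimates) / reps
print(var_hat, p * (1 - p) / n)   # both close to 0.0021
```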
Fisher asserted/conjectured that the maximum likelihood estimate alone among
Fisher-consistent estimates has any chance of being a sufficient statistic (1922a
331; 1925a 714; 1932 259), and at first that it is always sufficient (1922a 323,
330, 367), later that it is sufficient whenever there exists a sufficient estimate
(1922a 308a; 1935b 53, 82) or statistic (1922a 331; 1925a 714, 718; 1932 259;
RW § 3; SI 151; he may mean a Fisher-consistent estimate in every case). For
me, it is not the business of an estimate to be sufficient, so I regard the question
as somewhat distorted. Academic though the situation is, I have sought, and
offer below, a counterexample.
Fisher also conjectured that no other Fisher-consistent estimate (or perhaps
even more general kind of estimate) loses so little information as the maximum
likelihood estimate (1925a 699a, 720-1, 723; 1932 260; 1935b 53; 1936a 249,
250-1, 256; SI 157). This point too is academic but curiosity-provoking. The
conjecture is false as stated, though there may be some way to reconstrue it that
g(1)/(1 + θ) − g(−1)/(1 − θ)
for −1 < θ < 1, where g(1) is the total frequency of 1's and g(−1) of −1's.
(Thus g(k) = 2f(k, k) + f(k, −k) + f(0, k) + f(−k, k) for k = −1, 1.) The
maximum likelihood estimate of θ is therefore
θ̂(f) = (g(1) − g(−1))/(g(1) + g(−1)),
which is defined for all possible f with Σ_{i,j} f(i, j) = n and n > 0, and it lies in
the range of θ, [−1, 1]. (The sole purpose of the j-coin is to keep the denomi-
nator of θ̂(f) positive.)
The maximum likelihood estimate θ̂ is not sufficient (for any n), because θ̂
can be +1 without determining both g(1) and g(−1), which constitute a
minimal sufficient statistic. Therefore θ̂ loses some Fisher information.
For each probability distribution P defined on the domain of f, namely i =
−1, 0, 1 and j = −1, 1, let
S(P) = E(z^{i+2j} | P)
= Σ_{i,j} P(i, j) z^{i+2j},
where z is a transcendental number larger than 1. Then S(f/n) is a sufficient
statistic for θ, since f/n can be reconstructed from S(f/n) because of the tran-
scendentality of z.
Let
Q(θ) = S(P_θ)
= E(z^{i+2j} | θ)
= E(z^i | θ)E(z^{2j} | θ)
= (Σ_i π_i z^i + θ Σ_i iπ_i z^i)(Σ_j π_j z^{2j} + θ Σ_j jπ_j z^{2j}).
On [−1, 1], Q is the product of two positive and strictly increasing functions
of θ, so the inverse function Q⁻¹ is well defined there.
23 A page of Savage's text is omitted here. It begins, “According to Fisher, the maximum
likelihood estimate is the only Fisher-consistent estimate that is determined by a linear equation
in the frequencies (19 ; ). There seems to be no truth in this at all···.” Fisher states and
shows this for estimates which are efficient, not merely consistent; that is, in current termi-
nology, he shows that the only efficient M-estimator is the maximum likelihood estimator
(1928b 97-8; 1935b 45-6; SI 148. Edgeworth did much the same: see my contribution following
this paper.). The nearest I have found to Savage's version is one sentence (SI 157) where Fisher
doesn't clearly impose either restriction and has just mentioned consistency but is concerned
with distinguishing among “the different possible Efficient estimates” (SI 156). In view of this
context and the earlier references, I think Savage's version is a misreading of Fisher which would
have been caught before publication.
24 “Actual” has been inserted, at the risk of misrepresenting frequentists, because in early
papers Fisher defines probability as a proportion in an “infinite hypothetical population” of
what seem to be repeated trials under the original conditions, where “the word infinite is to be
taken in its proper mathematical sense as denoting the limiting conditions approached by in-
creasing a finite number indefinitely.” (1925a 700; see also 1922a 312.) Later he says, “An im-
agined process of sampling ··· may be used to illustrate ···. Rather unsatisfactory attempts have
been made to define the probability by reference to the supposed limit of such a random sampling
process ···. The clarity of the subject has suffered from attempts to conceive of the ‘limit’ of
some physical process to be repeated indefinitely in time, instead of the ordinary mathematical
limit of an expression of which some element is to be made increasingly great.” (SI 110.)
1939a 175; 1955 77; SI 2, 37, 106-10). For example, it might be expressed in
terms of the probabilities of events, as is surely appropriate in those cases where
there is an undisputed prior distribution (1922a 324; 1930b 530; 1932 257-8;
1934a 286; 1957 205; 1958a 272; SI 11, 17, 35, 111). This probability might
be fiducial probability. I shall try to say more about fiducial probability later
(§ 4.6). For the moment, I am content to explain that Fisher at first tried to
introduce a different kind of probability applicable in some cases in which
ordinary probability was not (1930b 532-5), but later came to hold that these
probabilities were ordinary probabilities, serving the purpose of posterior prob-
abilities in a Bayesian calculation, though arrived at by extra-Bayesian means
(SI 51, 56; 1960 5; see also § 4.6).
But the conclusion of a statistical inference might be something other than a
probability (1930b 532; 1934a 284a; 1955 76-7; 1960 4; SI 35, 131-6, Ch. 3.).
For example, it might be a likelihood (1912 160; 1921a 24; 1922a 326-7; 1925a
707; 1930b 532; 1932; 1935b 40-1, 53, 82; SI 66-73, 126-31; see also §§ 3 and
4.8). Because likelihoods are intimately associated with probabilities, it has
been suggested that the whole concept is superfluous (van Dantzig 1957 190).
Yet, a likelihood function of a parameter, which might rightly be called a set
of likelihood ratios, is evidently not a probability distribution for the parameter.
Thus we can see why one who, like Fisher, believes that a likelihood function
constitutes a statistical inference, would see here an example of a statistical
inference that is not expressed in terms of probabilities, more exactly, in terms
of a probability distribution of the unknown parameters.
Fisher often refers to exact tests (see § 4.7), so tests would seem to be for him
a form of exact non-Bayesian inference issuing in tail areas which are neither
likelihoods nor parameter distributions.
If nothing else can be said about induction, there will be general agreement
that induction differs from deduction in this. Anything that can be deduced
from part of the information at hand can be deduced from all of it, but in
induction account must be taken of all of the data. Fisher is very fond of this
point (1935b 54; 1935c 392-3; 1936a 254-5; 1937c 370; 1945b 129; 1955 75,
76; 1958a 268, 272; 1960 4, 10; SI 55, 109) though he lapses a bit on at least
one occasion.
Fisher seems to think the ignoring of pertinent information an essential feature
of Neyman–Pearson statistics (1935c 393; 1955 76; SI 101; 1960 4, 7; see also
below). There is at least one rather clear case in point. It has been suggested
by Bartlett and followed up by Scheffé that to test whether two sets of n numbers
have the same mean, though possibly different variances, the elements of the
25 Anywhere that Fisher countenances the use of less than fully informative or efficient statis-
tics could be considered an example, but presumably Savage had something more specific in
mind. Unfortunately, the only possible reference I found in his notes (1939a 175) doesn't seem
to be it. Eisenhart thinks Savage is probably referring to Fisher's use of order statistics as fidu-
cial limits for population percentiles (§ 2.6 above.).
two sets might be randomly paired and then the n differences be subjected to
a t-test. (Bartlett never advocated this test in practice, and Scheffé, if he did,
does not now. See Bartlett 1965 and Scheffé 1970 for later views and earlier
references.) “What,” Fisher once asked me orally, “would the proponent of
such a test say if it turned out to be significant at the 99% point, but if his
assistant later discovered that hardly any pairing other than the one accidentally
chosen resulted in so extreme a significance level?” (See also 1937c 375; RW
§ 24.1 Ex. 21; SI 96-9.) Choosing one among the many possible pairings at
random and ignoring the results of those not examined but available for exami-
nation does constitute a sort of exclusion of pertinent evidence. However, there
seems to me to be a very similar fault in all those applications of randomization
that Fisher so vigorously advocated. Whenever we choose a design or a sample
at random, we ordinarily are able to see what design or what sample we have
chosen, and it is not fully appropriate to analyze the data as though we lacked
this information, though Fisher in effect recommends that.
It should in fairness be mentioned that, when randomization leads to a bad-
looking experiment or sample, Fisher said that the experimenter should, with
discretion and judgment, put the sample aside and draw another. He speculated,
for example, that a more complicated theory might make it possible to choose
Latin squares at random from among the acceptable Latin squares. A few refer-
ences harmonious with this point of view are (Grundy and Healy 1950; Youden
1956-72; for further discussion and references, see Savage 1962 33-4, 88-9).
26 I first thought Savage intended to refer to Fisher here, but I have found nothing, and Yates
and Mather (1963 112) say that Fisher never faced up to the problem. The nearest hint I have
found in Savage's notes is the comment “Chooses a Square at random but not quite,” refer-
ring to (1926a 510) where he has marked the following passage: “Consequently, the term Latin
Square should only be applied to a process of randomization by which one is selected at random
out of the total number of Latin Squares possible; or at least, to specify the agricultural require-
ment more strictly, out of a number of Latin Squares in the aggregate, of which every pair of
plots, not in the same row or column, belongs equally frequently to the same treatment.” The
context gives no suggestion that what Fisher has in mind here is bad randomizations, and re-
stricting randomization to one standard square or one transformation set seems more likely to
me. My impression of his writing generally is of a hard-line view. Indeed, in the same paper,
the last paragraph of the previous section and the last sentence of the following page (both also
noted by Savage), distinctly suggest this, though conceivably Fisher's only concern in these
passages is his frequent one of systematic arrangements (see § 2.6). In a 1952 conversation, how-
ever, when Savage asked Fisher what he would do if he happened to draw a Knut Vik square
at random, Fisher “said he thought he would draw again and that, ideally, a theory explicitly
excluding regular squares should be developed” (Savage 1962 88). Perhaps Fisher took a softer
line privately than he felt appropriate for public exposition.
27 This relates to the ambiguity of “does not depend” in the first (less restrictive) definition of
“ancillary” four paragraphs back. Specifically, we are interested in the parameters of the regres-
sion of x2 on x1, and in the residual variance, and if we choose μ1 and σ1 as the nuisance parame-
ters, then the distribution of x1 does not depend on the parameters of interest, i.e., it depends
only on the nuisance parameters. This is not the case, however, if we choose μ1 and ρ as the
nuisance parameters. The former allows x̄1, s1 to be ancillary, the latter r (perhaps—see below).
The meaning of “does not depend on certain variables (or parameters)” depends on how the
remaining variables (parameters) are chosen. A Bayesian definition would include the condition
that the distribution of ancillary statistics depends only on nuisance parameters which are also
judged to be a priori independent of the parameters of interest. This helps prevent a contradic-
tory multiplicity of ancillary statistics. Any definition should also require, I believe, that the
distribution of the observations given the ancillary statistics depend only on the parameters of
interest, i.e., that the ancillary statistics be sufficient for the nuisance parameters when the pa-
rameters of interest are known (Kendall and Stuart 1961 217). This holds trivially for the more
restrictive definition and may be implicit in Savage's discussion. If it is required, then neither
x̄1 nor s1 nor r is individually ancillary, but x̄1 and s1 still are jointly. Though this requirement
obviously reduces multiplicity, it by no means resolves all problems of conditional inference.
As Savage (1962 20) says, “For a striking, if academic, example, suppose x and y are normal
about 0 with variance 1 and correlation ρ. Then x and y are each by themselves irrelevant to
ρ, and each is an ancillary statistic for the total observation (x, y) by any criterion known to
me.’’ See also Birnbaum (1962 esp. 279), Cox (1958 esp. 359-63), Dawid (1975), Hajek (1967 esp.
150-4).
importance (RW § 24.1 par. before Ex. 20; SI 93; see also 1935c 395; 1939a 180;
1941c 149), but it vividly illustrates a difference in conclusion between Fisher
and frequentists of the Neyman–Pearson school. The problem is to estimate the
difference between the means of two normal distributions of not necessarily equal
variance. The fiducial distribution of each mean is that of the sample mean plus
a t-like variable times the sample standard deviation and these two population
means are fiducially independent. Therefore, their difference is fiducially distri-
buted like the difference between the two sample means plus a linear combina-
tion of two independent t-like variables (references eight paragraphs above).
The fiducial intervals thus adduced are known not to be confidence intervals,
and they command no respect from adherents of the Neyman–Pearson school
(Bartlett 1965 § 3; Scheffé 1970 footnote 4). For Jeffreys, who accepts uniform
priors for the unknown means and for the logarithms of the variances, what
Fisher calls the fiducial distribution of the difference of the two means is simply
its posterior distribution. Indeed, Jeffreys claims to have preceded Fisher in
the discovery of this answer, and apparently with justice.
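The fiducial recipe just described is easy to simulate. The sketch below draws from the fiducial distribution of the difference of means; the summary statistics (the x̄'s, s's, and n's) are invented for illustration:

```python
import random

# Monte Carlo sketch of the Behrens fiducial distribution of mu2 - mu1:
#   (xbar2 - xbar1) + t2 * s2/sqrt(n2) - t1 * s1/sqrt(n1),
# with t1, t2 independent Student-t variables on n1-1 and n2-1 degrees
# of freedom. All summary statistics below are hypothetical.
def student_t(df):
    z = random.gauss(0, 1)
    chi2 = sum(random.gauss(0, 1)**2 for _ in range(df))
    return z / (chi2 / df) ** 0.5

random.seed(3)
xbar1, s1, n1 = 10.0, 2.0, 8
xbar2, s2, n2 = 12.5, 5.0, 12
draws = sorted((xbar2 - xbar1)
               + student_t(n2 - 1) * s2 / n2**0.5
               - student_t(n1 - 1) * s1 / n1**0.5
               for _ in range(20000))
lo, hi = draws[500], draws[19499]   # central 95% fiducial interval
print(round(lo, 2), round(hi, 2))
```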
4.7. Hypothesis testing. Hypothesis testing was extremely important to Fisher,
and his ideas about it do not coincide with those that are now most widely known
through the influence of the Neyman-—Pearson school. Let him speak for himself
28 This means (presumably) that there exist n1, n2, σ1/σ2, and α for which the coverage prob-
ability of the fiducial intervals is less than 1 − α. Fisher writes as if this couldn't happen (1939a
173a; 1960 8; SI 96). To demonstrate that it can, there is apparently only one published exam-
ple, one of three coverage probabilities calculated by E. L. Scott and given by Neyman (1941,
table near end of § 4; according to Scott, the headings should be corrected to read n = 7, n' = 13,
and p2). This example is, however, in contradiction with a table of Wang (1971 Table 5) for the
same α (two-tailed .05) and degrees of freedom (6 and 12), where the error rate is given for vari-
ance ratios 1/32 to 32 by factors of 2, is everywhere < .05, and varies far too smoothly to be
compatible with Scott's value (.066 at a variance ratio of 10. According to D. L. Wallace, who
drew my attention to her paper, Wang's values are within .0002 except at a variance ratio of 32,
where the correct error rate is .0491, not .0499.). Furthermore, calculations by Geoffrey Robinson
(1976) show one-tailed error rates less than α for α = .1, .05, .025, .01, .005, .001, .0005, .0001; n1, n2 =
2(1)8(2)14, 18, 24, 32, 50, 100, ∞; and σ1²n2/σ2²n1 or σ2²n1/σ1²n2 = 1, 1.5, 2, 3, 5, 10, 30, 100, 1000,
which he considers sufficient “to infer with reasonable certainty” that even the one-tailed pro-
cedure is conservative. Mehta and Srinivasan (1970) and Lee and Gurland (1975) found some
(one-tailed) error rates above α at variance ratios near 0 and ∞ for a second-order asymptotic
approximation to the fiducial procedure. Elsewhere their values are appreciably below α. The
fiducial procedure has error rate exactly α at ratios 0 and ∞. This suggests that adjusting their
values to apply to the exact fiducial procedure would give error rates everywhere below α in the
cases they calculate also.
29 Savage left space for references after “claims” and “justice,” but no such claim has been
found, and it is absent from two notable discussions of the problem by Jeffreys (1939 § 5.42;
1940). In the one-sample problem Jeffreys (1931 69; 1932) preceded and indeed instigated Fisher's
(1933a) paper.
Incidentally, D. J. Finney notes that Fisher himself insisted on referring to the Behrens distri-
bution and test and disliked “Behrens-Fisher,” and especially “Fisher-Behrens,” even asking
Finney to correct a misuse of his. Fisher's writing is consistent with this, and he states (SI 94)
that his (1935c) paper “confirmed and somewhat extended Behrens' theory.”
as is the possibility of testing any hypothesis other than a sharp one, that is, one
that postulates a specific value for a parameter or a function of parameters (but
see also below and SI 46, 89-92). Apparently there have been statisticians who
recommended actually picking a level before an experiment and then rejecting
or not according as that level was obtained. I do not have the impression that
any professional statisticians make that recommendation today, though it is still
often heard among those who are supposed to be served by statistics, but Fisher's
strong rejection of the notion is noteworthy (SI 43; but compare 1926a 504,
DOE §§ 7, 61). Though the importance, or even the existence, of power func-
tions is sometimes denied, Fisher says that some tests are “more sensitive” than
others, and I cannot help suspecting that that comes to very much the same thing
as thinking about the power function. (DOE §§ 8, 11, 12, 61; SI 21, 42, 47-8;
see also RW § 2 footnote, § 18 Ex. 5, § 24 Ex. 19; 1926a 504; 1934a 294-6.
Fisher argues that failure to reject the null hypothesis does not establish it and
hence is not an “error”: 1935e; 1955 73; see also DOE §§ 8, 61.)
The logic of “something unusual” is very puzzling, because of course in almost
any experiment, whatever happens will have astronomically small probability
under any hypothesis. If, for example, we flipped a coin 100 times to investigate
whether the coin is fair, all sequences have the extremely small probability of
2⁻¹⁰⁰ if the coin is fair, so something unusual is bound to happen. Once when
I asked Fisher about this point in a small group, he said, “Savage, you can see
the wool you are trying to pull over our eyes. What makes you think we can’t
see it too?” At any rate, the doctrine of “something unusual” does not work if
taken very literally, and this, of course, is why Fisher had recourse to tail areas,
grouping outcomes as more or less antagonistic to a given null hypothesis (DOE
§§ 7, 8; see also 1926a 504; 1936a 251-2; and references below).
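The contrast between the probability of an individual outcome and a tail area can be made concrete for the 100-flip example. The sketch below groups outcomes by head count, as a two-sided tail area under the fair-coin null:

```python
from math import comb

# Any particular sequence of 100 fair flips has probability 2**-100:
# "something unusual" always happens. A tail area instead groups outcomes
# by how antagonistic they are to the null, here by the head count.
n = 100
print(2.0**-100)   # ~7.9e-31, the same for every sequence

def two_sided_p(k, n):
    """P(|#heads - n/2| >= |k - n/2|) under a fair coin."""
    d = abs(k - n / 2)
    return sum(comb(n, j) for j in range(n + 1)
               if abs(j - n / 2) >= d) / 2**n

print(two_sided_p(50, n))   # 1.0: 50 heads is as ordinary as it gets
print(two_sided_p(65, n))   # ~0.004: grouped with still more extreme counts
```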
For Fisher, it was very important that tests be “exact” (DOE § 17 par. 2; 1936a
251, 252; 1939a 174; 1939d 2, 5-6; see also § 4.4). For this, it would be enough
that they be exact given appropriate ancillaries, as I have illustrated in the dis-
cussion of regression. Often, “exact” seems to mean having a given size in the
Neyman-Pearson sense (1960 8; DOE §§ 7, 8, 61; SI 37-9, 87, 96; see also ref-
erences below). This, however, does not serve to explain Fisher’s use of the
Behrens-Fisher distribution in testing whether two means are equal in the pres-
ence of possibly unequal variances (1935c 397; 1939a; 1945b; 1960 8; 1961a;
SI 94-6; he argues that “repeated sampling from the same population” is mis-
leading and other reference sets are appropriate, without fully explaining his
reference set in the Behrens-Fisher problem, even in 1961a, and in general
without fully reflecting the fact that the expectations of conditional levels are
unconditional levels: 1939a 173a–b; 1945b 130, 132; 1945c; 1955 70-2 but
compare 73; 1960 6-7; 1961a; SI 39-44, 75-103).
4.8. The likelihood principle. The likelihood principle is a doctrine that seems
in retrospect very appropriate to Fisher’s outlook, though it does not seem to
474 LEONARD J. SAVAGE
have been plainly stated by him until his last book (SI 70-1, 72-3, 136; see also
1932 259; 1934a 287; 1935b 40-1; 1936a 249). Indeed, the first formal state-
ment of the likelihood principle known to me is not by Fisher but by Barnard
(1947, 1949). The principle is still controversial, but I believe that it will come
to be generally accepted. That the likelihood is a minimal sufficient statistic is
an objective technical fact (see § 3). That such a statistic is as useful as the
whole data for any statistical purpose is never really denied. (A seeming denial
sometimes arises when critics point out that in practice specific statistical models
can never be wholly trusted so that a statistic sufficient on the hypothesis of a
given model is not sufficient under the wider hypothesis that that model may not
actually obtain.) Thus, no one doubts that the likelihood function together with
a statement of the distribution of this function for each value of the unknown
parameter would constitute all that is relevant about an experiment bearing on
the parameter. The likelihood principle goes farther, however: it says that the
likelihood function for the datum that happens to occur is alone an adequate
description of an experiment without any statement of the probability that this
or another likelihood function would arise under various values of the pa-
rameter.
In a certain passage (SI 72-3), Fisher seems pretty forthrightly to advocate
the likelihood principle. It could be argued that he means to apply it only to
“statistical evidence of types too weak to supply true probability statements”
(SI 70).
Fisher does sometimes depart from the likelihood principle. For example,
tail probabilities and hence significance tests do so. More disturbingly, a
Poisson process admits a fiducial inference if the number of arrivals is fixed (SI
52-4) but not if the total time is fixed, despite identical likelihoods. This and
another, similar example are given in (Anscombe 1957).
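The identical-likelihoods claim about the Poisson process can be verified directly: under either sampling rule the likelihood of the rate λ is proportional to λⁿe^(−λt), so the two likelihood functions differ only by a constant factor. The particular n and t below are arbitrary:

```python
import math

# Two sampling rules for a Poisson process with rate lam:
#   fixed time t: N ~ Poisson(lam*t), likelihood e^{-lam t}(lam t)^n / n!
#   fixed count n: total time T ~ Gamma(n, lam), likelihood
#                  lam^n t^{n-1} e^{-lam t} / (n-1)!
# Both are proportional to lam^n e^{-lam t} as functions of lam.
n, t = 7, 3.0

def lik_fixed_time(lam):
    return math.exp(-lam * t) * (lam * t)**n / math.factorial(n)

def lik_fixed_count(lam):
    return lam**n * t**(n - 1) * math.exp(-lam * t) / math.factorial(n - 1)

ratios = [lik_fixed_time(lam) / lik_fixed_count(lam)
          for lam in (0.5, 1, 2, 5)]
print(ratios)   # all equal to t/n: the ratio does not involve lam
```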
According to Fisher, when other devices, such as Bayes’ theorem and the
fiducial argument, are not available, the likelihood constitutes in itself an exact
statistical inference (see § 4.4). Late in his work (SI 71), Fisher suggests a sort
of test which is not a tail area test but consists simply in reporting the ratio of
the maximum of the likelihood under the null hypothesis to its maximum under
the alternate hypothesis.
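Such a report is simple to compute. The sketch below does it for a binomial model with the sharp null p = 1/2 against an unrestricted alternative; the counts are invented:

```python
from math import comb

# The ratio Fisher suggests reporting: maximum likelihood under the null
# divided by maximum likelihood under the alternative, with no tail area.
# Binomial example with H0: p = 1/2 against H1: p unrestricted.
def max_likelihood_ratio(k, n):
    null = comb(n, k) * 0.5**n                 # H0 is a point, so its max
    p_hat = k / n                              # MLE under H1
    alt = comb(n, k) * p_hat**k * (1 - p_hat)**(n - k)
    return null / alt

print(max_likelihood_ratio(50, 100))   # 1.0: the data fit H0 perfectly
print(max_likelihood_ratio(65, 100))   # small ratio: evidence against H0
```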
30 Barnard writes that Fisher's statement (RW § 2 last par., already in 1st ed.) is as formal.
Savage's “random note” on Fisher (1936a 249) says, “Full likelihood principle clear here if not
earlier,” but a 3 × 5 card questions this. In earlier works Savage says that it was “first put for-
ward,” and “emphasized to statisticians” by Barnard (1947) and Fisher (SI), and gives further
discussion (1962 17-20) and references (1954 2nd (1972) ed. iv), including one to Birnbaum (1962),
who certainly gives a formal statement.
31 Savage noted that the likelihood principle is “in effect denied” by Fisher (1922a 314 last
par. 2nd sentence).
32 Savage's manuscript unfortunately breaks off here. He intended to add a section on errors
and inconsistencies, but drafted no further text. What notes he left suggest that he intended to
REFERENCES
ANSCOMBE, F. J. (1957). Dependence of the fiducial argument on the sampling rule. Biometrika
44 464-469.
BAHADUR, R. R. (1958). Examples of inconsistency of maximum likelihood estimates. Sankhyā
20 207-210.
BARNARD, G. A. (1947). Review of Sequential Analysis by Abraham Wald. J. Amer. Statist.
Assoc. 42 658-664.
BARNARD, G. A. (1949). Statistical inference (with discussion). J. Roy. Statist. Soc. Ser. B 11
115-149.
BARNARD, G. A. (1963). Fisher’s contributions to mathematical statistics. J. Roy. Statist. Soc.
Ser. A 126 162-166.
BARTLETT, M. S. (1936). The information available in small samples. Proc. Cambridge Philos.
Soc. 32 560-566.
BARTLETT, M. S. (1939). Complete simultaneous fiducial distributions. Ann. Math. Statist. 10
129-138.
BARTLETT, M.S. (1965). R.A. Fisher and the last fifty years of statistical methodology. J.
Amer. Statist. Assoc. 60 395-409.
BARTLETT, M. S. (1968). Fisher, R. A. Internat. Encycl. Social Sci. 5 485-491. Crowell, Collier,
& Macmillan, New York.
BAYES, THOMAS(1763). An essay towards solving a problem in the doctrine of chances, with
Richard Price’s Foreward and Discussion. Philos. Trans. Roy. Soc. London Ser. A 53
370-418. Reprinted with a biographical note and some editing by G. A. Barnard in
Biometrika 45 293-315 (1958) and in Pearson and Kendall (1970).
BERNOULLI, DANIEL (1778). The most probable choice between several discrepant observations
and the formation therefrom of the most likely induction (with commentby L. Euler)
(in Latin). Acta Acad. Petrop. 3-33. English translation by C. G. Allen with an in-
troductory note by M. G. Kendall in Biometrika 48 1-18 (1961), reprinted in Pearson
and Kendall (1970).
BIENAYME, I. J. (1845). De la loi de multiplication et de la durée des familles. Soc. Philomathi-
que de Paris, Bull.: Extraits des proces verbaux des séances, Ser. 5 37-39.
BIRNBAUM, ALLAN (1962). On the foundationsof statistical inference (with discussion). J. Amer.
Statist. Assoc. 57 269-326.
Buiss, C. I. (1934). The method of probits. Science 79 38-39. Correction ibid. 409-410.
Buiss, C. I. (1935). The calculation of the dosage-mortality curve. Ann. Appl. Biol. 22 134-167.
The comparison of dosage-mortality data. Ibid. 307-333.
Buiss, C. I. (1938). The determination of dosage-mortality curves from small numbers. Quart.
J. Pharmacy & Pharmacology 11 192-216.
Bow -ey, A. L. (1928). F. Y. Edgeworth’s Contributions to Mathematical Statistics. Roy. Statist.
Soc., London.
Bow-ey, A. L. (1935). Discussion of Fisher (1935 b). J. Roy. Statist. Soc. 98 55-57.
CocHRAN, WILLIAM G. (1976). Early development of techniques in comparative experiments.
On the History of Statistics and Probability (D. B. Owen, ed.). Marcel Dekker, New
York.
Cox, D. R. (1958). Some problems connected with statistical inference. Ann. Math. Statist. 29
357-372.
CRAMER, HARALD (1946). Mathematical Methods of Statistics. Princeton Univ. Press.
add little beyond this. He ended his Fisher lecture, which was much more informal and some-
what more conjectural, and of course presented without documentation, with an invitation to
the audience:
I do hope that you won’t let a week go by without reading a little bit of Fisher.
I’m sure that someof the things I’ve told you are incredible. Check up on some
of those damnlies—it’I] do you good. And you'll enjoy it!
Dawid, A. P. (1975). On the concepts of sufficiency and ancillarity in the presence of nuisance
parameters. J. Roy. Statist. Soc. Ser. B 37 248-258.
Dempster, A. P. (1968). A generalization of Bayesian inference (with discussion). J. Roy. Statist.
Soc. Ser. B 30 205-247.
Edgeworth, F. Y. (1908-9). On the probable errors of frequency-constants. J. Roy. Statist.
Soc. 71 381-397, 499-512, 651-678. Addendum ibid. 72 81-90.
Edwards, A. W. F. (1974). The history of likelihood. Internat. Statist. Rev. 42 9-15.
Eisenhart, Churchill (1974). Karl Pearson. Dictionary of Scientific Biography 10 447-473.
Scribner's, New York.
Fechner, G. T. (1860). Elemente der Psychophysik (2 vols.). Breitkopf und Härtel, Leipzig.
Fieller, E. C., Creasy, Monica A. and David, S. T. (1954). Symposium on interval estima-
tion (with discussion). J. Roy. Statist. Soc. Ser. B 16 175-222.
Finney, D. J. (1952). Probit Analysis: A Statistical Treatment of the Sigmoid Response Curve.
Chapman and Hall, London.
Fisher, Ronald Aylmer. See separate list below.
Fraser, D. A. S. (1961). The fiducial method and invariance. Biometrika 48 261-280.
Fraser, D. A. S. (1968). The Structure of Inference. Wiley, New York.
Gaddum, J. H. (1933). Reports on biological standards. III. Methods of biological assay de-
pending on a quantal response. Med. Res. Council Special Report Series no. 183.
Gosset, W. S. ("Student," q.v.) (1962). Letters from W. S. Gosset to R. A. Fisher: 1915-1936.
4 vols., plus a vol. of summaries by R. A. Fisher, with a foreword by L. McMullen.
2nd ed. (1970) in 1 vol. Privately circulated.
Grundy, P. M. and Healy, M. J. R. (1950). Restricted randomization and quasi-Latin squares.
J. Roy. Statist. Soc. Ser. B 12 286-291.
Gumbel, E. J. (1958). Statistics of Extremes. Columbia Univ. Press.
Hájek, Jaroslav (1967). On basic concepts of statistics. Proc. Fifth Berkeley Symp. Math.
Statist. Prob. 1 139-162. Univ. of California Press.
Hájek, Jaroslav (1972). Local asymptotic minimax and admissibility in estimation. Proc.
Sixth Berkeley Symp. Math. Statist. Prob. 1 175-194. Univ. of California Press.
Haldane, J. B. S. (1927). A mathematical theory of natural and artificial selection, V. Proc.
Cambridge Philos. Soc. 23 838-844.
Heyde, C. C. and Seneta, E. (1972). Studies in the history of probability and statistics XXXI.
The simple branching process, a turning point test and a fundamental inequality: A
historical note on I. J. Bienaymé. Biometrika 59 680-683.
Hill, Bruce M. (1963). The three-parameter lognormal distribution and Bayesian analysis of
a point-source epidemic. J. Amer. Statist. Assoc. 58 72-84.
Hotelling, Harold (1951). The impact of R. A. Fisher on statistics. J. Amer. Statist. Assoc.
46 35-46.
Jeffreys, Harold (1931). Scientific Inference. Cambridge Univ. Press. (4th ed. 1973.)
Jeffreys, Harold (1932). On the theory of errors and least squares. Proc. Roy. Soc. Ser. A 138
48-55.
Jeffreys, Harold (1939). Theory of Probability. Clarendon Press, Oxford. (4th ed. 1967.)
Jeffreys, Harold (1940). Note on the Behrens-Fisher formula. Ann. Eugenics 10 48-51.
Kendall, D. G. (1966). Branching processes since 1873. J. London Math. Soc. 41 385-406.
Kendall, David G. (1975). The genealogy of genealogy: branching processes before (and after)
1873. Bull. London Math. Soc. 7 No. 3 225-253.
Kendall, M. G. (1961). Studies in the history of probability and statistics. XI. Daniel Bernoulli
on maximum likelihood. Biometrika 48 1-2. Reprinted in Pearson and Kendall (1970).
(See also Bernoulli (1778).)
Kendall, M. G. (1963). Ronald Aylmer Fisher, 1890-1962. Biometrika 50 1-15. Reprinted in
Pearson and Kendall (1970).
Kendall, M. G. (1973). Entropy, probability and information. Internat. Statist. Rev. 41 59-68.
Kendall, M. G. and Stuart, A. (1961). The Advanced Theory of Statistics: Vol. 2, Inference
and Relationship. Charles Griffin, London.
Kullback, Solomon (1959). Information Theory and Statistics. Wiley, New York.
Kullback, Solomon and Leibler, R. A. (1951). On information and sufficiency. Ann. Math.
Statist. 22 79-86.
Lambert, J. H. (1760). Photometria, Sive de Mensura et Gradibus Luminis Colorum et Umbrae.
Augustae Vindelicorum, Augsburg, 1760, art. 271-306.
Le Cam, Lucien (1953). On some asymptotic properties of maximum likelihood estimates and
related Bayes estimates. Univ. of California Publ. in Statist. 1 277-329.
Lee, Austin F. S. and Gurland, John (1975). Size and power of tests for equality of means of
two normal populations with unequal variances. J. Amer. Statist. Assoc. 70 933-941.
Mahalanobis, P. C. (1938). Professor Ronald Aylmer Fisher. Sankhyā 4 265-272. Reprinted
in Fisher (CMS) and in Biometrics 20 238-250 (1964).
Mehta, J. S. and Srinivasan, R. (1970). On the Behrens-Fisher problem. Biometrika 57 649-655.
Neyman, Jerzy (1934). On the two different aspects of the representative method: the method
of stratified sampling and the method of purposive selection (with discussion). J. Roy.
Statist. Soc. 97 558-625. Reprinted in Neyman (1967').
Neyman, Jerzy (1935). With K. Iwaszkiewicz and St. Kolodziejczyk. Statistical problems in
agricultural experimentation (with discussion). J. Roy. Statist. Soc. Suppl. 2 107-180.
Reprinted in Neyman (1967').
Neyman, Jerzy (1938). L'estimation statistique traitée comme un problème classique de prob-
abilité. Actualités Scientifiques et Industrielles 739 25-57. Reprinted in Neyman
(1967').
Neyman, Jerzy (1941). Fiducial argument and the theory of confidence intervals. Biometrika
32 128-150. Reprinted in Neyman (1967').
Neyman, Jerzy (1951). Review of Contributions to Mathematical Statistics by R. A. Fisher.
Scientific Monthly 72 406-408.
Neyman, Jerzy (1956). Note on an article by Sir Ronald Fisher. J. Roy. Statist. Soc. Ser. B 18
288-294.
Neyman, Jerzy (1961). Silver jubilee of my dispute with Fisher. J. Operations Res. Soc. Japan
3 145-154.
Neyman, Jerzy (1967). R. A. Fisher (1890-1962): An appreciation. Science 156 1456-1460.
Neyman, Jerzy (1967'). A Selection of Early Statistical Papers of J. Neyman. Univ. of Cali-
fornia Press.
Neyman, Jerzy and Scott, Elizabeth L. (1948). Consistent estimates based on partially con-
sistent observations. Econometrica 16 1-32.
Norden, R. H. (1972-3). A survey of maximum likelihood estimation. Internat. Statist. Rev. 40
329-354, 41 39-58.
Pearson, E. S. (1968). Studies in the history of probability and statistics. XX. Some early
correspondence between W. S. Gosset, R. A. Fisher and Karl Pearson, with notes
and comments. Biometrika 55 445-457. Reprinted in Pearson and Kendall (1970).
Pearson, E. S. (1974). Memories of the impact of Fisher's work in the 1920's. Internat. Statist.
Rev. 42 5-8.
Pearson, E. S. and Kendall, M. G., eds. (1970). Studies in the History of Statistics and Prob-
ability. Charles Griffin, London.
Pearson, Karl (1900). On the criterion that a given system of deviations from the probable in
the case of a correlated system of variables is such that it can be reasonably supposed
to have arisen from random sampling. Philos. Mag. 50 157-175.
Pearson, Karl (1936). Method of moments and method of maximum likelihood. Biometrika
28 34-59.
Pearson, Karl and Filon, L. N. G. (1898). Mathematical contributions to the theory of evo-
lution. IV. On the probable errors of frequency constants and on the influence of
random selection on variation and correlation. Philos. Trans. Roy. Soc. London Ser.
A 191 229-311.
1912 On an absolute criterion for fitting frequency curves. Messeng. Math. 41
155-160.
1913 Applications of vector analysis to geometry. Messeng. Math. 42 161-178.
1914 Some hopes of a eugenist. Eugen. Rev. 5 309-315.
1915 Frequency distribution of the values of the correlation coefficient in sam-
ples from an indefinitely large population. Biometrika 10 507-521.
1918 The correlation between relatives on the supposition of Mendelian inher-
itance. Trans. Roy. Soc. Edinburgh 52 399-433.
1920 A mathematical examination of the methods of determining the accuracy
of an observation by the mean error, and by the mean square error.
Monthly Notices Roy. Astronom. Soc. 80 758-770. Paper 2 in CMS.
1921a On the "probable error" of a coefficient of correlation deduced from a
small sample. Metron 1 (4) 3-32. Paper 1 in CMS, Author's Note
only.
1921b Studies in crop variation. I. An examination of the yield of dressed
grain from Broadbalk. J. Agric. Sci. 11 107-135. Paper 3 in CMS.
1922a On the mathematical foundations of theoretical statistics. Philos. Trans.
Roy. Soc. London Ser. A 222 309-368. Paper 10 in CMS.
1929b Tests of significance in harmonic analysis. Proc. Roy. Soc. Ser. A 125
54-59. Paper 16 in CMS.
1929c The sieve of Eratosthenes. Math. Gaz. 14 564-566.
1929d Studies in crop variation. VI. Experiments on the response of the potato
to potash and nitrogen (with T. Eden). J. Agric. Sci. 19 201-213.
Paper 18 in CMS.
1929f Statistics and biological research. Nature 124 266-267.
1929g The evolution of dominance; reply to Professor Sewall Wright. Amer.
Natur. 63 553-556.
1929h Reviews of The Balance of Births and Deaths, vol. 1: Western and North-
ern Europe by R. R. Kuczynski and The Shadow of the World's Future
by Sir G. H. Knibbs. Nature 123 357-8.
1930a The moments of the distribution for normal samples of measures of de-
parture from normality. Proc. Roy. Soc. Ser. A 130 16-28. Paper 21
in CMS.
1930b Inverse probability. Proc. Cambridge Philos. Soc. 26 528-535. Paper 22
in CMS.
1930c The distribution of gene ratios for rare mutations. Proc. Roy. Soc.
Edinburgh 50 205-220. Paper 19 in CMS.
1931a The derivation of the pattern formulae of two-way partitions from those
of simpler patterns (with J. Wishart). Proc. London Math. Soc. Ser. 2
33 195-208.
1931b The sampling error of estimated deviates, together with other illustra-
tions of the properties and applications of the integrals and derivatives
of the normal error function. Math. Tables 1 xxvi-xxxv. British
Assoc. Adv. Sci. Paper 23 in CMS.
1932 Inverse probability and the use of likelihood. Proc. Cambridge Philos. Soc.
28 257-261.
1933a The concepts of inverse probability and fiducial probability referring to
unknown parameters. Proc. Roy. Soc. Ser. A 139 343-348.
1933b Review of Tables for Statisticians and Biometricians ed. by Karl Pearson.
Nature 131 893-4.
1934a Two new properties of mathematical likelihood. Proc. Roy. Soc. Ser. A
144 285-307. Paper 24 in CMS.
1934c The 6 × 6 Latin squares (with F. Yates). Proc. Cambridge Philos. Soc.
30 492-507.
1934d Randomisation, and an old enigma of card play. Math. Gaz. 18 294-297.
1935a The mathematical distributions used in the common tests of significance.
Econometrica 3 353-365. Paper 13 in CMS.
1935b The logic of inductive inference (with discussion). J. Roy. Statist. Soc.
98 39-82. Paper 26 in CMS.
1935c The fiducial argument in statistical inference. Ann. Eugen. 6 391-398.
Paper 25 in CMS.
1935d The case of zero survivors in probit assays. Ann. Appl. Biol. 22 164-165.
1935e Statistical tests. Nature 136 474.
1935f Discussion of Neyman (1935). J. Roy. Statist. Soc. Suppl. 2 154-157,
173.
1936a Uncertain inference. Proc. Amer. Acad. Arts Sci. 71 245-258. Paper 27
in CMS.
1936b The "coefficient of racial likeness" and the future of craniometry. J.
Roy. Anthropol. Inst. 66 57-63.
1936c A test of the supposed precision of systematic arrangements (with S.
Barbacki). Ann. Eugen. 7 189-193. Paper 28 in CMS.
1936d Has Mendel's work been rediscovered? Ann. of Sci. 1 115-137.
1937a Moments and cumulants in the specification of distributions (with E. A.
Cornish). Rev. Inst. Internat. Statist. 5 307-322. Paper 30 in CMS.
1937b Professor Karl Pearson and the method of moments. Ann. Eugen. 7 303-
318. Paper 29 in CMS.
1937c On a point raised by M. S. Bartlett on fiducial probability. Ann. Eugen.
7 370-375.
1938 Presidential address, Indian statistical conference. Sankhyā 4 14-17.
1939a The comparison of samples with possibly unequal variances. Ann. Eugen.
9 174-180. Paper 35 in CMS.
1939b The sampling distribution of some statistics obtained from nonlinear
equations. Ann. Eugen. 9 238-249. Paper 36 in CMS.
1939c A note on fiducial inference. Ann. Math. Statist. 10 383-388.
1939d "Student". Ann. Eugen. 9 1-9.
1940a On the similarity of the distributions found for the test of significance in
harmonic analysis, and in Stevens's problem in geometrical proba-
bility. Ann. Eugen. 10 14-17. Paper 37 in CMS.
1940b The precision of discriminant functions. Ann. Eugen. 10 422-429. Paper
34 in CMS.
1941a The negative binomial distribution. Ann. Eugen. 11 182-187. Paper 38
in CMS.
1941b The interpretation of experimental four-fold tables. Science 94 210-211.
1941c The asymptotic approach to Behrens's integral, with further tables for
the d test of significance. Ann. Eugen. 11 141-172.
1942a The likelihood solution of a problem in compounded probabilities. Ann.
Eugen. 11 306-307. Paper 42 in CMS.
1942b The theory of confounding in factorial experiments in relation to the
theory of groups. Ann. Eugen. 11 341-353. Paper 39 in CMS.
1942c Some combinatorial theorems and enumerations connected with the
numbers of diagonal types of a Latin square. Ann. Eugen. 11 395-
401. Paper 41 in CMS.
1943 Review of The Advanced Theory of Statistics, vol. 1 by Maurice G. Kendall.
Nature 152 431-432.
DISCUSSION
B. EFRON
Stanford University
This paper makes me happy to be a statistician. Savage, and John Pratt, give
us a virtuoso display of high grade unobtrusive scholarship. Fisher comes across
as a genius of the first rank, perhaps the most original mathematical scientist of
the century. A difficult genius though, one in whom brilliance usually outdis-
tances clarity. Savage deserves special credit for his deft pulling together of the
various strings of Fisher's thought. This paper will make rereading Fisher easier
for those of us raised in different statistical traditions.
My paper [1] is mainly devoted to understanding one of Fisher’s inspired
REFERENCES
[1] Erron, B. (1975). Defining the curvatureof a statistical problem (with applications to sec-
ond orderefficiency). Ann. Statist. 3 1189-1242.
[2] Stren, C. (1962). A remark on the likelihood principle. J. Roy. Statist. Soc. Ser. B 125 565-
573.
CHURCHILL EISENHART
National Bureau of Standards
Jimmie's Fisher Memorial Lecture "On Rereading R. A. Fisher" was the finest
talk I ever heard on any aspect of statistics. His presentation held me spellbound
throughout its entirety, and many friends to whom I have mentioned this tell me
that they were equally entranced. Now that his wisdom and insight on Fisher
have reached the printed page, I am sure that most of those who heard the original
presentation and many others too will refer to it again and again in years to
come.
BRUNO DE FINETTI
University of Rome
1. Disconcerting inconsistencies. It has been a great pleasure, for me, to
receive and read this paper: it seemed almost like listening to the typical con-
versations of L. J. S. in which the subject under discussion became steadily broader
and deeper owing to a series of little valuable discoveries (an example, a counter-
example, a singular case, an analogy, a possible extension or a paradoxical one)
that occurred to him, and were introduced into the discourse often following a
short pause and a long "Ooooh...". Of course, this pleasure was intimately
mingled with the painful remembrance that the possibility of renewing such
exciting meetings has been suddenly interrupted by his death.
Concerning R. A. Fisher in particular, I am indebted to Savage for all of the
little understanding I have been able to attain about his viewpoint. Apart from
the difficulty (for me) of Fisher's English, I was disconcerted by the alternation,
in his writings, of assertions and remarks, some completely in agreement and some
completely in disagreement with each one of the possible viewpoints about sta-
tistics, and in particular with my own viewpoint.
My uneasiness about understanding Fisher's position has, perhaps, been defi-
nitely removed only by this posthumous message from Savage, particularly by the
remarks about Fisher's inconsistencies, explained (§§ 4.4-4.6, and elsewhere) by
the conflict between his convictions about the necessity for conclusions in the
form of "posterior probabilities" of a Bayes-like kind, and his preconception
against admitting subjective prior probabilities, as well as his rejection (rightly)
of "conventional" ones (like the uniform distribution in Bayes' own approach
for "repeated trials," and similar choices, e.g. of "conjugate priors," if done
merely "for mathematical convenience"). It is but as an attempt—or a subter-
fuge—to escape such an inescapable dilemma that he resorts to inventing an
undefined name like "fiducial probability" or to suggesting the use of "likeli-
hoods" as ersatz probabilities. This is, indeed, a wrong answer to the rightly
perceived "absolute necessity to the advance of science" of attaining Bayes' goal,
whose abandonment he regards as "rather like an acknowledgment of bank-
ruptcy" (§ 4.6).
Let me mention here a remark by L. J. S. (§ 4.6) concerning a more general
impoverishment of statistical thinking which occurs when the Bayesian outlook
is lost sight of: relevant circumstances are "sometimes overlooked by non-
Bayesians, because they do not have easy ways to express (them)." Their lan-
guage is too one-sided, hence poor and distorting.
general consistent view about probability and its use (as are the two opposed
ones, of objectivists following the Neyman-Pearson school and of subjectivists
like L. J. Savage).
One point has been especially surprising to me (as well as, it seems, to Savage
himself: see § 4.3). Fisher was not a frequentist in the usual sense, but sub-
scribed to a somewhat sophisticated and refined version of the so-called "classical"
definition: the version where the set of N equally likely cases (M favorable, with
M/N = p) is supposed countless (N very large, practically infinite). The worst
absurdity of the frequentist "definition" is so avoided: in a succession of draw-
ings all sequences are possible (if p = ½, with equal probability) and the fre-
quency is in no way obliged to tend to p, nor to any limit whatsoever. This view
is not contradictory, provided one avoids really infinite sets where it would be
meaningless to speak of a "ratio" p = M/N = ∞/∞ (maybe "denumerable"/"de-
numerable"?). Of course, for a subjectivist (in particular, for myself) this view
is tenable only if p is previously adopted as the evaluation of the probability con-
cerned, and the N cases are chosen so as to be considered subjectively equally
likely.
The fundamental inconsistency (with many ramifications) is, however, the one
mentioned previously, and it deserves to be discussed in more detail and depth.
L. J. S. justifies his writing by saying that: Those who have already read in Fisher
will agree that understanding him deeply is not easy, and they may be glad to hear
the views of another (§ 1.1). That was surely true for me in reading him, and I
am sure it would be so for me and many others in reading other opinions too.
L. J. S. says: The following pages consist largely of judgments and impressions:
nothing can convert these subjective reactions completely into objective facts...
(§ 1.2). But the same holds for every other, so that the comparison may improve
and probably approach such persuasions.
L. J. S. adds that, however: ...it has been an invaluable discipline for me to
support each of them by specific citations with reasonable thoroughness (§ 1.2). Un-
fortunately, this search was not completed before his death; some valuable work
has been devoted to it by John W. Pratt; it would be highly welcomed if statisticians
familiar with Fisher's work could recall, find, and communicate additional
quotations.
Let us hope that this paper by L. J. S. about Fisher may give rise to clarifying
discussions about the foundations and the applications of statistics: a field about
which very much has been said and will be said maybe for ever and ever, but
where we may at any rate attain some progress by concentrating efforts on such
a wide but specific range of questions.
D. A. S. FRASER
University of Toronto
We owe Professor Jimmie Savage deep appreciation for his thorough and
detailed review of R. A. Fisher's statistical research and publications. And we
also owe Professor John W. Pratt substantial thanks for his painstaking job of
editing the original manuscript into final form for the Annals and assembling
the extensive references needed for the manuscript.
Certainly Professor Savage's statistical viewpoint, the Bayesian viewpoint, is
very different from the R. A. Fisher viewpoint. On occasions we are reminded
of this by parenthetical references in the review, and indeed Professor Savage
makes reference to "a somewhat personal review." I feel that much additional
credit goes to Professor Savage for the way the difference in viewpoint has not
affected the assessment of the many contributions made by R. A. Fisher.
In Section 4.8 Professor Savage discusses the likelihood principle and notes
that "...it does not seem to have been plainly stated by him [Fisher] until his
last book [Statistical Methods and Scientific Inference]." I have not had the im-
pression that Fisher's writings supported the likelihood principle, and indeed the
specific references made to his last book do not leave me with a feeling that
Fisher in any considered way supported the principle. The likelihood principle
was a prominent topic at statistics meetings around 1962, largely as a result of
Allan Birnbaum's research interests. At that time it was reported that Fisher
had been asked concerning the likelihood principle, that Fisher had enquired
what the likelihood principle was and then thoughtfully replied that he did not
support it. Perhaps others can comment more authoritatively on this.
Professor Savage remarks that the likelihood "principle is still controversial,
but [he believes] that it will come to be generally accepted." Certainly from
the Bayesian viewpoint there are no grounds for doubting the principle. And in
general we have lacked examples with any force against the principle. Professor
Mervyn Stone (1976) in a very recent paper discusses some Bayesian complica-
tions found with two examples. The first of these is an elaboration on Problem
11 in Lehmann (1959 24). This example has strong implications beyond the
Bayesian viewpoint: it can be presented as a powerful example against the likeli-
hood principle. Readers of the review of R. A. Fisher's work will want to
consider this example in Stone (1976).
Professor Savage refers to "many doubts and few unequivocal answers" in
Fisher's work. He also quotes Fisher: "I am still too often confronted by
problems ... to which I cannot confidently offer a solution, ever to be tempted
to imply that finality has been reached...." I think that this has several impli-
cations concerning Fisher's research and deserves further comment. One impor-
tant characteristic of Fisher was his ability to move into new areas of statistics,
suggesting concepts and methods and deriving results. In a larger sense this
avoided premature crystallization and conceptualization and left the theory open
to modification and development. Often, however, he was taken at face value
on some technical issue and the issue pursued meticulously. For example, the
likelihood function as used by Fisher stands in contrast to the common and incorrect
definition in most statistics texts. And the concept of sufficiency as used by Fisher
stands in contrast to the extensive mathematical analyses of sufficiency, most of which
became superfluous with the general recognition around 1960 of the likelihood
function statistic, a recognition that was in fact in Fisher's earliest papers on
sufficiency. In retrospect the open-endedness of Fisher's exploratory work de-
serves more positive than negative credits.
Professor Savage notes that "It would be more economical to list the few
statistical topics in which he displayed no interest than those in which he did."
If we examine present day statistics, the fruitful, basic, and scientifically useful
parts of present day statistics, and then assess which concepts and methods were
proposed or developed by Fisher, we would obtain a clearer picture of the magni-
tude of his contributions. In some measure the concepts and methods mentioned
in the review do this. But an overview shows that Fisher's contributions con-
stitute the central material of present day statistics.
REFERENCES
Lehmann, E. L. (1959). Testing Statistical Hypotheses. Wiley, New York.
Stone, Mervyn (1976). Strong inconsistency from uniform priors (with discussion). J. Amer.
Statist. Assoc. 71 114-125.
V. P. GODAMBE
University of Waterloo
This paper by the late Professor Savage will, I believe, provide valuable sug-
gestions for rereading Fisher to many, as indeed it did to me. The paper does
touch upon most of the important aspects of Fisher's work. The presentation
throughout is admirably "balanced" and "objective." Of course it is not possible
to aim at completeness in such a short paper. I would therefore restrict my
comments to a couple of details.
At the end of Section 4.4, "randomisation" is discussed. Concerning this
topic I find the following two statements by Fisher difficult to reconcile. In his
earlier paper (1936b 58, 59) Fisher says: "The simplest way of understanding
quite rigorously, yet without mathematics, what the calculations of the test of
significance amount to, is to consider what would happen if our two hundred
actual measurements were written on cards, shuffled without regard to nation-
ality, and divided at random into two new groups of a hundred each. ... Actually
the statistician does not carry out this very simple and very tedious process, but
his conclusions have no justification beyond the fact that they agree with those
which could have been arrived at by this elementary method." On the other
hand in Statistical Methods and Scientific Inference (SI 98) Fisher says, "...and
whereas planned randomisation (1935-1953) is widely recognized as essential
in the selection and allocation of experimental material, it has no useful part to
play in the formation of opinion, and consequently in the tests of significance
designed to aid the formation of opinion in the Natural Sciences." [The refer-
ences in the quotation are to his Design of Experiments.]
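The card-shuffling procedure of the first quotation is what is now called a randomisation (or permutation) test, and it can be carried out literally. The sketch below is illustrative only: the data and group sizes are invented, not Fisher's two hundred measurements.

```python
import random

def randomisation_test(group_a, group_b, n_shuffles=10000, seed=0):
    """Approximate significance level for a difference in group means,
    computed by Fisher's card-shuffling recipe: pool the measurements,
    repeatedly split them at random into groups of the original sizes,
    and count how often the shuffled difference is at least as large
    as the one actually observed."""
    rng = random.Random(seed)
    observed = abs(sum(group_a) / len(group_a) - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # "shuffled without regard to nationality"
        new_a, new_b = pooled[:n_a], pooled[n_a:]
        if abs(sum(new_a) / len(new_a) - sum(new_b) / len(new_b)) >= observed:
            extreme += 1
    return extreme / n_shuffles

p = randomisation_test([5.1, 4.8, 5.6, 5.0], [4.2, 4.0, 4.5, 4.1])
```

The point of the second sentence of the quotation is that the statistician normally does not perform these shuffles; the normal-theory test is trusted because it approximates the answer this elementary method would give.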
Now the latter statement above is made by Fisher in relation to Bartlett's test
and other tests which introduce a deliberate random element to arrive at conclu-
sions. Such randomisation, I agree with Professor Savage (last but one paragraph
of Section 4.4), "does constitute a sort of exclusion of pertinent evidence." But
Savage further says: "However, there seems to me to be a very similar fault in
all those applications of randomisation that Fisher so vigorously advocated."
At least the type of randomisation mentioned in the first quotation of Fisher in
the last paragraph seems to be free from this "fault." Briefly, it seems to assert
that the significance level obtained on the basis of the randomisation frequency
would be nearly the same as that obtained on the assumption of normality, when
the assumption (model) is valid. A paper demonstrating this more elaborately
is due to Eden and Yates (1933). The randomisation here surely does not imply
"exclusion of any pertinent evidence." A restatement and extension of the above
logic underlying randomisation, in my opinion, is as follows. For testing some
hypothesis we construct a test-statistic which is appropriate with respect to the
underlying model or assumptions (e.g. independence, normality, etc.). Now
when the hypothesis is true and the model is valid, the test-statistic has a specified
probability distribution. A suitable randomisation can generate for the test-statistic
REFERENCES
Bhapkar, V. P. (1972). On a measure of efficiency of an estimating equation. Sankhyā Ser. A
34 467-472.
Eden, T. and Yates, F. (1933). On the validity of Fisher's z-test. J. Agric. Sci. 23 6-16.
Godambe, V. P. (1960). An optimum property of regular maximum likelihood estimation. Ann.
Math. Statist. 31 1208-1211.
Godambe, V. P. and Thompson, M. E. (1971). Bayes, fiducial and frequency aspects of statis-
tical inference in regression analysis in survey-sampling (with discussion). J. Roy.
Statist. Soc. Ser. B 33 361-390.
I. J. GOOD
1. True history of science would depend on letters, lectures and other oral
communication as well as on publications, as in recent work on the early history
of quantum mechanics. The pretence that nothing exists if it is not publishedis
unfortunate, for it discourages people from talking about their work. Perhaps
one day the history ofstatistics in this century will be properly discussed, and
Savage’s essay is a substantial contribution to such a treatment.
Good mathematicians have always used scientific induction in their work; for
example, Gauss “discovered” the prime number theorem and quadratic reciprocity
and never proved the former. Polya uses probability only qualitatively in his
writings on plausible inference in pure mathematics. I think it can be regarded
as quantitative, though usually imprecise, and can be combined with the prin-
ciple of rationality (maximization of expected utility). It involves a modification
of the axioms of probability (Good, 1950, 49), and a dynamic interpretation of
probability which is useful both for the philosophy of science (Good, 1968a,
1971a, 1973) and for computer chess (Good, 1967a, 1976). Dynamic partially
ordered probability resolves the controversies between Fisherian and Bayesian
statistics.
2.3. In 1951, I met Fisher in Cambridge, and he mentioned that he thought
the best contribution he was then likely to make to genetics would be to teach
it to Cambridge mathematical students, partly because he thought they were
exceptionally capable. He went on to say that most of his clients were not in
the same class. (See § 4.4 below.)
In a colloquium in Cambridge in November 1954, R. B. Braithwaite gave a
talk on the minimax method and its implications for moral philosophy. In the
discussion I said that the minimax method suffered from the disadvantage of all
objectivistic methods, including those used in Fisherian statistics, namely that
they necessarily ignore information so as to achieve apparent objectivity. There-
upon Fisher rose furiously, with a white face, and asked me to address my com-
ments to the contents of the lecture. After the meeting he told Henry Daniels
I was an “upstart” though previously he had told Donald Michie that he liked
my 1950 book.
George Barnard told me a few years ago that Fisher was well aware of his
own tendency to lose his temper, and that he regarded it as the bane of his life.
I felt better disposed to Fisher after that, although S. Vajda told me Fisher once
referred to an influential school of American statisticians as “Americans with
foreign sounding names.”
2.4. Allowing for the cases mentioned by Savage, and for others, Fisher was
to various extents anticipated regarding likelihood, measure of amount of
information, the use of generating functions in branching processes, and the analysis
of variance (by Lexis and others: see RW § 20 and Heiss, 1968), yet his contri-
bution was great. To be partly anticipated should detract little from solid con-
tributions.
Was Fisher a Bayesian? See § 4.6 below.
2.5. Some faults in Fisher’s style were (i) ambiguous use of pronouns; (ii) the
annoying but comically incorrect use of the expression “dependent from”; (iii)
covering up. For example, in RW (7th ed. at least) he describes his “exact test”
for 2 by 2 contingency tables, and omits to mention that it assumes the marginal
totals convey no information about independence. Yet in his (1935b, 48), he
states this assumption. (When he says “If it be admitted,” he in effect is saying
that it is a matter of judgment.)
Fisher’s style might have been better if he had circulated his manuscripts for
suggestions as Savage did with his 1954 book.
2.6. Fisher was a Galton Professor in London, and an admirer of Galton. In
GT 252 Fisher says that Galton “was aware that among these [titled families]
the extinction of the title took place with surprising frequency,” but in that book
he refers only to Galton’s Hereditary Genius and not to Natural Inheritance where
Watson’s work appeared. So Fisher may not have been aware of Watson’s
work. The use of generating functions in branching processes was discovered
independently by Bienaymé (1845), Watson (1873), Woodward (1948), and myself
(1948), so there is no reason to suppose that it was too difficult for Fisher!
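The generating-function idea mentioned above can be made concrete with a small sketch. The code below is only an illustration, not any of the historical derivations: it uses the standard fact that iterating the offspring probability generating function f from 0 converges to the extinction probability, the smallest root of f(q) = q in [0, 1]. The particular offspring distribution is invented for the example.

```python
def extinction_probability(pgf, tol=1e-12, max_iter=100000):
    """Iterate q <- f(q) starting from q = 0; the limit is the extinction
    probability of the branching process, the smallest root of f(q) = q."""
    q = 0.0
    for _ in range(max_iter):
        q_next = pgf(q)
        if abs(q_next - q) < tol:
            return q_next
        q = q_next
    return q

# Example offspring law: 0, 1 or 2 children with probabilities 1/4, 1/4, 1/2,
# so f(s) = 1/4 + s/4 + s**2/2 and the extinction probability is 1/2.
f = lambda s: 0.25 + 0.25 * s + 0.5 * s ** 2
```

When the mean number of offspring is at most one, the iteration converges to 1, i.e. extinction is certain; convergence is slow in the critical (mean exactly one) case.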
The notion of interactions in multidimensional contingency tables seems to
start with a personal communication from Fisher to Bartlett, regarding 2 x 2 x 2
tables, as acknowledged in Bartlett (1935). This notion, together perhaps with
the semi-Bayesian approach to two-dimensional tables of Good (1956, rejected
in 1953 in its original form), and Woolf (1955), was part of the prehistory of
the loglinear model. The interactions were given a further philosophical boost
when they were related to maximum entropy (Good, 1963).
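For a 2 x 2 x 2 table of the kind Bartlett considered, absence of second-order interaction amounts to the two layers sharing a common odds ratio. A minimal sketch follows; the function name and the sample tables are invented for illustration.

```python
import math

def second_order_interaction(table):
    """Bartlett-type second-order interaction for a 2 x 2 x 2 table,
    given as two 2 x 2 layers: the difference of the layers' log odds
    ratios, zero exactly when the layers share a common odds ratio."""
    def log_odds_ratio(layer):
        (a, b), (c, d) = layer
        return math.log((a * d) / (b * c))
    return log_odds_ratio(table[0]) - log_odds_ratio(table[1])
```

On the log scale the interaction is a difference of log odds ratios, which is the quantity the loglinear parameterization makes additive.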
The Wishart distribution should perhaps be called the Fisher-Wishart
distribution, since the bivariate form was essentially due to Fisher (1915).
3. (i) Someone once said at a meeting of the RSS that the only sufficient
statistic is the complete set of observations, because no model is certain.
(ii) When the number of parameters is not merely large but infinite, as in
nonparametric estimation of a continuous density, ML is certainly inappropriate,
but maximum “penalized likelihood” makes good sense, where the penalty de-
pends on “roughness” (Good, 1971 b, Good and Gaskins, 1971, 1972). This can
REFERENCES
BARTLETT, M. S. (1935). Contingency table interactions. J. Roy. Statist. Soc. Suppl. 2 248-252.
GOOD, I. J. (1948). Private communication to M. S. Bartlett, June 5.
GOOD, I. J. (1950). Probability and the Weighing of Evidence. Griffin, London.
GOOD, I. J. (1956). On the estimation of small frequencies in contingency tables. J. Roy. Statist.
Soc. Ser. B 18 113-124.
GOOD, I. J. (1963). Maximum entropy for hypothesis formulation, especially for multidimensional
contingency tables. Ann. Math. Statist. 34 911-934.
GOOD, I. J. (1965). The Estimation of Probabilities: An Essay on Modern Bayesian Methods. MIT
Press.
GOOD, I. J. (1967a). A five-year plan for automatic chess. Machine Intelligence II (E. Dale and
D. Michie, eds.; Oliver and Boyd, Edinburgh), 89-118.
GOOD, I. J. (1967b). A Bayesian significance test for multinomial distributions. J. Roy. Statist.
Soc. Ser. B 29 399-431.
GOOD, I. J. (1968a). Corroboration, explanation, evolving probability, simplicity, and a sharpened
razor. British J. Philos. Sci. 19 123-143.
GOOD, I. J. (1968b). Utility of a distribution. Nature 219 1392.
GOOD, I. J. (1969). What is the use of a distribution? Multivariate Analysis II (P. R. Krishnaiah,
ed.; Academic Press, New York), 183-203.
GOOD, I. J. (1971a). The probabilistic explication of information, evidence, surprise, causality,
explanation, and utility. Foundations of Statistical Inference (V. P. Godambe and
D. A. Sprott, eds.; Holt, Rinehart and Winston, Toronto), 108-141.
GOOD, I. J. (1971b). Nonparametric roughness penalty for probability densities. Nature Physical
Science 229 29-30.
GOOD, I. J. (1973). Explicativity, corroboration, and the relative odds of hypotheses. Synthese
30 39-73.
GOOD, I. J. (1974). Random thoughts about randomness. In PSA 1972 (Boston Studies in the
Philosophy of Science; D. Reidel: Dordrecht), 117-135.
GOOD, I. J. (1976). Dynamic probability, computer chess, and the measurement of knowledge.
In Machine Representations of Knowledge (E. W. Elcock and D. Michie, eds.), D. Reidel,
Dordrecht.
GOOD, I. J. and CROOK, J. F. (1974). The Bayes/non-Bayes compromise and the multinomial
distribution. J. Amer. Statist. Assoc. 69 711-720.
GOOD, I. J. and GASKINS, R. A. (1971). Nonparametric roughness penalties for probability densities.
Biometrika 58 255-277.
GOOD, I. J. and GASKINS, R. A. (1972). Global nonparametric estimation of probability densities.
Virginia J. of Sci. 23 171-193.
HEISS, K.-P. (1968). Lexis, Wilhelm. International Encyclopedia of the Social Sciences 9 271-276.
WOODWARD, P. M. (1948). A statistical theory of cascade multiplication. Proc. Cambridge
Philos. Soc. 44 404-412.
WOOLF, B. (1955). On estimating the relation between blood groups and disease. Ann.
Human Genetics 19 251-253.
O. KEMPTHORNE
Iowa State University
The Fisher Memorial Lecture of L. J. Savage was the finest statistical lecture
I have heard in my whole life. There is no suggestion or requirement that Fisher
lectures should be addressed to Fisher’s own work, and this lecture was a
surprise for many. It consisted of a review of the main thrusts of Fisher’s life done
with deep respect and reflected a tremendous effort to understand and place
Fisher’s contributions in the history of statistical ideas. It is tragic that Savage
could not complete his oral presentation for publication. The statistical profession
is deeply indebted to J. W. Pratt for a remarkable effort.
As Savage said, to read the whole of Fisher’s work with some semblance of
partial understanding is a huge task, one on which many have spent years. I
surmise that the breadth and depth of that work will not be adequately appreciated
for decades. I suggest that no individual of this century contributed fundamen-
tally to so wide a variety of areas. I have always been impressed by Fisher’s
ability as a working mathematician. It seems that Fisher could tackle success-
fully and with deep originality almost any problem involving classical analysis,
numerical analysis, probability calculus, or combinatorics. I regard him as a
mathematician of the very highest order, particularly in the dimension of creativity.
Curiously enough, it seems that Fisher did nothing on strong asymptotic laws.
Fisher’s ability in distribution theory was surely remarkable in the context of
the times, and almost all of the distributional theory he worked out has become
part of the intermediate knowledge of mathematical statistics. Savage
communicated in his presentation the marvel of this effort. Fisher also became deeply
fascinated by any combinatoric problem, and his work on experimental designs
and in mathematical genetics in this direction boggles the mind. Fisher was
highly original in multivariate analysis.
One aspect of Fisher’s work which was touched on only briefly by Savage
was Fisher’s genetical effort. This would involve a lecture of similar dimensions
to the present one. It is noteworthy, however, that Fisher was also the first to
attack discrete stochastic processes by means of diffusion approximations via the
Fokker-Planck differential equation (even though the first effort contained a
foolish mistake).
The mysteries of Fisher’s thought arise as soon as one turns away from the
purely mathematical work which has stood the test of time except for a small
number of minor errors.
It seems quite clear that Fisher never succeeded in communicating to anyone
his idea of the nature of probability in spite of many efforts. I now find his 1956
book (SI) almost a total mystery. Fisher really did think that one could develop
by logical reasoning a probability distribution for one’s knowledge of a physical
constant. It is clear, I think, that Fisher did not support any idea of belief prob-
abilities of the type that Savage himself developed and presented so forcefully.
The fiducial argument was to lead to some sort of logical probability which
Fisher claimed could be verified, though he never gave an understandable idea
of what he meant by verification, and specifically excluded the possibility of
repeated measurements of the unknown constant.
Savage alluded, appropriately, to obscurity on what Fisher meant by
“estimation.” My guess is that he meant the replacement of the data by a scalar
statistic T for the scalar parameter θ which contained as much as possible of the
(Fisherian) information on θ in the data. But what one should do with an
obtained T was not clear, though Fisher was obviously not averse at times to
regarding T as an estimator of θ. It is interesting, as Savage noted, that Fisher
was the first to formulate the idea of exponential families in this connection.
Here, also, the fascinating question of ancillaries arises, and on this Fisher was
most obscure. To some extent Fisher must be regarded as the initiator of
estimation as a decision theory process, even though other writings suggest that he
found this view offensive. I imagine that without fiducial inference Fisher would
have found his views incoherent.
The work of Fisher abounds in curiosities. One which has struck me forcibly
is the absence of any discussion of the relationship of Fisher’s ideas on
experimentation (DOE) to his general ideas on inference (SI). The latter book contains
no discussion of ideas of randomization (except for the irrelevant topic of test
randomization) which made DOE so interesting and compelling to investigators
in noisy experimental sciences. Can the ideas on randomization and on
parametric likelihood theory be fused into a coherent whole? I think not. In DOE
Fisher convinces us of the desirability of randomization and unbiased (over
randomizations) estimation of error, but then proceeds to the so-called analysis
of covariance in which the unbiased estimation of error cannot be achieved.
I note that Savage applauded Fisher on factorial design, examining the
relevant experimental factors simultaneously. But the prescriptions of Fisher work
well only if interaction is small, and lack of interaction is rare. With interaction,
Fisher’s analyses of variance lose much of their force. Fisher did not appreciate
the role of nonadditivity, and this came out in the 1935 Neyman discussion.
Savage discussed Fisher’s ideas on statistical tests and was not able to obtain
a coherent picture of Fisher’s approach. It is important that the obscurities be
recognized. Clearly Fisher regarded statistical tests as evidential in nature, but
to say this is, perhaps, merely to replace one obscure idea by another no less
obscure.
As regards likelihood, the origins in Fisher’s own writing are quite obscure.
In the early days it was a tool for point estimation, but later it was elevated to a
principle, again with deep mystery.
On fiducial inference, Fisher’s early writings had a superficial transparency
which convinced many of its correctness, and was thought to be the answer to
the age-old problem of induction. But, obviously, Fisher was unable to convey
his ideas to anyone, and, further, Fisher did not attach weight to the fact that
fiducial calculations were possible only in a very limited set of conditions, quite
inadequate for the broad purposes of science.
The upshot of all this can only be feelings of wonderment and puzzlement
which Savage conveyed effectively, with respect, openness, and a highly sincere
attempt to understand.
Will the statistical profession ever reach the status of nearly absolute accept-
ance or rejection of any of Fisher’s ideas on inference? Or is the profession to
retain forever a psychosis of not understanding Fisher and suspecting that it is
stupid on that account?
The profession will be grateful for the indefinite future that L. J. Savage made
such a fine effort to help.
STEPHEN M. STIGLER
University of Wisconsin
REFERENCES
BRAVAIS, A. (1846). Analyse mathématique sur les probabilités des erreurs de situation d’un
point. Mémoires Acad. Sci. Paris, par Divers Savans 9 255-332.
EDGEWORTH, F. Y. (1921). Molecular statistics. J. Roy. Statist. Soc. 84 71-89.
SHEPPARD, W. F. (1899). On the application of the theory of error to cases of normal distribu-
tion and normal correlation. Philos. Trans. Roy. Soc. London Ser. A 192 101-167.
STIGLER, S. M. (1973). Laplace, Fisher, and the discovery of the concept of sufficiency. Bio-
metrika 60 439-445.
STIGLER, S. M. (1975). The transition from point to distribution estimation. 40th Session of
the I.S.I., Warsaw, Poland.