

STATISTICAL METHODS


GEORGE W. SNEDECOR
Professor Emeritus of Statistics
and Former Director, Statistical Laboratory
Iowa State University

OXFORD & IBH PUBLISHING CO.


Sixth Edition

WILLIAM G. COCHRAN
Professor of Statistics
Harvard University

Calcutta Bombay New Delhi


GEORGE W. SNEDECOR is professor emeritus of statistics, Iowa State University, where he
taught from 1913 to 1958 and where he was for fourteen years director of the statistical
laboratory. His writings include a body of scientific journal articles, research bulletins,
and books, including Correlation and Machine Calculation (with H. A. Wallace), Calculation
and Interpretation of Analysis of Variance and Covariance, and Statistical Methods. He holds
a master of science degree from the University of Michigan, and honorary doctor of science
degrees from North Carolina State University and Iowa State University. He is a member
of the International Statistical Institute, past president of the American Statistical Associa-
tion, and an honorary Fellow of the British Royal Statistical Society. He has served also as
consultant, Human Factors Division, U.S. Navy Electronics Laboratory, San Diego, Cali-
fornia, where he now lives.

WILLIAM G. COCHRAN is professor of statistics, Harvard University. He has served formerly


on the faculties of Johns Hopkins University, North Carolina State University, and Iowa
State University. He holds master of arts degrees from Glasgow University and Cambridge
University and an honorary master of arts degree from Harvard University. He is past
president of the American Statistical Association, the Institute of Mathematical Statistics,
and the Biometric Society. His writings include many research papers in the professional
journals of his field; Sampling Techniques, 2nd ed., 1963; and Experimental Designs (with
Gertrude M. Cox), 2nd ed., 1957.

© 1937, 1938, 1940, 1946, 1956, 1967 The Iowa State University Press
Ames, Iowa, U.S.A. All rights reserved
Sixth Edition, 1967

Indian Edition 1968 published by arrangement with the
original American publishers The Iowa State University Press, U.S.A.
Second Indian Reprint, 1975
Rs. 20.00

For sale in India, Pakistan, Burma, Ceylon and Indonesia

This book has been published on the paper supplied through
the Govt. of India at concessional rate

Published by Oxford & IBH Publishing Co., 66 Janpath, New Delhi 1,
and printed at Skylark Printers, New Delhi 55
Preface

In preparing the sixth edition we have kept in mind the two purposes
this book has served during the past thirty years. Prior editions have been
used extensively both as texts for introductory courses in statistics and as
reference sources of statistical techniques helpful to research workers in
the interpretation of their data.
As a text, the book contains ample material for a course extending
throughout the academic year. For a one-term course, a suggested list
of topics is given on the page preceding the Table of Contents. As in
past editions, the mathematical level required involves little more than
elementary algebra. Dependence on mathematical symbols has been
kept to a minimum. We realize, however, that it is hard for the reader to
use a formula with full confidence until he has been given proof of the
formula or its derivation. Consequently, we have tried to help the reader's
understanding of important formulas either by giving an algebraic proof
where this is feasible or by explaining on common-sense grounds the roles
played by different parts of the formula.
This edition retains also one of the characteristic features of the
book-the extensive use of experimental sampling to familiarize the reader
with the basic sampling distributions that underlie modern statistical
practice. Indeed, with the advent of electronic computers, experimental
sampling in its own right has become much more widely recognized as a
research weapon for solving problems beyond the current skills of the
mathematician.
Some changes have been made in the structure of the chapters, mainly
at the suggestion of teachers who have used the book as a text. The former
chapter 8 (Large Sample Methods) has disappeared, the retained material
being placed in earlier chapters. The new chapter 8 opens with an intro-
duction to probability, followed by the binomial and Poisson distributions
(formerly in chapter 16). The discussion of multiple regression (chapter
13) now precedes that of covariance and multiple covariance (chapter 14).
Chapter 16 contains two related topics, the analysis of two-way classifica-
tions with unequal numbers of observations in the sub-classes and the
analysis of proportions in two-way classifications. The first of these
topics was formerly at the end of a long chapter on factorial arrangements;
the second topic is new in this edition. This change seemed advisable for
two reasons. During the past twenty years there has been a marked in-
crease in observational studies in the social sciences, in medicine and public
health, and in operations research. In their analyses, these studies often
involve the handling of multiple classifications which present complexities
appropriate to the later sections of the book.
Finally, in response to almost unanimous requests, the statistical
tables in the book have been placed in an Appendix.
A number of topics appear for the first time in this edition. As in
past editions, the selection of topics was based on our judgment as to
those likely to be most useful. In addition to the new material on the
analysis of proportions in chapter 16, other new topics are as follows:
• The analysis of data recorded in scales having only a small number
of distinct values (section 5.8);
• In linear regression, the prediction of the independent variable
X from the dependent variable Y, sometimes called linear calibration
(section 6.14);
• Linear regression when X is subject to error (section 6.17);
• The comparison of two correlated estimates of variance (section
7.12);
• An introduction to probability (section 8.2);
• The analysis of proportions in ordered classifications (section
9.10);
• Testing a linear trend in proportions (section 9.11);
• The analysis of a set of 2 x 2 contingency tables (section 9.14);
• More extensive discussion of the effects of failures in the assump-
tions of the analysis of variance and of remedial measures (sections 11.10-
11.13);
• Recent work on the selection of variates for prediction in multiple
regression (section 13.13);
• The discriminant function (sections 13.14, 13.15);
• The general method of fitting non-linear regression equations and
its application to asymptotic regression (sections 15.7-15.8).
Where considerations of space permitted only a brief introduction
to the topic, references were given to more complete accounts.
Most of the numerical illustrations continue to be from biological
investigations. In adding new material, both in the text and in the exam-
ples to be worked by the student, we have made efforts to broaden the
range of fields represented by data. One of the most exhilarating features
of statistical techniques is the extent to which they are found to apply in
widely different fields of investigation.
High-speed electronic computers are rapidly becoming available as
a routine resource in centers in which a substantial amount of data are
analyzed. Flexible standard programs remove the drudgery of computa-
tion. They give the investigator vastly increased power to fit a variety of
mathematical models to his data; to look at the data from different points
of view; and to obtain many subsidiary results that aid the interpretation.
In several universities their use in the teaching of introductory courses in
statistics is being tried, and this use is sure to increase.
We believe, however, that in the future it will be just as necessary
that the investigator learn the standard techniques of analysis and under-
stand their meaning as it was in the desk machine age. In one respect,
computers may change the relation of the investigator to his data in an
unfortunate way. When calculations are handed to a programmer who
translates them into the language understood by the computer, the investi-
gator, on seeing the printed results, may lack the self-assurance to query
or detect errors that arose because the programmer did not fully under-
stand what was wanted or because the program had not been correctly de-
bugged. When data are being programmed it is often wise to include a
similar example from this or another standard book as a check that the
desired calculations are being done correctly.
For their generous permission to reprint tables we are indebted to
the late Sir Ronald Fisher and his publishers, Oliver and Boyd; to Maxine
Merrington, Catherine M. Thompson, Joyce M. May, E. Lord, and E. S.
Pearson, whose work was published in Biometrika; to C. I. Bliss, E. L.
Crow, C. White, and the late F. Wilcoxon; and to Bernard Ostle and his
publishers, The Iowa State University Press. Thanks are due also to the
many investigators who made data available to us as illustrative exam-
ples, and to teachers who gave helpful advice arising from their experience
in using prior editions as a text. The work of preparing this edition was
greatly assisted by a contract between the Office of Naval Research,
Navy Department, and the Department of Statistics, Harvard University.
Finally, we wish to thank Marianne Blackwell, Nancy Larson, James
DeGracie, and Richard Mensing for typing or proofreading, and especially
Holly Lasewicz for her help at many stages of the work. including the
preparation of the Indexes.
George W. Snedecor
William G. Cochran
A SHORT COURSE IN THE ELEMENTS OF
STATISTICAL METHOD

CHAPTER                                                      PAGES

 1  Attributes                                                3-31
 2  Measurements                                             32-61
 3  Sampling distributions                            66-74, 77-79
 4  Comparison of two samples                               91-104
 5  Non-parametric methods                                 120-128
 6  Regression                                    135-145, 149-157
 7  Correlation                                            172-177
 8  Binomial distribution                                  199-219
 9  One-way classifications-Attributes            228-231, 236-238
10  One-way classifications-Measurements                   258-271
11  Two-way classifications                                299-310
Table of contents

Chapter 1. Sampling of Attributes

1.1 Introduction 3
1.2 Purpose of this chapter 4
1.3 The twin problems of sampling 4
1.4 A sample of farm facts. Point and interval estimates 5
1.5 Random sampling 10
1.6 Tables of random digits 12
1.7 Confidence interval: verification of theory 14
1.8 The sampled population 15
1.9 The frequency distribution and its graphical representation 16
1.10 Hypotheses about populations 20
1.11 Chi-square, an index of dispersion 20
1.12 The formula for chi-square 21
1.13 An experiment in sampling chi-square; the sampling distribution 21
1.14 Comparison with the theoretical distribution 25
1.15 The test of a null hypothesis or test of significance 26
1.16 Tests of significance in practice 28
1.17 Summary of technical terms 29

Chapter 2. Sampling From a Normally Distributed Population

2.1 Normally distributed population 32
2.2 Reasons for the use of the normal distribution 35
2.3 Tables of the normal distribution 35
2.4 Estimators of μ and σ 39
2.5 The array and its graphical representation 40
2.6 Algebraic notation 41
2.7 Deviations from sample mean 42
2.8 Another estimator of σ; the sample standard deviation 44
2.9 Comparison of the two estimators of σ 46
2.10 Hints on the computation of s 47
2.11 The standard deviation of sample means 49
2.12 The frequency distribution of sample means 51
2.13 Confidence intervals for μ when σ is known 56
2.14 Size of sample 58
2.15 "Student's" t-distribution 59
2.16 Confidence limits for μ based on the t-distribution 61
2.17 Relative variation. Coefficient of variation 62
Chapter 3. Experimental Sampling From a Normal Population
3.1 Introduction 66
3.2 A finite population simulating the normal 66
3.3 Random samples from a normal distribution 69
3.4 The distribution of sample means 70
3.5 Sampling distribution of s² and s 72
3.6 Interval estimates of σ² 74
3.7 Test of a null hypothesis value of σ² 76
3.8 The distribution of t 77
3.9 The interval estimate of μ: the confidence interval 78
3.10 Use of frequency distributions for computing X̄ and s 80
3.11 Computation of X̄ and s in large samples: example 81
3.12 Tests of normality 84
3.13 A test of skewness 86
3.14 Tests for kurtosis 86
3.15 Effects of skewness and kurtosis 88

Chapter 4. The Comparison of Two Samples


4.1 Estimates and tests of differences 91
4.2 A simulated paired experiment 92
4.3 Example of a paired experiment 94
4.4 Conditions for pairing 97
4.5 Tests of other null hypotheses about μ 97
4.6 Comparison of the means of two independent samples 100
4.7 The variance of a difference 100
4.8 A pooled estimate of variance 101
4.9 An experiment comparing two groups of equal size 102
4.10 Groups of unequal sizes 104
4.11 Paired versus independent groups 106
4.12 Precautions against bias. Randomization 109
4.13 Sample size in comparative experiments 111
4.14 Analysis of independent samples when σ₁ ≠ σ₂ 114
4.15 A test of the equality of two variances 116

Chapter 5. Shortcut and Non-parametric Methods


5.1 Introduction 120
5.2 The t-test based on range 120
5.3 Median, percentiles, and order statistics 123
5.4 The sign test 125
5.5 Non-parametric methods: ranking of differences between measurements 128
5.6 Non-parametric methods: ranking for unpaired measurements 130
5.7 Comparison of rank and normal tests 132
5.8 Scales with limited values 132

Chapter 6. Regression
6.1 Introduction 135
6.2 The regression of blood pressure on age 135
6.3 Shortcut methods of computation in regression 139
6.4 The mathematical model in linear regression 141
6.5 Ŷ as an estimator of μ = α + βx 144
6.6 The estimator of σ²y·x 145
6.7 The method of least squares 147
6.8 The value of b in some simple cases 147
6.9 The situation when X varies from sample to sample 149
6.10 Interval estimates of β and tests of null hypotheses 153
6.11 Prediction of the population regression line 153
6.12 Prediction of an individual Y 155
6.13 Testing a deviation that looks suspiciously large 157
6.14 Prediction of X from Y. Linear calibration 159
6.15 Partitioning the sum of squares of the dependent variate 162
6.16 Galton's use of the term "regression" 164
6.17 Regression when X is subject to error 164
6.18 Fitting a straight line through the origin 166
6.19 The estimation of ratios 170
6.20 Summary 170

Chapter 7. Correlation

7.1 Introduction 172
7.2 The sample correlation coefficient r 173
7.3 Relation between the sample coefficients of correlation and regression 175
7.4 The bivariate normal distribution 177
7.5 Sampling variation of the correlation coefficient. Common elements 181
7.6 Testing the null hypothesis ρ = 0 184
7.7 Confidence limits and tests of hypotheses about ρ 185
7.8 Practical utility of correlation and regression 188
7.9 Variances of sums and differences of correlated variables 190
7.10 The calculation of r in a large sample 191
7.11 Non-parametric methods. Rank correlation 193
7.12 The comparison of two correlated variances 195

Chapter 8. Sampling From the Binomial Distribution

8.1 Introduction 199
8.2 Some simple rules of probability 199
8.3 The binomial distribution 202
8.4 Sampling the binomial distribution 205
8.5 Mean and standard deviation of the binomial distribution 207
8.6 The normal approximation and the correction for continuity 209
8.7 Confidence limits for a proportion 210
8.8 Test of significance of a binomial proportion 211
8.9 The comparison of proportions in paired samples 213
8.10 Comparison of proportions in two independent samples: the 2 x 2 table 215
8.11 Test of the independence of two attributes 219
8.12 A test by means of the normal deviate z 220
8.13 Sample size for comparing two proportions 221
8.14 The Poisson distribution 223

Chapter 9. Attribute Data With More Than One Degree of Freedom

9.1 Introduction 228
9.2 Single classifications with more than two classes 228
9.3 Single classifications with equal expectations 231
9.4 Additional tests 233
9.5 The χ² test when the expectations are small 235
9.6 Single classifications with estimated expectations 236
9.7 Two-way classifications. The 2 x C contingency table 238
9.8 The variance test for homogeneity of the binomial distribution 240
9.9 Further examination of the data 242
9.10 Ordered classifications 243
9.11 Test for a linear trend in proportions 246
9.12 Heterogeneity χ² in testing Mendelian ratios 248
9.13 The R x C table 250
9.14 Sets of 2 x 2 tables 253

Chapter 10. One-Way Classifications. Analysis of Variance


10.1 Extension from two samples to many 258
10.2 An experiment with four samples 258
10.3 The analysis of variance 260
10.4 Effect of differences between the population means 264
10.5 The variance ratio, F 265
10.6 Analysis of variance with only two classes 267
10.7 Comparisons among class means 268
10.8 Inspection of all differences between pairs of means 271
10.9 Shortcut computation using ranges 275
10.10 Model I. Fixed treatment effects 275
10.11 Effects of errors in the assumptions 276
10.12 Samples of unequal sizes 277
10.13 Model II. Random effects 279
10.14 Structure of model II illustrated by sampling 282
10.15 Confidence limits for σ²A 284
10.16 Samples within samples. Nested classifications 285
10.17 Samples within samples. Mixed model 288
10.18 Samples of unequal sizes. Random effects 289
10.19 Samples within samples. Unequal sizes 291
10.20 Intraclass correlation 294
10.21 Tests of homogeneity of variance 296

Chapter 11. Two-Way Classifications


11.1 Introduction 299
11.2 An experiment with two criteria of classification 299
11.3 Comparisons among means 301
11.4 Algebraic notation 302
11.5 Mathematical model for a two-way classification 303
11.6 Partitioning the treatments sum of squares 308
11.7 Efficiency of blocking 311
11.8 Latin squares 312
11.9 Missing data 317
11.10 Non-conformity to model 321
11.11 Gross errors: rejection of extreme observations 321
11.12 Lack of independence in the errors 323
11.13 Unequal error variances due to treatments 324
11.14 Non-normality. Variance-stabilizing transformations 325
11.15 Square-root transformation for counts 325
11.16 Arcsine transformation for proportions 327
11.17 The logarithmic transformation 329
11.18 Non-additivity 330
11.19 Tukey's test of additivity 331
11.20 Non-additivity in a Latin square 334
Chapter 12. Factorial Experiments
12.1 Introduction 339
12.2 The single factor versus the factorial approach 339
12.3 Analysis of the 2² factorial experiment 342
12.4 The 2² factorial when interaction is present 344
12.5 The general two-factor experiment 346
12.6 Response curves 349
12.7 Response curves in two-factor experiments 352
12.8 Example of a response surface 354
12.9 Three-factor experiments; the 2³ 359
12.10 Three-factor experiments; a 2 x 3 x 4 361
12.11 Expected values of mean squares 364
12.12 The split-plot or nested design 369
12.13 Series of experiments 375
12.14 Experiments with perennial crops 377

Chapter 13. Multiple Regression

13.1 Introduction 381
13.2 Two independent variables 381
13.3 The deviations mean square and the F-test 385
13.4 Alternative method of calculation. The inverse matrix 389
13.5 Standard errors of estimates in multiple regression 391
13.6 The interpretation of regression coefficients 393
13.7 Relative importance of different X-variables 398
13.8 Partial and multiple correlation 400
13.9 Three or more independent variables. Computations 403
13.10 Numerical example. Computing the b's 405
13.11 Numerical example. Computing the inverse matrix 409
13.12 Deletion of an independent variable 412
13.13 Selection of variates for prediction 412
13.14 The discriminant function 414
13.15 Numerical example of the discriminant function 416

Chapter 14. Analysis of Covariance

14.1 Introduction 419
14.2 Covariance in a completely randomized experiment 420
14.3 The F-test of the adjusted means 424
14.4 Covariance in a 2-way classification 425
14.5 Interpretation of adjusted means in covariance 429
14.6 Comparison of regression lines 432
14.7 Comparison of the "Between Classes" and the "Within Classes" regressions 436
14.8 Multiple covariance 438
14.9 Multiple covariance in a 2-way table 443

Chapter 15. Curvilinear Regression

15.1 Introduction 447
15.2 The exponential growth curve 449
15.3 The second degree polynomial 453
15.4 Data having several Y's at each X value 456
15.5 Test of departure from linear regression in covariance analysis 460
15.6 Orthogonal polynomials 460
15.7 A general method of fitting non-linear regressions 465
15.8 Fitting an asymptotic regression 467

Chapter 16. Two-Way Classifications With Unequal Numbers and Proportions

16.1 Introduction 472
16.2 Unweighted analysis of cell means 475
16.3 Equal numbers within rows 477
16.4 Proportional sub-class numbers 478
16.5 Disproportionate numbers. The 2 x 2 table 483
16.6 Disproportionate numbers. The R x 2 table 484
16.7 The R x C table. Least squares analysis 488
16.8 The analysis of proportions in 2-way tables 493
16.9 Analysis in the p scale: a 2 x 2 table 495
16.10 Analysis in the p scale: a 3 x 2 table 496
16.11 Analysis of logits in an R x C table 497
16.12 Numerical example 498

Chapter 17. Design and Analysis of Sampling

17.1 Populations 504
17.2 A simple example 505
17.3 Probability sampling 508
17.4 Listing the population 509
17.5 Simple random sampling 511
17.6 Size of sample 516
17.7 Systematic sampling 519
17.8 Stratified sampling 520
17.9 Choice of sample sizes in the individual strata 523
17.10 Stratified sampling for attributes 526
17.11 Sampling in two stages 528
17.12 The allocation of resources in two-stage sampling 531
17.13 Selection with probability proportional to size 534
17.14 Ratio and regression estimates 536
17.15 Further reading 538

Appendix
List of Appendix Tables and Notes 541
Appendix Tables 543
Author Index 577
Index to Numerical Examples 581
Subject Index 585
STATISTICAL METHODS
Sampling of attributes
1.1-Introduction. The subject matter of the field of statistics has
been described in various ways. According to one definition, statistics
deals with techniques for collecting, analyzing, and drawing conclusions
from data. This description helps to explain why an introduction to sta-
tistical methods is useful to students who are preparing themselves for a
career in one of the sciences and to persons working in any branch of
knowledge in which much quantitative research is carried out. Such re-
search is largely concerned with gathering and summarizing observations
or measurements made by planned experiments, by questionnaire surveys,
by the records of a sample of cases of a particular kind, or by combing
past published work on some problem. From these summaries, the in-
vestigator draws conclusions that he hopes will have broad validity.
The same intellectual activity is involved in much other work of im-
portance. Samples are extensively used in keeping a continuous watch on
the output of production lines in industry, in obtaining national and
regional estimates of crop yields and of business and employment condi-
tions, in the auditing of financial statements, in checking for the possible
adulteration of foods, in gauging public opinion and voter preferences, in
learning how well the public is informed on current issues, and so on.
Acquaintance with the main ideas in statistical methodology is also
an appropriate part of a general education. In newspapers, books, tele-
vision, radio, and speeches we are all continuously exposed to statements
that draw general conclusions: for instance, that the cost of living rose by
0.3% in the last month, that the smoking of cigarettes is injurious to health,
that users of "Blank's" toothpaste have 23% fewer cavities, that a tele-
vision program had 18.6 million viewers. When an inference of this kind
is of interest to us, it is helpful to be able to form our own judgment about
the truth of the statement. Statistics has no magic formula for doing this
in all situations, for much remains to be learned about the problem of
making sound inferences. But the basic ideas in statistics assist us in
thinking clearly about the problem, provide some guidance about the
conditions that must be satisfied if sound inferences are to be made, and
enable us to detect many inferences that have no good logical foundation.
1.2-Purpose of this chapter. Since statistics deals with the collection,
analysis, and interpretation of data, a book on the subject might be ex-
pected to open with a discussion of methods for collecting data. Instead,
we shall begin with a simple and common type of data already collected,
the replies to a question given by a sample of the farmers in a county, and
discuss the problem of making a statement from this sample that will
apply to all farmers in the county. We begin with this problem of making
inferences beyond the data because the type of inference that we are try-
ing to make governs the way in which the data must be collected. In
earlier days, and to some extent today also, many workers did not appre-
ciate this fact. It was a common experience for statisticians to be ap-
proached with: Here are my results. What do they show? Too often the
data were incapable of showing anything that would have been of interest
to an investigator, because the method of collecting the data failed to
meet the conditions needed for making reliable inferences beyond the
data.
In this chapter, some of the principal tools used in statistics for mak-
ing inferences will be presented by means of simple illustrations. The
mathematical basis of these tools, which lies in the theory of probability,
will not be discussed until later. Consequently, do not expect to obtain a
full understanding of the techniques at this stage, and do not worry if the
ideas seem at first unfamiliar. Later chapters will give you further study
of the properties of these techniques and enhance your skill in applying
them to a broad range of problems.
1.3-The twin problems of sampling. A sample consists of a small
collection from some larger aggregate about which we wish information.
The sample is examined and the facts about it learned. Based on these
facts, the problem is to make correct inferences about the aggregate or
population. It is the sample that we observe, but it is the population which
we seek to know.
This would be no problem were it not for ever-present variation. If
all individuals were alike, a sample consisting of a single one would give
complete information about the population. Fortunately, there is end-
less variety among individuals as well as their environments. A conse-
quence is that successive samples are usually different. Clearly, the facts
observed in a sample cannot be taken as facts about the population. Our
job then is to reach appropriate conclusions about the population despite
sampling variation.
But not every sample contains information about the population
sampled. Suppose the objective of an experimental sampling is to de-
termine the growth rate in a population of young mice fed a new diet. Ten
of the animals are put in a cage for the experiment. But the cage gets
located in a cold draught or in a dark corner. Or an unnoticed infection
spreads among the mice in the cage. If such things happen, the growth
rate in the sample may give no worthwhile information about that in the
population of normal mice. Again, suppose an interviewer in an opinion
poll picks only families among his friends whom he thinks it will be pleas-
ant to visit. His sample may not at all represent the opinions of the popula-
tion. This brings us to a second problem: to collect the sample in such a
way that the sought-for information is contained in it.
So we are confronted with the twin problems of the investigator: to
design and conduct his sampling so that it shall be representative of the
population; then, having studied the sample, to make correct inferences
about the sampled population.
1.4-A sample of farm facts. Point and interval estimates. In 1950
the USDA Division of Cereal and Forage Insect Investigations, cooperat-
ing with the Iowa Agricultural Experiment Station, conducted an exten-
sive sampling in Boone County, Iowa, to learn about the interrelation of
factors affecting control of the European corn borer.* One objective
of the project was to determine the extent of spraying or dusting for control
of the insects. To this end a random sample of 100 farmers were inter-
viewed; 23 of them said they applied the treatment to their corn fields.
Such are the facts of the sample.
What inferences can be made about the population of 2,300 Boone
County farmers? There are two of them. The first is described as a point
estimate, while the second is called an interval estimate.
I. The point estimate of the fraction of farmers who sprayed is 23%,
the same as the sample ratio; that is, an estimated 23% of Boone County
farmers sprayed their corn fields in 1950. This may be looked upon as an
average of the numbers of farmers per hundred who sprayed. From the
actual count of sprayers in a single hundred farmers it is inferred that the
average number of sprayers in all possible samples of 100 is 23.
This sample-to-population inference is usually taken for granted.
Most people pass without a thought from the sample fact to this inference
about the population. Logically, the two concepts are distinct. It is wise
to examine the procedure of the sampling before attributing to the popu-
lation the percentage reported in a sample.
2. An interval estimate of the point is made by use of table 1.4.1. In
the first part of the table, indicated by 95% in the heading, look across the
top line to the sample size of 100, then down the left-hand column to the
number (or frequency) observed, 23 farmers. At the intersection of the
column and line you will find the figures 15 and 32. The meaning is this:
one may be confident that the true percentage in the sampled population
lies in the interval from 15% to 32%. This interval estimate is called the
confidence interval. The nature of our confidence will be explained later.
In summary: based on a random sample, we said first that our esti-
mate of the percentage of sprayers in Boone County was 23%, but we gave
no indication of the amount by which the estimate might be in error. Next
we asserted confidently that the true percentage was not farther from our
point estimate, 23%, than 8 percentage points below or 9 above.
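Table 1.4.1 is computed from the binomial distribution. As a computational cross-check of the table lookup, here is a minimal sketch (assuming Python with SciPy, which of course postdates this book) using the Clopper-Pearson exact-binomial method, whose limits agree with the figures quoted above:

from scipy.stats import beta

def clopper_pearson(f, n, conf=0.95):
    # Exact binomial confidence limits for a proportion, computed
    # from quantiles of the beta distribution.
    alpha = 1 - conf
    lower = beta.ppf(alpha / 2, f, n - f + 1) if f > 0 else 0.0
    upper = beta.ppf(1 - alpha / 2, f + 1, n - f) if f < n else 1.0
    return lower, upper

# The Boone County sample: 23 sprayers observed among n = 100 farmers.
low, high = clopper_pearson(23, 100)
print(round(100 * low), round(100 * high))   # 15 and 32, as read from table 1.4.1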
Let us illustrate these concepts in another fashion. Imagine a bin
* Data furnished courtesy of Dr. T. A. Brindley.
TABLE 1.4.1
95% CONFIDENCE INTERVAL (PER CENT) FOR BINOMIAL DISTRIBUTION (1)*

[The table gives the lower and upper 95% confidence limits, in per cent, for each number observed f = 0, 1, ..., 50 in samples of size n = 10, 15, 20, 30, 50, and 100, and for each fraction observed f/n = .00, .01, ..., .50 in samples of size n = 250 and 1,000. For example, for f = 23 in a sample of n = 100 the limits are 15 and 32.]

* Reference (1) at end of chapter.
† If f exceeds 50, read 100 - f = number observed and subtract each confidence limit
from 100.
†† If f/n exceeds 0.50, read 1.00 - f/n = fraction observed and subtract each confidence
limit from 100.
TABLE 1.4.1 (Continued)
99% CONFIDENCE INTERVAL (PER CENT) FOR BINOMIAL DISTRIBUTION (1)*

[Laid out like the 95% table: lower and upper 99% confidence limits, in per cent, for each number observed f = 0, 1, ..., 50 at n = 10, 15, 20, 30, 50, and 100, and for each fraction observed f/n = .00, .01, ..., .50 at n = 250 and 1,000. For example, for f = 23 in a sample of n = 100 the limits are 13 and 35.]

* Reference (1) at end of chapter.
† If f exceeds 50, read 100 - f = number observed and subtract each confidence limit
from 100.
†† If f/n exceeds 0.50, read 1.00 - f/n = fraction observed and subtract each confidence
limit from 100.
filled with beans, some white and some colored, thoroughly mixed. Dip
out a scoopful of them at random, count the number of each color and
calculate the percentage of white, say 40%. Now this is not only a count
of the percentage of white beans in the sample but it is an estimate of the
fraction of white beans in the bin. How close an estimate is it? That is
where the second inference comes in. If there were 250 beans in the scoop,
we look at the table for size of sample 250, fraction observed = 0.40. From
the table we say with confidence that the percentage of white beans in the
bin is between 34% and 46%.
So far we have given no measure of the amount of confidence which
can be placed in the second inference. The table heading is "95% Con-
fidence Interval," indicating a degree of confidence that can be described
as follows: If the sampling is repeated indefinitely, each sample leading to
a new confidence interval (that is, to a new interval estimate), then in 95%
of the samples the interval will cover the true population percentage. If
one makes a practice of sampling and if for each sample he states that the
population percentage lies within the corresponding confidence interval,
about 95% of his statements will be correct. Other and briefer descriptions
will be proposed later.
If you feel unsafe in making inferences with the chance of being
wrong in 5% of your statements, you may use the second part of the table,
"99"10 Confidence Interval." For the Boone County sampling the interval
widens to 13%-35%. If one says that the population percentage lies with-
in these limits, he will be right unless a one-in-a-hundred chance has oc-
curred in the sampling.
If the size of the population is known, as it is in the case of Boone
County farmers, the point and interval estimates can be expanded from
percentages to numbers of individuals. There were 2,300 farmers in the
county. Thus we estimate the number of sprayers in Boone County in
1950 as
(0.23)(2,300) = 529 farmers
In the same way, since the 95% confidence interval extends from 15%
to 32% of the farmers, the 95% limits for the number of farmers who
sprayed are
(0.15)(2,300) = 345 farmers; and (0.32)(2,300) = 736 farmers
Two points about interval estimates need emphasis. First, the con-
fidence statement is a statement about the population ratio, not about
the ratio in other samples that might be drawn. Second, the uncertainty
involved comes from the sampling process. Each sample specifies an
interval estimate. Whether or not the interval happens to include the
fixed population ratio is a hazard of the process. Theoretically, the 95%
confidence intervals are determined so that 95% of them will cover the
true value.
Before a sample is drawn, one can specify the probability of the truth
of his prospective confidence statement. He can say, "I expect to take a
random sample and to make an interval estimate from it. The probability
is 0.95 that the interval will cover the population fraction." After the
sample is drawn, however, the confidence statement is either true or it is
false. Consequently, in reporting the results of the Boone County sam-
pling, it would be incorrect to say, "The probability is 0.95 that the number
of sprayers in Boone County in 1950 lies between 345 and 736." This
logical point is a subtle one, and does not weaken the effectiveness of
confidence interval statements. In a specific application, we do not know
whether our confidence statement is one of the 95% that are correct or one
of the 5% that are wrong. There are methods, in particular the method
known as the Bayesian approach, that provide more definite probability
statements about a single specific application, but they require more
assumptions about the nature of the population that is being sampled.
The heading of this chapter is "Sampling of Attributes." In the
numerical example the attribute in question was whether the farm had
been sprayed or not. The possession or lack of an attribute distinguishes
the two classes of individuals making up the population. The data from
the sample consist of the numbers of members of the sample found to have
or to lack the attribute under investigation. The sampling of populations
with two attributes is very common. Examples are Yes or No answers to
a question, Success or Failure in some task, patients Improved or Not
Improved under a medical treatment, and persons who Like or Dislike
some proposal. Later (chapter 9) we shall study the sampling of popula-
tions that have more than two kinds of attributes, such as persons who are
Strongly Favorable, Mildly Favorable, Neutral, Mildly Unfavorable, or
Strongly Unfavorable to some proposal. The theory and methods for
measurement data, such as heights, weights, or ages, will be considered
in chapter 2.
This brief preview displays a goodly portion of the wares that the
statistician has to offer: the sampling of populations, examination of the
facts turned up by the sample, and, based on these facts, inferences about
the sampled population. Before going further, you may clarify your
thinking by working a few examples.
Examples form an essential part of our presentation of statistics.
In each list they are graded so that you may start with the easier. It is
suggested that a few in each group be worked after the first reading of the
text, reserving the more difficult until experience is enlarged. Statistics
cannot be mastered without this or similar practice.
EXAMPLE 1.4.1-In controlling the quality of a mass-produced article in industry, a
random sample of 100 articles from a large lot were each tested for effectiveness. Ninety-
two were found effective. What are the 99% confidence limits for the percentage of effective
articles in the whole lot? Ans. 83% and 97%. Hint: look in the table for 100 - 92 = 8.

EXAMPLE 1.4.2-If 1,000 articles in the preceding example had been tested and only
8% found ineffective, what would be the 99% limits? Ans. Between 90% and 94% are effec-
tive. Note how the limits have narrowed as a result of the increased sample size.
10 Chapter I: Sampling of Allribute.
EXAMPLE 1.4.3-A sampler of public opinion asked 50 men to express their prefer-
ences between candidates A and B. Twenty preferred A. Assuming random sampling from
a population of 5,000, the sampler stated that between 1,350 and 2,750 in the population
preferred A. What confidence interval was he using? Ans. 95%.
EXAMPLE 1.4.4-In a health survey of adults, 86% stated that they had had measles
at some time in the past. On the basis of this sample the statistician asserted that unless a
1-in-20 chance had occurred, the percentage of adults in the population who had had measles
was between 81% and 90%. Assuming random sampling, what was the size of the sample?
Ans. 250. Note: the statistician's inference may have been incorrect for other reasons.
Some people have a mild attack of measles without realizing it. Others may have forgotten
that they had it. Consequently, the confidence limits may be underestimates for the per-
centage in the population who actually had measles, as distinct from the percentage who
would state that they had it.
EXAMPLE 1.4.5-If in the sample of 100 Boone County farmers none had sprayed,
what 95% confidence statement would you make about the farmers in the county? Ans.
Between none and 4% sprayed. But suppose that all farmers in the sample were sprayers,
what is the 99% confidence interval? Ans. 95%-100%.
EXAMPLE 1.4.6-If you guess that in a certain population between 25% and 75% of
the housewives own a specified appliance, and if you wish to draw a sample that will, at the
95% confidence level, yield an estimate differing by not more than 6 from the correct percent-
age, about how large a sample must you take? Ans. 250.
EXAMPLE 1.4.7-An investigator interviewed 115 women over 40 years of age from
the lower middle economic level in rural areas of middle western states. Forty-six of them had
listened to a certain radio program three or more times during the preceding month. As-
suming random sampling, what statement can be made about the percentage of women
listening in the population, using the 99% interval? Ans. Approximately, between 28.4%
and 52.5% listen. You will need to interpolate between the results for n = 100 and n = 250.
Appendix A 1 (p. 541) gives hints on interpolation.
EXAMPLE 1.4.8-For samples that show 50% in a certain class, write down the width
of the 95% confidence interval for n = 10, 20, 30, 50, 100, 250, and 1,000. For each sample
size n, multiply the width of the interval by √n. Show that the product is always near 200.
This means that the width of the interval is approximately related to the sample size by the
formula W = 200/√n. We say that the width goes down as 1/√n.
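Example 1.4.8 can be checked quickly by machine. A minimal sketch (in Python; it uses the large-sample normal approximation to the binomial rather than the exact table values, so the product W√n comes out at 196, near the 200 found from the table):

import math

for n in [10, 20, 30, 50, 100, 250, 1000]:
    # Width, in per cent, of the approximate 95% interval when the sample shows 50%
    w = 2 * 1.96 * math.sqrt(0.25 / n) * 100
    print(n, round(w, 1), round(w * math.sqrt(n)))   # w * sqrt(n) = 196 for every n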

1.5-Random sampling. The confidence intervals in table 1.4.1 were
computed mathematically on the assumption that the data are a random
sample from the population. In its simplest form, random sampling
means that every member of the population has an equal chance of ap-
pearing in the sample, independently of the other members that happen
to fall in the sample. Suppose that the population has four members,
numbered 1, 2, 3, 4, and that we are drawing samples of size two. There
are ten possible samples that contain two members: namely, (1, 2), (1, 3),
(1, 4), (2, 3), (2, 4), (3, 4), (1, 1), (2, 2), (3, 3), and (4, 4). With simple
random sampling, each of these ten samples has an equal chance of being
the sample that is drawn. Notice two things. Every member appears
once in three samples and twice in one sample, so that the sampling shows
no favoritism as between one member and another. Secondly, look at
the four samples in which a 1 appears, (1, 2), (1, 3), (1, 4), and (1, 1). The
second member is equally likely to be a 1, 2, 3, or 4. Thus, if we are told
that 1 has been drawn as the first member of the sample, we know that
each member of the population still has an equal chance of being the sec-
"
ond member of the sample. This is what is meant by the phrase "inde-
pendently of the other members that happen to fall in the sample."
A common variant of this method of sampling is to allow any mem-
ber of the population to appear only once in the sample. There are then
six possible samples of size two: (1, 2), (1, 3), (1, 4), (2, 3), (2, 4), and (3, 4).
This is the kind of sampling that occurs when two numbers are drawn out
of a hat, no number being replaced in the hat. This type of sampling is
called random sampling without replacement, whereas the sampling de-
scribed in the preceding paragraph is random sampling with replacement.
If the sample is a small fraction of the population, the two methods are
practically identical, since the possibility that the same item appears
more than once in a sample is negligible. Throughout most of the book
we shall not distinguish between the two methods. In chapter 17, for-
mulas applicable to sampling without replacement are presented.
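The two enumerations just given can be reproduced mechanically. A minimal sketch (in Python, treating a sample as an unordered pair, as in the text):

from itertools import combinations, combinations_with_replacement

population = [1, 2, 3, 4]

# Sampling with replacement: the ten possible samples of size two
print(list(combinations_with_replacement(population, 2)))

# Sampling without replacement: the six possible samples of size two
print(list(combinations(population, 2)))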
There are more complex types of random sampling. In all of them,
every member of the population has a known probability of coming into
the sample, but these probabilities may not be equal or they may depend,
in a known way, on the other members that are in the sample. In the
Boone County sampling a book was available showing the location of
every farm in the county. Each farm was numbered so that a random
sample could have been drawn by mixing the numbers thoroughly in a
box, then having a hundred of them drawn by a blindfolded person.
Actually, the samplers used a scheme known as stratified random sampling.
From the farms in each township (a subdivision of the county) they drew
a random sample with a size proportional to the number of farms in that
township. In this example, each farm still has an equal chance of appear-
ing in the sample, but the sample is constructed to contain a specified
number from every township. The chief advantage is to spread the sam-
ple more uniformly over the county, retaining the principle of random-
ness within each township. Statistical methods for stratified samples
are presented in chapter 17. The conclusions are only slightly altered by
considering the sample completely random. Unless otherwise mentioned,
we will use the phrases "random sample" and "random sampling" to
denote the simplest type of random sampling with replacement as de-
scribed in the first paragraph of this section.
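The mechanics of a proportional stratified draw can be sketched as follows (in Python; the township names and farm counts are invented for illustration and are not the actual Boone County figures):

import random

random.seed(3)
# Hypothetical townships and their farm counts, totalling 2,300 farms
townships = {"A": 460, "B": 690, "C": 575, "D": 575}
n_total = 100

for name, n_farms in townships.items():
    n_stratum = round(n_total * n_farms / 2300)         # allocation proportional to size
    chosen = random.sample(range(n_farms), n_stratum)   # random draw within the township
    print(name, n_stratum)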
An important feature of all random sampling schemes is that the
sampler has no control over the specific choice of the units that appear
in the sample. If he exercises judgment in this selection, by choosing
"typical" members or excluding members that appear "atypical," his
results are not amenable to probability theory, and confidence intervals,
which give valuable information about the accuracy of estimates made
from the sample, cannot be constructed.
In some cases the population is thoroughly mixed before the sample
is taken, as illustrated by the macerating and blending of food or other
chemical products, by a naturally mixed aggregate such as the blood
stream, or by the sampling of a liquid from a vat that has been repeatedly
stirred. Given an assurance of thorough mixing, the sample can be drawn
from the most accessible part of the population, because any sample
should give closely similar results. But complete mixing in this sense is
often harder to achieve than is realized. With populations that are vari-
able but show no clear pattern of variation, there is a temptation to con-
clude that the population is naturally mixed in a random fashion, so that
any convenient sample will behave like one randomly drawn. This
assumption is hazardous, and is difficult to verify without a special in-
vestigation.
One way of drawing a random sample is to list the members of the
population in some order and write these numbers on slips of paper,
marbles, beans, or small pieces of cardboard. These are placed in a box or
bag, mixed carefully, and drawn out, with eyes shut, one by one until
the desired size of sample is reached. With small populations this method
is convenient, and was much used in the past for classroom exercises.
It has two disadvantages. With large populations it is slow and unwieldy.
Further, tests sometimes show that if a large number of samples are drawn,
the samples differ from random samples in a noticeable way, for instance
by having certain members of the population present more frequently
than they should be. In other words, the mixing was imperfect.
1.6-Tables of random digits. Nowadays, samples are mostly drawn
by the use of tables of random digits. These tables are produced by a
process-usually mechanical or electrical-that gives each of the digits
from 0 to 9 an equal chance of appearing at every draw. Before publica-
tion of the tables, the results of the drawings are checked in numerous
ways to ensure that the tables do not depart materially from randomness
in a manner that would vitiate the commonest usages of the tables. Table
A 1 (p. 543) contains 10,000 such digits, arranged in 5 x 5 blocks to facili-
tate reading. There are 100 rows and 100 columns, each numbered from
00 to 99. Table 1.6.1 shows the first 100 numbers from this table.
The chaotic appearance of the set of numbers is evident. To illus-
trate how the table is used with attribute data, suppose that 50% of the
members of a population answer "Yes" to some question. We wish to
study how well the proportion answering "Yes" is estimated from a sam-

TABLE 1.6.1
ONE HUNDRED RANDOM DIGITS FROM TABLE A 1

        00-04    05-09    10-14    15-19

00      54463    22662    65905    70639
01      15389    85205    18850    39226
02      85941    40756    82414    02015
03      61149    69440    11286    88218
04      05219    81619    10651    67079
ple of size 20. A "Yes" answer can be represented by the appearance
of one of the digits 0, 1, 2, 3, 4, or alternatively by the appearance of an
odd digit. With either choice, the probability of a "Yes" at any draw
in the table is one-half. We shall choose the digits 0, 1, 2, 3, 4 to represent
"Yes," and let each row represent a different sample of size 20. A
count, much quicker than drawing slips of paper from a box, shows
that the successive rows in table 1.6.1 contain 9, 9, 12, 11, and 9 "Yes"
answers. Thus, the proportions of "Yes" answers in these five samples
of size 20 are, respectively, 0.45, 0.45, 0.60, 0.55, and 0.45. Continuing
in this way we can produce estimates of the proportion of "Yes" an-
swers given by a large number of separate samples of size 20. and then
examine how close the estimates are to the population value. In count-
ing the row numbered 02, you may notice a run of results that is typical
of random sampling. The row ends with a succession of eight consecu-
tive "Yes" answers, followed by a single "No." Observing this phe-
nomenon by itself, one might be inclined to conclude that the proportion
in the population must be larger than one-half, or that something is
wrong with the sampling process.
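The counting scheme just described is easy to imitate by computer. The
following short Python sketch (an illustration added for modern readers, not
part of the original text) draws rows of 20 random digits in place of rows of
table A 1 and records the proportion of "Yes" answers, a digit from 0 to 4,
in each row:

    import random

    random.seed(1)  # any fixed seed; used only to make the illustration repeatable

    def yes_proportion(row):
        # digits 0-4 represent a "Yes" answer
        return sum(1 for d in row if d <= 4) / len(row)

    for i in range(5):
        row = [random.randint(0, 9) for _ in range(20)]
        print("row %02d: proportion Yes = %.2f" % (i, yes_proportion(row)))

Each run produces five estimates of the population proportion 0.50, scattered
about it in the same way as the proportions 0.45, 0.45, 0.60, 0.55, and 0.45
counted above.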
Table A 1 can also be used to investigate sampling in which the proportion
in the population is any of the numbers 0.1, 0.2, 0.3, ... 0.9.
With 0.3, for example, we let the digits 0, 1, or 2 represent the presence of
the attribute and the remaining seven digits its absence. If you are interested
in a population in which the proportion is 0.37, the method is to select
pairs of digits, letting any pair between 00 and 36 denote the presence of
the attribute. Tables of random digits are employed in studying a wide
range of sampling problems. You can probably see how to use them to
answer such questions as: On the average, how many digits must be taken
until a 1 appears? Or, how frequently does a 3 appear before either a
1 or a 9 has appeared? In fact, sampling from tables of random digits
has become an important technique for solving difficult problems in
probability for which no mathematical solution is known at present.
This technique goes by the not inappropriate name of the Monte Carlo
method. For this reason, modern electronic computing machines have
programs available for creating their own tables of random digits as they
proceed with their calculations.
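The Monte Carlo idea can be shown in miniature. The following Python
sketch (ours, not the authors') estimates the answer to the first question
above, the average number of digits that must be taken until a 1 appears;
the theoretical mean is 10:

    import random

    random.seed(2)

    def draws_until_one():
        # count random digits drawn until the digit 1 first appears
        n = 0
        while True:
            n += 1
            if random.randint(0, 9) == 1:
                return n

    trials = [draws_until_one() for _ in range(100000)]
    print(sum(trials) / len(trials))  # close to the theoretical mean of 10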
To the reader who is using random numbers for his own purposes,
we suggest that he start on the first page and proceed systematically
through the table. At the end of any problem, note the rows and columns
used and the direction taken in counting. This is sometimes needed for
later reference or in communicating the results to others. Since no digit
is used more than once, the table may become exhausted, but numerous
tables are available. Reference (2) contains 1 million digits. In classroom
use, when a number of students are working from the same table, obtaining
samples whose results will be put together, different students can start
at different parts of the table and also vary the direction in which they
proceed, in order to avoid duplicating the results of others.
1.7-Confidence interval: verification of theory. One who draws
samples from a known population is likely to be surprised at the capricious
way in which the items turn up. It is a salutary discipline for a student
or investigator to observe the laws of chance in action lest he become too
confident of his professional samplings. At this point we recommend that
a number of samples be selected from a population in which the proportion
of "Yes" answers is one-half. Vary the sample sizes, choosing some
of each of the sizes 10, 15, 20, 30, 50, 100, and 250 for which confidence
intervals are given in table 1.4.1 (1,000 is too large). For each sample,
record the sample size and the set of rows and columns used in the table
of random digits. From the number of "Yes" answers and the sample
size, read table 1.4.1 to find the 95% and 99% confidence intervals for the
percentage of "Yes" answers in the population. For each sample, you
can then verify whether the confidence interval actually covers 50%. If
possible, draw 100 or more samples, since a large number of samples is
necessary for any close verification of the theory, particularly with the 99%
intervals. In a classroom exercise it is wise to arrange for combined
presentation and discussion of the results from the whole class. Preserve
the results (sample sizes and numbers of "Yes" answers), since they will
be used again later.
You have now done experimentally what the mathematical statis-
tician does theoretically when he studies the distribution of samples
drawn at random from a specified population.
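Readers without table 1.4.1 at hand can still carry out a version of this
exercise by computer. The sketch below is only an approximation to the
text's procedure: it uses the normal-approximation interval mentioned in
connection with section 8.7, rather than the tabulated limits, which differ
somewhat for small samples.

    import math, random

    random.seed(3)

    def approx_interval(yes, n, z=1.96):
        # approximate 95% limits for the population proportion
        p = yes / n
        half = z * math.sqrt(p * (1 - p) / n)
        return p - half, p + half

    covered, samples = 0, 1000
    for _ in range(samples):
        yes = sum(random.randint(0, 9) % 2 for _ in range(50))  # odd digit = "Yes"
        lo, hi = approx_interval(yes, 50)
        if lo <= 0.5 <= hi:
            covered += 1
    print(covered / samples)  # close to 0.95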
For illustration, suppose that an odd digit represents a "Yes"
answer, and that the first sample, of size 50, is the first column of table A 1.
Counting down the column, you will find 24 odd digits. From table 1.4.1,
the 95% confidence interval extends from 36% to 64%, a correct verdict
because it includes the population value of 50%. But suppose one of your
samples of 250 had started at row 85, column 23. Moving down the successive
columns you would count only 101, or 40.4%, odd, and would
assert that the true value is between 34% and 46%. You would be wrong
despite the fact that the sample is randomly drawn from the same population
as the others. This sample merely happens to be unusually divergent.
You should find about five samples in a hundred leading to incorrect
statements, but there will be no occasion for surprise if only three,
or as many as seven, turn up. With confidence probability 99% you expect,
of course, only about one statement in a hundred to be wrong. We
hope that your results are sufficiently concordant with theory to give
you confidence in it. You will certainly be more aware of the vagaries
of sampling, and this is one of the objectives of the experiment. Another
lesson to be learned is that only broad confidence intervals can be based
on small samples, and that even so the inference can be wrong.
Finally, as is evident in table 1.4.1, you may have observed that the
interval narrows rather slowly with increasing sample size. For samples
of size 100 that show a percentage of "Yes" answers anywhere between
40% and 60%, the 95% confidence interval is consistently of width 20%.
With a sample ten times as large (n = 1,000) the width of the interval
decreases to 6%. The width goes down roughly as the square root of the
sample size, since 20/6 is 3.3 and √10 is 3.2 (this result was verified in
example 1.4.8).
Failure to make correct inferences in a small portion of the samples
is not a fault that can be remedied, but a fault inevitably bound up in the
sampling procedure. Fallibility is in the very nature of such evidence.
The sampler can only take available precautions, then prepare himself for
his share of mistakes. In this he is not alone. The journalist, the judge,
the banker, the weather forecaster-these along with the rest of us are
subject to the laws of chance, and each makes his own quota of wrong
guesses. The statistician has this advantage: he can, in favorable circum-
stances, know his likelihood of error.
1.8-The sampled population. Thus far we have learned that if we
want to obtain some information about a population that is too large to
be completely studied, one way to do this is to draw a random sample
and construct point and interval estimates, as in the Boone County exam-
ple. This technique of making inferences from sample to population is
one of the principal tools in the analysis of data. The data, of course,
represent the sample, but the concept of the population requires further
discussion. In many investigations in which data are collected, the popu-
lation is quite specific, apart possibly from some problems of definition:
the patients in a hospital on a particular day, the payments received by a
firm during the preceding year, and so on. In such cases the investigator
often proceeds to select a simple random sample, or to use one of the more
elaborate methods of sampling to be presented in chapter 17, and makes
inferences directly from his sample to his population.
With a human population, however, the population actually sampled
may be narrower than the original population because some persons
drawn into the sample cannot be located, are ill, or refuse to answer the
questions asked. Non-responses of this kind in 5% to 15% of the sample
are not uncommon. The population to which statistical inferences apply
must be regarded as the aggregate of persons who would supply answers
if drawn into the sample.
Further, for reasons of feasibility or expense, much research is carried
out on populations that are greatly restricted as compared to the population
about which, ideally, the investigator would like to gain information.
In psychology and education the investigator may concentrate on the
students at a particular university, although he hopes to find results that
apply to all young men of college age in the country. If the measuring
process is troublesome to the person being measured, the research worker
may have to depend on paid volunteers. In laboratory research on animals
the sample may be drawn from the latest group of animals sent from
the supply house. In many of these cases the sampled population, from
the viewpoint of statistical inference, is hard to define concretely. It is the
kind of population of which the data can be regarded as a random sample.
Confidence interval statements apply to the population that was
actually sampled. Claims that such inferences apply to some more extensive
population must rest on the judgment of the investigator or on additional
extraneous information that he possesses. Careful investigators
take pains to describe any relevant characteristics of their data in order
that the reader can envisage the nature of the sampled population. The
investigator may also comment on ways in which his sampled population
appears to differ from some broader population that is of particular
interest. As is not surprising, results soundly established in narrow populations
are sometimes shown to be erroneous in much broader populations.
Fortunately, local studies that claim important results are usually
repeated by investigators in other parts of the country or the world, so
that a more extensive target population is at least partially sampled in
this way.
1.9-The frequency distribution and its graphical representation.
One group of students drew 200 samples, each of size 10. The combined
results are compactly summarized in a frequency distribution, shown in
table 1.9.1. There are only eleven possible results for the number of odd
digits in a sample, namely the integers 0, 1, 2, ... 10. Consequently, the
frequency distribution has eleven classes. The number of samples out of
the 200 that fall into a class is the class frequency. The sum of the class
frequencies is, of course, the total number of samples drawn, 200. The
classes and their frequencies give a complete summary of the drawings.
This type of frequency distribution is called discrete, because the
variable, number of odd digits, can take only a limited number of distinct
values. Later we shall meet continuous frequency distributions, which are
extensively used with measurement data.

TABLE 1.9.1
FREQUENCY DISTRIBUTION OF NUMBERS OF ODD DIGITS IN 200 SAMPLES OF n = 10

  Class                      Class        Theoretical
  (Number of Odd Digits)     Frequency    Class Frequency

   0                            1              0.2
   1                            1              2.0
   2                            8              8.8
   3                           25             23.4
   4                           39             41.0
   5                           45             49.2
   6                           36             41.0
   7                           25             23.4
   8                           16              8.8
   9                            4              2.0
  10                            0              0.2

  Total Frequency             200            200.0

One striking feature of the sampling distribution is the concentration
of frequencies near the middle of the table. The greatest frequency
is in the class of five odd digits; that is, half odd and half even. The three
middle classes, 4, 5, 6, contain 39 + 45 + 36 = 120 samples, more than
half of the total frequency. This central tendency is the characteristic
that gives us confidence in sampling-most samples furnish close esti-
mates of the population fraction of odds. This should counterbalance the
perhaps discouraging fact that some of the samples are notably divergent.
Another interesting feature is the symmetry of the distribution, the
greatest frequency at the center with a trailing away at each end. This is
because the population fraction is 50%; if the percentage were nearer zero
or 100, the frequencies would pile up at or near one end.
The regularity that has appeared in the distribution shows that chance
events follow a definite law. The turning up of odd digits as you counted
them may have seemed wholly erratic: whether an odd or an even would
come next was a purely chance event. But the summary of many such
events reveals a pattern which may be predicted (aside from sampling
variation).
Instead of showing the class frequencies in table 1.9.1, we might have
divided each class frequency by 200, the number of samples, obtaining a
set of relative class frequencies that add to 1. As the number of samples is
increased indefinitely, these relative frequencies tend to certain fixed
values that can be calculated from the theory of probability. The theoretical
distribution computed in this way is known as the binomial distribution.
It is one of the commonest distributions in statistical work. In
general terms, the formula for the binomial distribution is as follows.
Suppose that we are drawing samples of size n and that the attribute in
question is held by a proportion p of the members of the population. The
relative frequency of samples containing r members having the attribute,
or in other words the probability that a sample will contain r members
having the attribute, is

    n(n - 1)(n - 2) ... (n - r + 1)
    -------------------------------  p^r (1 - p)^(n-r)
       r(r - 1)(r - 2) ... (2)(1)

In the numerator the expression n(n - 1)(n - 2) ... (n - r + 1) means
"multiply together all the integers from n down to (n - r + 1), inclusive."
Similarly, the denominator is a shorthand way of writing the instruction
"multiply together all the integers from r down to 1." We
shall study the binomial distribution and its mathematical derivation in
chapter 8.
What does this distribution look like for our sampling in table 1.9.1?
We have n = 10 and p = 1/2. The relative frequency or probability of a
sample having four odd digits is, putting r = 4 so that (n - r + 1) = 7,

    (10)(9)(8)(7)
    -------------  (1/2)^4 (1/2)^6 = (210)(1/2)^10 = 210/1024
     (4)(3)(2)(1)
As already mentioned, these relative frequencies add to 1. (This is
not obvious by looking at the formula, but comes from a well-known
result in algebra.) Hence, in our 200 samples of size 10, the number that
should theoretically have four odd digits is

    (200)(210)/1024 = 41.0

These theoretical class frequencies are given in the last column of table
1.9.1. The agreement between the actual and theoretical frequencies is
pleasing.
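The whole theoretical column of table 1.9.1 can be produced by the same
arithmetic. A brief Python sketch (an added illustration, assuming nothing
beyond the formula above):

    from math import comb

    n, p, samples = 10, 0.5, 200
    for r in range(n + 1):
        prob = comb(n, r) * p**r * (1 - p)**(n - r)
        # e.g., r = 4 gives 210/1024, hence the theoretical frequency 41.0
        print(r, round(samples * prob, 1))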
The graph in figure 1.9.1 brings out the features of the binomial
distribution. On the horizontal axis are marked off the different classes,
the numbers of odd digits. The solid ordinate beside each class number
is the observed class frequency, while the dotted ordinate represents the
theoretical frequency. This is the type of graph appropriate for discrete
distributions.

FIG. 1.9.1-Frequency distribution of number of odd digits in each of 200 samples of size
10. The dotted lines represent the theoretical binomial distribution from which the samples
were drawn. (Vertical axis: frequency; horizontal axis: number of odd digits; solid
ordinates: sample; dotted ordinates: theoretical.)
EXAMPLE 1.9.1-For the 200 samples of size 10 in table 1.9.1, in how many cases is
(i) the 95% confidence interval statement wrong? (ii) the 99% confidence interval statement
wrong? Ans. (i) 6 times, or 3.0%; (ii) 1 time, or 0.5%.
EXAMPLE 1.9.2-Use the table of random digits to select a random sample of 20
pages of this book, regarding the population as consisting of pages 3-539. Note the number
of pages in your sample that do not contain the beginning of a new section, and calculate
the 95% interval for the proportion of pages in the book on which no new section begins.
Don't count "References" as a section. The population proportion is 311/537 = 0.59.
EXAMPLE 1.9.3-When the doors of a clinic are opened, twelve patients enter simultaneously.
Each patient wishes to be handled first. Can you use the random digit table to
arrange the patients in a random order?
EXAMPLE 1.9.4-A sampler of public opinion estimates from a sample the number of
eligible voters in a state favoring a certain candidate for governor. Assuming that his estimate
was close to the population value at the time the survey was made, suggest two reasons
why the ballot on election day might be quite different.
EXAMPLE 1.9.5-A random sample of families from a population has been selected.
An interviewer calls on each family at its home between the hours of 9 A.M. and 5 P.M. If
no one is at home, the interviewer makes no attempt to contact the family at a later time. For
each of the following attributes, give your opinion whether the sample results are likely to
overestimate, underestimate, or be at about the correct level: (i) proportion of families in
which the husband is retired, (ii) proportion of families with at least one child under 4 years,
(iii) proportion of families in which husband and wife both work. Give your reasons.
EXAMPLE 1.9.6-From the formula for the binomial distribution, calculate the probability
of 0, 1, 2 "Yes" answers in a sample of size 2, where p is the proportion of "Yes"
answers in the population. Show that the three probability values add to 1 for any value of p.
EXAMPLE 1.9.7-At birth the probability that a child is a boy is very close to one-half.
Show that according to the binomial distribution, half the families of size 2 should
consist of one boy and one girl. Why is the proportion of boy-girl families likely to be slightly
less than one-half in practice?

EXAMPLE 1.9.8-Five dice were tossed 100 times. At each toss the number of two's
(deuces) out of five was noted, with these results:

  Number Deuces    Frequency of    Theoretical
  Per Toss         Occurrence      Frequency

      5                 2             0.013
      4                 1             0.322
      3                 5             3.214
      2                18            16.075
      1                42            40.188
      0                32            40.188

  Total               100           100.000

(i) From the binomial distribution, verify the result 16.075 for the theoretical frequency
of 2 deuces. (ii) Draw a graph showing the observed and theoretical distributions. (iii) Do
you think the dice were balanced and fairly tossed? Ans. The binomial probability of 2
deuces is 1250/7776 = 0.16075. This is multiplied by 100 to give the theoretical frequency.
A later test (example 9.5.1) casts doubt on the trueness of the dice.
1.10-Hypotheses about populations. The investigator often has in
mind a definite hypothesis about the population ratio, the purpose of
the sampling being to get evidence concerning his hypothesis. Thus a
geneticist studying heredity in the tomato had reason to believe that in
the plants produced from a certain cross, fruits with red flesh and yellow
flesh would be in the ratio 3:1. In a sample of 400 he found 310 red tomatoes
instead of the hypothetical 300. With your experience of sampling
variation, would you accept this as verification or refutation of the hypothesis?
Again, a physician has the hypothesis that a certain disease
requiring hospitalization is equally common among men and women.
In a sample of 900 hospital cases he finds 480 men and 420 women. Do
these results support or contradict his hypothesis? (Incidentally, this is
an example in which the sampled population may differ from the target
population. Although good medical practice may prescribe hospitalization,
there are often cases that for one reason or another do not come to
a hospital and therefore could not be included in his sample.)
To answer such questions two results are needed: a measure of the
deviation of the sample from the hypothetical population ratio, and a
means of judging whether this measure is an amount that would commonly
occur in sampling, or, on the contrary, is so great as to throw doubt upon
the hypothesis. Both results were furnished by Karl Pearson in 1899 (3).
He devised an index of dispersion or test criterion denoted by χ² (chi-square)
and obtained the formula for its theoretical frequency distribution
when the hypothesis in question is true. Like the binomial distribution,
the chi-square distribution is another of the basic theoretical distributions
much used in statistical work. Let us first examine the index of dispersion.
1.11-Chi-square, an index of dispersion. Naturally, the deviations
of the observed numbers from those specified by the hypothesis form the
basis of the index. In the medical example, with 900 cases, the numbers
of male and female cases expected on the hypothesis are each 450. The
deviations, then, are

    480 - 450 = +30,
and
    420 - 450 = -30,

the sum of the two being zero. The value of chi-square is given by

    χ² = (+30)²/450 + (-30)²/450 = 2 + 2 = 4

Each deviation is squared, each square is divided by the hypothetical or
expected number, and the results are added. The expected numbers appear
in the denominators in order to introduce sample size into the quantity;
it is the relative size that is important.
The squaring of the deviations in the numerator may puzzle you.
It is a common practice in statistics. We shall simply say at present that
indexes constructed in this way have been found to have great flexibility,
being applicable to many different types of statistical data. Note that the
squaring makes the sign of the deviation unimportant, since the square of a
negative number is the same as that of the corresponding positive number.
It is clear that chi-square would be zero if the sample frequencies were the
same as the hypothetical, and that it will increase with increasing deviation
from the hypothetical. But it is not at all clear whether a chi-square value
of 4 is to be considered large, medium, or small.
To furnish a basis for judgment on this point is our next aim. Pearson
founded his judgment on a study of the theoretical distribution of chi-square,
but we shall investigate the same problem by setting up a sampling
experiment. Before doing this, a useful formula will be given, together
with a few examples to help fix it in mind.
1.12-The formula for chi-square. It is convenient to represent by
f₁ and f₂ the sample counts of individuals who do and do not possess the
attribute being investigated, the corresponding hypothetical or expected
frequencies being F₁ and F₂. The two deviations, then, are f₁ - F₁ and
f₂ - F₂, so that chi-square is given by the formula

    χ² = (f₁ - F₁)²/F₁ + (f₂ - F₂)²/F₂

The formula may be condensed to the more easily remembered as well as
more general one,

    χ² = Σ(f - F)²/F,

where Σ denotes summation. In words, "Chi-square is the sum of such
ratios as (deviation)²/(expected number)."
Let us apply the formula to the counts of red and yellow tomatoes
in section 1.10. There, f₁ = 310, f₂ = 400 - 310 = 90, F₁ = 3/4 of
400 = 300, and F₂ = 1/4 of 400 = 100. Whence,

    χ² = (310 - 300)²/300 + (90 - 100)²/100 = 1.33

Note. When computing chi-square it is essential to use the actual size
of sample and the actual numbers in the two attribute classes. If we know
only the percentages or proportions in the two classes, chi-square cannot
be calculated. Suppose we are told that 80% of the tomato plants in a
sample are red, and asked to compute chi-square. If we guess that the
sample contained 100 plants, then

    χ² = (80 - 75)²/75 + (20 - 25)²/25 = 25/75 + 25/25 = 1.33
But if the sample actually contained only 10 plants, then

    χ² = (8 - 7.5)²/7.5 + (2 - 2.5)²/2.5 = 0.25/7.5 + 0.25/2.5 = 0.133

If the sample had 1,000 plants, a similar calculation finds χ² = 13.33.
For a given percentage red, the value of chi-square can be anything from
almost zero to a very large number.
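The computation is simple enough to express as a short function. The
following Python sketch (ours, for illustration) reproduces the medical and
tomato examples and shows how the value grows with sample size for a
fixed percentage:

    def chi_square(observed, expected):
        # sum of (deviation)^2 / (expected number) over the classes
        return sum((f - F) ** 2 / F for f, F in zip(observed, expected))

    print(chi_square([480, 420], [450, 450]))  # medical example: 4.0
    print(chi_square([310, 90], [300, 100]))   # tomato example: 1.33
    for n in (10, 100, 1000):                  # 80% red in samples of n plants
        print(n, chi_square([0.8 * n, 0.2 * n], [0.75 * n, 0.25 * n]))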
EXAMPLE 1.12.1-A student tossed a coin 800 times, getting 440 heads. What is the
value of chi-square in relation to the hypothesis that heads and tails are equally likely?
Ans. 8.
EXAMPLE 1.12.2-If the count in the preceding example had been 220 heads out of
400 tosses, would chi-square also be half its original value?
EXAMPLE 1.12.3-A manufacturer of a small mass-produced article claims that 96%
of the articles function properly. In an independent test of 1,000 articles, 950 were found to
function properly. Compute chi-square. Ans. 2.60.
EXAMPLE 1.12.4-In the text example about tomatoes the deviation from expectation
was 10. If the same deviation had occurred in a sample of twice the size (that is, of 800),
what would have been the value of chi-square? Ans. 0.67, half the original value.

1.13-An experiment in sampling chi-square; the sampling distribution.
You have now had some practice in the calculation of chi-square. Its
main function is to enable us to judge whether the sample ratio itself departs
much or little from the hypothetical population value. For that
purpose we must answer the question already proposed: What values of
chi-square are to be considered as indicating unusual deviation, and what
as ordinary sampling variation? Our experimental method of answering
the question will be to calculate chi-square for each of many samples
drawn from the table of random numbers, then to observe what values of
chi-square spring from the more unusual samples. If a large number of
samples of various sizes have been drawn and if the value of chi-square is
computed from each, the distribution of chi-square may be mapped.
The results to be presented here come from 230 samples of sizes varying
from 10 to 250, drawn from the random digits table A 1. We suggest
that the reader use the samples that he drew in section 1.7 when verifying
the confidence interval statements. There is a quick method of calculating
chi-square for all samples of a given size n. Since odd and even digits
are equally likely in the population, the expected numbers of odd and even
digits are F₁ = F₂ = n/2. The reciprocals of these numbers are therefore
both equal to 2/n. Remembering that the two deviations are the same in
absolute value and differ only in sign, we may write

    χ² = (f₁ - F₁)²(1/F₁ + 1/F₂)
       = d²(2/n + 2/n)
       = 4d²/n

where d is the absolute value of the deviation. For all samples of a fixed
size n, the multiplier 4/n is constant. Once it has been calculated it can be
used again and again.
To illustrate, suppose that n = 100. The multiplier 4/n is 0.04. If 56
odd digits are found in a sample, d = 6 and

    χ² = (0.04)(6²) = 1.44
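The entire sampling experiment of this section can be imitated in a few
lines. A hedged Python sketch (not the authors' original procedure, which
used table A 1 itself):

    import random

    random.seed(4)

    def chi_square_for_sample(n):
        # count odd digits among n random digits, then apply chi-square = 4d²/n
        odd = sum(random.randint(0, 9) % 2 for _ in range(n))
        d = abs(odd - n / 2)
        return 4 * d * d / n

    values = [chi_square_for_sample(n)
              for n in (10, 15, 20, 30, 50, 100, 250)  # the sizes used in the text
              for _ in range(33)]                      # about 230 samples in all
    print(sum(1 for v in values if v < 0.5) / len(values))  # roughly one-half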
Proceed to calculate chi-square for each of your samples. To
summarize the results, a frequency distribution is again convenient. There
is one difference, however, from the discrete frequency distribution used
in section 1.9 when studying the binomial distribution. With the binomial
for n = 10, there were only eleven possible values for the numbers of odd
digits, so that the eleven classes in the frequency distribution selected
themselves naturally. On the other hand, with chi-square values calculated
from samples of different sizes, there is a large number of possible
values. Some grouping of the values into classes is necessary. A distribution
of this type is sometimes described as continuous, since conceptually
any positive number is a possible value of chi-square.
When forming frequency distributions from continuous data, decide
first on the classes to be used. For most purposes, somewhere between
8 and 20 classes is satisfactory. Obtain an idea of the range of the data
by looking through them quickly to spot low and high values. Most of
your chi-squares will be found to lie between 0 and 5. Equal-sized class
intervals of 0.00-0.49, 0.50-0.99, ... will therefore cover most of the
range in 10 classes, although a few values of chi-square greater than 5 may
occur. Our values of χ² were recorded to 2 decimal places.
Be sure to make the classes non-overlapping, and indicate clearly
what the class intervals are. Class intervals described as "0.00-0.50,"
"0.50-1.00," "1.00-1.50" are not satisfactory, since the reader does not
know in what classes the values 0.50 and 1.00 have been placed. If the
chi-square values were originally computed to three decimal places, reported
class intervals of "0.00-0.49," "0.50-0.99," and so on, would be
ambiguous, since it is not clear where a chi-square value of 0.493 is placed.
Intervals of 0.000-0.494, 0.495-0.999, and so on, could be used.

TABLE 1.13.1
SAMPLING DISTRIBUTION OF 230 VALUES OF CHI-SQUARE CALCULATED FROM SAMPLES
DRAWN FROM TABLE A 1
Sample sizes: 10, 15, 20, 30, 50, 100, and 250

  Class Interval   Frequency      Class Interval   Frequency

  0.00- 0.49          116          6.00- 6.49          0
  0.50- 0.99           39          6.50- 6.99          1
  1.00- 1.49           18          7.00- 7.49          0
  1.50- 1.99           22          7.50- 7.99          0
  2.00- 2.49           12          8.00- 8.49          0
  2.50- 2.99            5          8.50- 8.99          1
  3.00- 3.49            5          9.00- 9.49          0
  3.50- 3.99            6          9.50- 9.99          0
  4.00- 4.49            1         10.00-10.49          1
  4.50- 4.99            2         10.50-10.99          0
  5.00- 5.49            0         11.00-11.49          1
  5.50- 5.99            0
                                  Total              230
Having determined the class intervals, go through the data systematically,
assigning each value of chi-square to its proper class, then
counting the number of values (frequency) in each class. Table 1.13.1
shows the results for our 230 samples.
In computing chi-square, we chose to regard the population as consisting
of the 10,000 random digits in table A 1, rather than as an infinite
population of random digits. Since 5,060 of the digits in table A 1 are
odd, we took the probability of an odd digit as 0.506 instead of 0.50. The
reader is recommended to use 0.50, as already indicated. The change
makes only minor differences in the distribution of the sample values of
chi-square.
Observe the concentration of sample chi-squares in the smallest class,
practically half of them being less than 0.5. Small deviations (with small
chi-squares) are predominant, this being the foundation of our faith in
sampling. But taking a less optimistic view, one must not overlook the
samples with large deviations and chi-squares. The possibility of getting
one of these makes for caution in drawing conclusions. In this sampling
exercise we know the population ratio and are not led astray by discrepant
samples. In actual investigations, where the hypothesis set up is not
known to be the right one, a large value of chi-square constitutes a dilemma.
Shall we say that it denotes only an unusual sample from the hypothetical
population, or shall we conclude that the hypothesis misrepresents
the true population ratio? Statistical theory contains no certain
answer. Instead, it furnishes an evaluation of the probability of possible
sample deviations from the hypothetical population. If chi-square is large,
the investigator is warned that the sample is an improbable one under his
hypothesis. This is evidence to be added to that which he already possesses,
all of it being the basis for his decisions. A more exact determination
of probability will be explained in section 1.15.

FIG. 1.13.1-Histogram representing frequency distribution of the 230 sample
values of chi-square in table 1.13.1.

The graphical representation of the distribution of our chi-squares
appears in figure 1.13.1. In this kind of graph, called a histogram, the
frequencies are represented by the areas of the rectangular blocks in the
figure. The graph brings out both the concentration of small chi-squares
at the left and the comparatively large sizes of a few at the right. It is now
evident that for the medical example in section 1.11, χ² = 4 is larger than
a great majority of the chi-squares in this distribution. If this disease were
in fact equally likely to result in male or female hospitalized cases, this
would be an unusually large value of chi-square.
1.14-Comparison with the theoretical distribution. Two features of
our chi-square distribution have yet to be examined: (i) How does it compare
with the theoretical distribution? and (ii) How can we evaluate more
exactly the probabilities of various chi-square sizes? For these purposes
a rearrangement of the class intervals is advisable. Since our primary
interest is in the relative frequency of high values of chi-square, we used
the set of class intervals defined by column 4 of table 1.14.1. The first three
intervals each contain 25% of the theoretical distribution. As chi-square
increases, the next four intervals contain respectively 15%, 5%, 4%, and
1%. Since the theoretical distribution is known exactly and has been
widely tabulated, the corresponding class intervals for chi-square, shown
in column 1, are easily obtained. Note that the intervals are quite unequal.

TABLE 1.14.1
COMPARISON OF THE SAMPLE AND THEORETICAL DISTRIBUTIONS OF CHI-SQUARE

                       Sample Frequency     Theoretical Frequency
                         Distribution           Distribution
                                                                        Cumulative
  Class Interval                                                        Per Cent
  of Chi-square       Actual   Percentage    Percentage      χ²        Greater Than

        1                2          3             4           5              6

  0     -0.1015         57        24.8           25         0               100
  0.1015-0.455          59        25.6           25         0.1015           75
  0.455 -1.323          62        27.0           25         0.455            50
  1.323 -2.706          32        13.9           15         1.323            25
  2.706 -3.841          14         6.1            5         2.706            10
  3.841 -6.635           3         1.3            4         3.841             5
  6.635 -                3         1.3            1         6.635             1

  Total                230       100.0          100
Column 2 of table 1.14.1 shows the actual frequencies obtained from
the 230 samples. In column 3, these have been converted to percentage
frequencies, by multiplying by 100/230, for comparison with the theoretical
percentage frequencies in column 4. The agreement between columns
3 and 4 is good. If your chi-square values have been computed mostly
from small samples of sizes 10, 15, and 20, your agreement may be poorer.
With small samples there is only a limited number of distinct values of chi-square,
so that your sample distribution goes by discontinuous jumps.
Columns 5 and 6 contain a cumulative frequency distribution of the
percentages in column 4. Beginning at the foot of column 6, each entry
is the sum of all the preceding ones in column 4, hence the name. The
column is read in this way: the third to the last entry means that 10%
of all samples in the theoretical distribution have chi-squares greater
than 2.706. Again, 50% of them exceed 0.455; this may be looked
upon as an average value, exceeded as often as not in the sampling. Finally,
chi-squares greater than 6.635 are rare, occurring only once per 100
samples. So in this sampling distribution of chi-square we find a measure
in terms of probability, the measure we have been seeking to enable us
to say exactly which chi-squares are to be considered small and which
large. We are now to learn how this measure can be utilized.

1.15-The test of a null hypothesis or test of significance. As indicated
in section 1.10, the investigator's objective can often be translated into a
hypothesis about his experimental material. The geneticist, you remember,
knowing that the Mendelian theory of inheritance produced a 3:1
ratio, set up the hypothesis that the tomato population had this ratio of
red to yellow fruits. This is called a null hypothesis, meaning that there
is no difference between the hypothetical ratio and that in the population
of tomato fruits. If this null hypothesis is true, then random samples of
n will have ratios distributed binomially, and chi-squares calculated from
the samples will be distributed as in table 1.14.1. To test the hypothesis,
a sample is taken and its chi-square calculated; in the illustration the
value was 1.33. Reference to the table shows that, if the null hypothesis
is true, 1.33 is not an uncommon chi-square, the probability of a greater
one being about 0.25. As the result of this test, the geneticist would not
likely reject the null hypothesis. He knows, of course, that he may be in
error, that the population ratio among the tomato fruits may not be 3:1.
But the discrepancy, if any, is so small that the sample has given no convincing
evidence of it.
Contrasting with the genetic experiment, the medical example turned
up χ² = 4. If the null hypothesis (this disease equally likely in men and
women) is true, a larger chi-square has a probability of only about 0.05.
This suggests that the null hypothesis is false, so the sampler would likely
reject it. As before, he may be in error because this might be one of those
5 samples per 100 that have chi-squares greater than 3.841 even when the
sampling is from an equally divided population. In rejecting the null
hypothesis, the sampler faces the possibility that he is wrong. Such is the
risk always run by those who test hypotheses and rest decisions on the
tests.
The illustrations show that in testing hypotheses one is liable to
two kinds of error. If his sample leads him to reject the null hypothesis
when it is true, he is said to have committed an error of the first kind, or a
Type I error. If, on the contrary, he is led to accept the hypothesis when
it is false, his error is of the second kind, a Type II error. The Neyman-
Pearson theory of testing hypotheses emphasizes the relations between
these types. For recent accounts of this theory see references (6,7,8).
As a matter of practical convenience, probability levels of 5% (0.05)
and 1% (0.01) are commonly used in deciding whether to reject the null
hypothesis. As seen from table 1.14.1, these correspond to χ² greater
than 3.841 and χ² greater than 6.635, respectively. In the medical example
we say that the difference in the number of male and female patients
is significant at the 5% level, because it signifies rejection of the null
hypothesis of equal numbers.
This use of 5% and 1% levels is simply a working convention. There
is merit in the practice, followed by some investigators, of reporting in
parentheses the probability that chi-square exceeds the value found in
their data. For instance, in the counts of red and yellow tomatoes, we
found χ² = 1.33, a value exceeded with probability about 0.25. The report
might read: "The χ² test was consistent with the hypothesis of a
3 to 1 ratio of red to yellow tomatoes (P = 0.25)."
The values of χ² corresponding to a series of probability levels are
shown below. This table should be used in working the exercises that
follow.
Probability of a Greater Value

  P     0.90   0.75   0.50   0.25   0.10   0.05   0.025   0.010   0.005
  χ²    0.02   0.10   0.45   1.32   2.71   3.84   5.02    6.63    7.88
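These entries are the quantiles of the chi-square distribution with 1 degree
of freedom; the same scipy routine used above can regenerate them (an
added illustration, not part of the original table):

    from scipy.stats import chi2

    for p in (0.90, 0.75, 0.50, 0.25, 0.10, 0.05, 0.025, 0.010, 0.005):
        print(p, round(chi2.isf(p, df=1), 2))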

EXAMPLE 1.15.1-Two workers A and B perform a task in which carelessness leads to
minor accidents. In the first 20 accidents, 13 happened to A and 7 to B. Is this evidence
against the hypothesis that the two men are equally liable to accidents? Compute χ² and
find the significance probability. Ans. χ² = 1.8, P between 0.10 and 0.25.
EXAMPLE 1.15.2-A baseball player has a lifetime batting average of 0.280. (This
means that the probability that he gets a hit when at bat is 0.280.) Starting a new season, he
gets 15 hits in his first 30 times at bat. Is this evidence that he is having what is called a hot
streak? Compute χ² for the null hypothesis that his probability of hitting is still 0.280. Ans.
χ² = 7.20, P < 0.01. Null hypothesis is rejected.
EXAMPLE 1.15.3-In some experiments on heredity in the tomato, MacArthur (5)
counted 3,629 fruits with red flesh and 1,176 with yellow. This was in the F₂ generation
where the theoretical ratio was 3:1. Compute χ² = 0.71 and find the significance probability.
MacArthur concluded that "the discrepancies between the observed and expected ratios
are not significant."
EXAMPLE 1.15.4-In a South Dakota farm labor survey of 1943, 480 of the 1,000
reporting farmers were classed as owners (or part owners), the remaining 520 being renters.
It is known that of nearly 7,000 farms in the region, 47% are owners. Assuming this to be the
population percentage, calculate chi-square and P for the sample of 1,000. Ans. χ² = 0.41,
P = 0.50. Does this increase your confidence in the randomness of the sampling? Such
collateral evidence is often cited. The assumption is that if the sample is shown to be representative
for one attribute it is more likely to be representative also of the attribute under
investigation, provided the two are related.
EXAMPLE 1.15.5-James Snedecor (4) tried the effect of injecting poultry eggs with
female sex hormones. In one series 2 normal males were hatched together with 19 chicks
which were classified as either normal females or as individuals with pronounced female
characteristics. What is the probability of the ratio 2:19, or one more extreme, in sampling
from a population with equal numbers of the sexes in which the hormone has no effect?
Ans. χ² = 13.76. P is much less than 0.01.
EXAMPLE 1.15.6-In table 1.14.1, there are 62 + 32 + 14 + 3 + 3 = 114 samples
having chi-squares greater than 0.455, whereas 50% of the 230, or 115, were expected. What
is the probability of drawing a more discrepant sample if the sampling is truly random?
Ans. χ² = 0.0174, P = 0.90. Make the same test for your own samples.
EXAMPLE 1.15.7-This example illustrates the discontinuity in the distribution of
chi-square when computed from small samples. From 100 samples of size 10 drawn from the
random digits table A 1, the following frequency distribution of the numbers of odd digits in
a sample was obtained.

  Number of odd digits    1 or 9    2 or 8    3 or 7    4 or 6     5

  Frequency                  2         8        19        46      25

Compute the sample frequency distribution of χ² as in table 1.14.1 and compare it with the
theoretical distribution. Observe that no sample χ² occurs in the class interval 0.455-1.323,
although 25% of the theoretical distribution lies in this range.

1.16-Tests of significance in practice. A test of significance is sometimes
thought to be an automatic rule for making a decision either to
"accept" or "reject" a null hypothesis. This attitude should be avoided.
An investigator rarely rests his decisions wholly on a test of significance.
To the evidence of the test he adds knowledge accumulated from his own
past work and from the work of others. The size of the sample from which
the test of significance is calculated is also important. With a small sam-
ple, the test is likely to produce a significant result only if the null hypothe-
sis is very badly wrong. An investigator's report on a small sample test
might read as follows: "Although the deviation from the null hypothesis
was not significant, the sample is so small that this result gives only a
weak confirmation of the null hypothesis." With a large sample, on the
other hand, small departures from the null hypothesis can be detected
as statistically significant. After comparing two proportions in a large
sample, an investigator may write: "Although statistically significant,
the difference between the two proportions was too small to be of practical
importance, and was ignored in the subsequent analysis."
In this connection, it is helpful, when testing a binomial proportion
at the 5% level, to look at the 95% confidence limits for the population p.
Suppose that in the medical example the number of patients was only
n = 10, of whom 4 were female, so that the sample proportion of female
patients was 0.4. If you test the null hypothesis p = 0.5 by χ², you will find
χ² = 0.4, a small value entirely consistent with the null hypothesis.
Looking now at the 95% confidence limits for p, we find from table 1.4.1 (p.
000) that these are 15% and 74%. Any value of the population p lying
between 15% and 74% is also consistent with the sample result. Clearly,
the fact that we found a non-significant result when testing the null hypothesis
p = 1/2 gives no assurance from these data that the true p is
1/2 or near to 1/2.
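The confidence limits quoted here come from table 1.4.1, which is based
on Crow's intervals (reference 1). For readers without the table, the widely
used Clopper-Pearson construction gives similar, slightly wider limits; a
Python sketch under that assumption:

    from scipy.stats import beta

    def clopper_pearson(successes, n, conf=0.95):
        # exact-style binomial limits; table 1.4.1 uses Crow's (1956)
        # intervals, which are a little narrower
        a = (1 - conf) / 2
        lo = beta.ppf(a, successes, n - successes + 1) if successes > 0 else 0.0
        hi = beta.ppf(1 - a, successes + 1, n - successes) if successes < n else 1.0
        return lo, hi

    print(clopper_pearson(4, 10))  # roughly (0.12, 0.74), close to the 15%-74% quoted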
1.17-Summary of technical terms. In this chapter you have been
introduced to some of the main ideas in statistics, as well as to a number of
the standard technical terms. As a partial review and an aid to memory,
these terms are described again in this section. Since these descriptions
are not dictionary definitions, some would require qualification from a
more advanced viewpoint, but they are substantially correct.
Statistics deals with techniques for collecting, analyzing, and drawing
conclusions from data.
A sample is a small collection from some larger aggregate (the
population) about which we wish information.
Statistical inference is concerned with attempts to make quantitative
statements about properties of a population from a knowledge of the
results given by a sample.
Attribute data are data that consist of a classification of the members
of the sample into a limited number of classes on the basis of some
property of the members (for instance, hair color). In this chapter, only
samples with two classes have been studied.
Measurement data are data recorded on some numerical scale. They
are called discrete when only a restricted number of values occurs (for
instance, 0, 1, 2, ... 11 children). Strictly, all measurement data are discrete,
since the results of any measuring process are recorded to a limited
number of figures. But measurement data are called continuous if, conceptually,
successive values would differ only by tiny amounts.
A point estimate is a single number stated as an estimate of some quantitative
property of the population (for instance, 2.7% defective articles,
58,300 children under five years). The quantity being estimated is often
called a population parameter.
An interval estimate is a statement that a population parameter has
a value lying between two specified limits (the population contains between
56,900 and 60,200 children under five years).
A confidence interval is one type of interval estimate. It has the feature
that in repeated sampling a known proportion (for instance, 95%)
of the intervals computed by this method will include the population
parameter.
Random sampling, in its simplest form, is a method of drawing a
sample such that any member of the population has an equal chance of
appearing in the sample, independently of the other members that happen
to fall in the sample.
Tables of random digits are tables in which the digits 0, 1, 2, ... 9 have
been drawn by some process that gives each digit an equal chance of
being selected at any draw.
The sampled population is the population of which our data are a
random sample. It is an aggregate such that the process by which we
obtained our sample gives every member of the aggregate a known chance
of appearing in the sample, and is the population to which statistical
inferences from the sample apply. In practice, the sampled population is
sometimes hypothetical rather than real, because the only available data
may not have been drawn at random from a known population. In
meteorological research, for instance, the best data might be weather
records for the past 40 years, which are not a randomly selected sample
of years.
The target population is the aggregate about which the investigator
is trying to make inferences from his sample. Although this term is not
in common use, it is sometimes helpful in focussing attention on differences
between the population actually sampled and the population that
we are attempting to study.
In a frequency distribution, the values in the sample are grouped into
a limited number of classes. A table is made showing the class boundaries
and the frequencies (number of members of the sample) in each class.
The purpose is to obtain a compact summary of the data.
The binomial distribution gives the probabilities that 0, 1, 2, ... n
members of a sample of size n will possess some attribute, when the sample
is a random sample from a population in which a proportion p of the
members possess this attribute.
A null hypothesis is a specific hypothesis about a population that is
being tested by means of the sample results. In this chapter the only hy-
pothesis considered was that the proportion of the population having some
attribute has a stated numerical value.
A test of significance is, in general terms, a calculation by which the
sample results are used to throw light on the truth or falsity of a null
hypothesis. A quantity called a test criterion is computed: it measures
the extent to which the sample departs from the null hypothesis in some
relevant aspect. If the value of the test criterion falls beyond certain
limits into a region of rejection, the departure is said to be statistically
significant or, more concisely, significant. Tests of significance have the
property that if the null hypothesis is true, the probability of obtaining a
significant result has a known value, most commonly 0.05 or 0.01. This
probability is the significance level of the test.
Chi-square = Σ(Observed - Expected)²/(Expected) is a test criterion
for the null hypothesis that the proportion with some attribute in the
population has a specified value. Large values of chi-square are signifi-
cant. The chi-square criterion serves many purposes and will appear
later for testing other null hypotheses.
Errors of the first and second kinds. In the Neyman-Pearson theory
of tests of hypotheses, an error of the first kind is the rejection of the null
hypothesis when it is true, and an error of the second kind is the acceptance
of a null hypothesis that is false. In practice, in deciding whether to re-
ject a null hypothesis or to regard it as provisionally true, all available
evidence should be reviewed as well as the specific result of the test of
significance.
REFERENCES
1. The confidence intervals for sample sizes up to n = 30 were taken from the paper by
E. L. Crow, Biometrika, 43, 423-435 (1956). Intervals for n greater than 30 were
obtained from the normal approximation as discussed in section 8.7.
2. RAND CORPORATION. A Million Random Digits With 100,000 Normal Deviates. Free
Press, Glencoe, Ill. (1955).
3. K. PEARSON. Phil. Mag., Ser. 5, 50:157 (1899).
4. J. G. SNEDECOR. J. Exp. Zool., 110:205 (1949).
5. J. W. MACARTHUR. Trans. Roy. Canadian Inst., 18:1 (1931).
6. P. G. HOEL. Introduction to Mathematical Statistics, 2nd ed., Chap. 10. Wiley, New
York (1954).
7. E. S. KEEPING. Introduction to Statistical Inference, Chap. 6. Van Nostrand, Princeton,
N.J. (1962).
8. H. FREEMAN. Introduction to Statistical Inference, Chap. 28. Addison-Wesley, Reading,
Mass. (1963).
CHAPTER TWO

Sampling from a normally distributed population

2.1-Normally distributed population. In the first chapter, sampling
was mostly from a population with only two kinds of individuals: odd or
even, alive or dead, infested or free. Random samples of n from such a
population made up a binomial distribution. The variable, an enumeration
of successes, was discrete. Now we turn to another kind of population
whose individuals are measured for some characteristic such as height or
yield or income. The variable flows without a break from one individual
to the next, a continuous variable with no limit to the number of individuals
with different measurements. Such variables are distributed in
many ways, but we shall be occupied first with the normal distribution.
Next to the binomial, the normal distribution was the earliest to be
developed. De Moivre published its equation in 1733, twenty years after
Bernoulli had given a comprehensive account of the binomial. That the
two are not unrelated is clear from figure 2.1.1. On the top is the graph
of a symmetrical binomial distribution similar to that in figure 1.9.1. In
this new figure the sample size is 48 and the population sampled has equal
numbers of the two kinds of individuals. Although discrete, the binomial
is here graphed as a histogram. That is, the ordinate at 25 successes is
represented by a horizontal bar going from 24.5 to 25.5. This facilitates
comparison with the continuous normal curve. An indefinitely great
number of samples were drawn, so that the frequencies are expressed as
percentages of the total. Successes less than 13 and more than 35 do occur,
but their frequencies are so small that they cannot be shown on the graph.
Imagine now that the size of the sample is increased without limit, the
width of the intervals on the horizontal axis being decreased correspondingly.
The steps of the histogram would soon become so small as to look
like the continuous curve at the right. Indeed, De Moivre discovered the
normal distribution when seeking an approximation to the binomial. The
discrete variable has become continuous and the frequencies have merged
into each other without a break.
This normal distribution is completely determined by two constants
or parameters. First, there is the mean, μ, which locates the center of the
distribution. Second, the standard deviation, σ, measures the spread or
variation of the individual measurements; in fact, σ is the scale (unit of
measurement) of the variable which is normally distributed.

FIG. 2.1.1-Upper: binomial distribution of successes in samples of 48 from a 1:1 population.
Lower: normal distribution with mean μ and standard deviation σ; the shaded areas
comprise 5% of the total.

FIG. 2.1.2-Solid curve: the normal distribution with μ = 0 and σ = 1. Dotted
curve: the normal distribution with μ = 0 and σ = 1.5.
From the figure you see that within one sigma on either side of μ the
frequency is decreasing ever more rapidly, but beyond that point it decreases
at a continuously lesser rate. By the time the variable, X, has
reached ±3σ the percentage frequencies are negligibly small. Theoretically,
the frequency of occurrence never vanishes entirely, but it approaches
zero as X increases indefinitely. The concentration of the
measurements close to μ is emphasized by the fact that over 2/3 of the
observations lie in the interval μ ± σ, while some 95% of them are in the
interval μ ± 2σ. Beyond ±3σ lies only 0.26% of the total frequency.
The formula for the ordinate or height of the normal curve is

    y = (1/(σ√(2π))) e^(-(X - μ)²/2σ²),

where the quantity e = 2.71828 is the base for natural logarithms and π is
of course 3.1416. To illustrate the role of the standard deviation σ in
determining the shape of the curve, figure 2.1.2 shows two curves. The
solid curve has μ = 0, σ = 1, while the dotted curve has μ = 0, σ = 1.5.
The curve with the larger σ is lower at the mean and more spread out.
Values of X that are far from the mean are much more frequent with
σ = 1.5 than with σ = 1. In other words, the population is more variable
with σ = 1.5. A curve with σ = 1/2 would have a height of nearly 0.8 at
the mean and would have scarcely any frequency beyond X = 1.5.
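The two curves of figure 2.1.2 can be reproduced by evaluating this
formula directly. A small Python sketch (an added illustration):

    import math

    def normal_ordinate(x, mu=0.0, sigma=1.0):
        # height of the normal curve with mean mu and s.d. sigma at x
        z = (x - mu) / sigma
        return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2 * math.pi))

    print(normal_ordinate(0, 0, 1))    # about 0.399
    print(normal_ordinate(0, 0, 1.5))  # about 0.266, lower at the mean
    print(normal_ordinate(0, 0, 0.5))  # about 0.798, near the 0.8 stated above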
To indicate the effect of a change in the mean μ, the curve with μ = 2,
σ = 1 is obtained by lifting the solid curve bodily and centering it at
X = 2 without changing its shape in any other way. This explains why μ
is called the parameter of location.
2.2-Reasons for the use of the normal distribution. You may be
wondering why such a model is presented, since it obviously cannot describe
any real population. It is astonishing that this normal distribution
has dominated statistical practice as well as theory. Briefly, the main
reasons are as follows:
1. Convenience certainly plays a part. The normal distribution has
been extensively and accurately tabulated, including many auxiliary results
that flow from it. Consequently, if it seems to apply fairly well to a
problem, the investigator has many time-saving tables ready at hand.
2. The distributions of some variables are approximately normal,
such as heights of men, lengths of ears of corn, and, more generally, many
linear dimensions, for instance those of numerous manufactured articles.
3. With measurements whose distributions are not normal, a simple
transformation of the scale of measurement may induce approximate
normality. The square root, √X, and the logarithm, log X, are often
used as transformations in this way. The scores made by students in
national examinations are frequently rescaled so that they appear to follow
a normal curve.
4. With measurement data, many investigations have as their purpose
the estimation of averages: the average life of a battery, the average income
of plumbers, and so on. Even if the distribution in the original
population is far from normal, the distribution of sample averages tends
to become normal, under a wide variety of conditions, as the size of
sample increases. This is perhaps the single most important reason for the
use of the normal.
5. Finally, many results that are useful in statistical work, although
strictly true only when the population is normal, hold well enough for
rough-and-ready use when samples come from non-normal populations.
When presenting such results we shall try to indicate how well they stand
up under non-normality.
2.3-Tables of the normal distribution. Since the normal curve depends on the two parameters μ and σ, there are a great many different normal curves. All standard tables of this distribution are for the distribution with μ = 0 and σ = 1. Consequently, if you have a measurement X with mean μ and standard deviation σ and wish to use a table of the normal distribution, you must rescale X so that the mean becomes 0 and the standard deviation becomes 1. The rescaled measurement is given by the relation

Z = (X - μ)/σ

The quantity Z goes by various names: a standard normal variate, a standard normal deviate, a normal variate in standard measure, or, in education and psychology, a standard score (although this term sometimes has a slightly different meaning). To transform back from the Z scale to the X scale, the formula is

X = μ + σZ
There are two principal tables.
Table of ordinates. Table A 2 (p. 547) gives the ordinates or heights of the standard normal distribution. The formula for the ordinate is

Y = [1/√(2π)] e^(-Z²/2)
These ordinates are used when graphing the normal curve. Since the
curve is symmetrical about the origin, the heights are presented only for
positive values of Z. Here is a worked example.
EXAMPLE 1-Suppose that we wish to sketch the normal curve for a variate X that has μ = 3 and σ = 1.6. What is the height of this curve at X = 2?
Step 1. Find Z = (2 - 3)/1.6 = -0.625.
Step 2. Read the ordinate in table A 2 for Z = 0.625. In the table, the Z entries are given to two decimal places only. For Z = 0.62 the ordinate is 0.3292 and for Z = 0.63 the ordinate is 0.3271. Hence we take 0.328 for Z = 0.625.
Step 3. Finally, divide the ordinate 0.328 by σ, getting 0.328/1.6 = 0.205 as the answer. This step is needed because if you look back at the formula in section 2.1 for the ordinate of the general normal curve, you will see a σ in the denominator that does not appear in the tabulated curve.
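As a check on Example 1, here is a minimal Python sketch (our illustration; the book, of course, works from tables) that computes the ordinate directly from the formula:

```python
from math import exp, pi, sqrt

def normal_ordinate(x, mu, sigma):
    """Height of the normal curve with mean mu and s.d. sigma at the point x."""
    z = (x - mu) / sigma                              # standardize: Z = (X - mu)/sigma
    return exp(-z * z / 2) / (sqrt(2 * pi) * sigma)   # the sigma divisor is Step 3

print(round(normal_ordinate(2, 3, 1.6), 3))           # 0.205, agreeing with Example 1
```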
Table of the cumulative distribution. Table A 3 (p. 548) is much more frequently used than table A 2. This gives, for any positive value of Z, the area under the curve from the origin up to the point Z. It shows, for any positive Z, the probability that a variate drawn at random from the standard normal distribution will have a value lying between 0 and Z. The word cumulative is used because if we think of the frequency distribution of a very large sample, with many classes, the area under the curve represents the total or cumulative frequency in all classes lying between 0 and Z, divided by the total sample size so as to give a cumulative relative frequency. In the limit, as the sample size increases indefinitely, this becomes the probability that a randomly drawn member lies between 0 and Z.
As a reminder, the area tabulated in table A 3 is shown in figure 2.3.1.
Since different people have tabulated different types of area under the
normal curve, it is essential, when starting to use any table, to understand
clearly what area has been tabulated.
FIG. 2.3.1-The shaded area is the area tabulated in table A 3 for positive values of Z.

First, a quick look at table A 3. At Z = 0 the area is, of course, zero. At Z = 3.9, or any larger value, the area is 0.5000 to four decimal places. It follows that the probability of a value of Z lying between -3.9 and +3.9 is 1.0000 to four decimals, remembering that the curve is symmetrical about the origin. This means that any value drawn from a standard normal distribution is practically certain to lie between -3.9 and +3.9. At Z = 1.0, the area is 0.3413. Thus the probability of a value lying between -1 and +1 is 0.6826. This verifies a previous remark
(section 2.1) that over 2/3 of the observations in a normal distribution lie in the interval μ ± σ. Similarly, for Z = 2 the area is 0.4772, corresponding to the result that about 95% of the observations (more accurately, 95.44%) will lie between μ - 2σ and μ + 2σ.
When using table A 3 you will often want probabilities represented by areas different from those tabulated. If A is the area in table A 3, the following table shows how to obtain the probabilities most commonly needed.
TABLE 2.3.1
FORMULAS FOR FINDING PROBABILITIES RELATED TO THE NORMAL DISTRIBUTION

Probability of a Value                        Formula

(1) Lying between 0 and Z                     A
(2) Lying between -Z and Z                    2A
(3) Lying outside the interval (-Z, Z)        1 - 2A
(4) Less than Z (Z positive)                  0.5 + A
(5) Less than Z (Z negative)                  0.5 - A
(6) Greater than Z (Z positive)               0.5 - A
(7) Greater than Z (Z negative)               0.5 + A
Verification of these formulas is left as an exercise. A few more complex examples will be worked:
EXAMPLE 2-What is the probability that a normal deviate lies between -1.62 and +0.28? We have to split the interval into two parts: from -1.62 to 0, and from 0 to 0.28. From table A 3, the areas for the two parts are, respectively, 0.4474 and 0.1103, giving 0.5577 as the answer.
EXAMPLE 3-What is the probability that a normal deviate lies between -2.67 and -0.59? In this case we take the area from -2.67 to 0, namely 0.4962, and subtract from it the area from -0.59 to 0, namely 0.2224, giving 0.2738.
EXAMPLE 4-The heights of a large sample of men were found to be approximately normally distributed with mean = 67.56 inches and standard deviation = 2.57 inches. What proportion of the men have heights less than 5 feet 2 inches? We must first find Z:

Z = (X - μ)/σ = (62 - 67.56)/2.57 = -2.163

The probability wanted is the probability of a value less than Z, where Z is negative. We use formula (5) in table 2.3.1. Reading table A 3 at Z = 2.163, we get A = 0.4847, interpolating mentally between Z = 2.16 and Z = 2.17. From formula (5), the answer is 0.5 - A, or 0.0153. About 1.5% of the men have heights less than 5 ft. 2 in.

EXAMPLE 5-What height is exceeded by 5% of the men? The first step is to find Z: we use formula (6) in table 2.3.1, writing 0.5 - A = 0.05, so that A = 0.45. We now look in table A 3 for the value of Z such that A = 0.45. The value is Z = 1.645. Hence the actual height is

X = μ + σZ = 67.56 + (2.57)(1.645) = 71.79 inches,

just under 6 feet.
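For readers with a computer at hand, the areas of table A 3 can be reproduced from the error function. A short Python sketch (ours, not the book's) verifies Examples 2 and 4:

```python
from math import erf, sqrt

def area_0_to_z(z):
    """Area A of table A 3: area under the standard normal curve from 0 to z (z >= 0)."""
    return 0.5 * erf(z / sqrt(2))

# Example 4: proportion of men shorter than 62 in., with mu = 67.56, sigma = 2.57
z = (62 - 67.56) / 2.57                     # negative, so use formula (5): 0.5 - A
print(round(0.5 - area_0_to_z(abs(z)), 4))  # 0.0153

# Example 2: probability of a deviate between -1.62 and +0.28
print(round(area_0_to_z(1.62) + area_0_to_z(0.28), 4))  # 0.5576; the table gives 0.5577
```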
Some examples to be worked by the reader follow:
EXAMPLE 2.3.1-Using table A 2, (i) at the origin, what is the height of a normal curve with σ = 2? (ii) for any normal curve, at what value of X is the height of the curve one-tenth of the height at the origin? Ans. (i) 0.1994; (ii) at the value X = μ ± 2.15σ.
EXAMPLE 2.3.2-Using table A 3, show that 92.16% of the items in a normally distributed population lie between -1.76σ and +1.76σ.
EXAMPLE 2.3.3-Show that 65.24% of the items in a normal population lie between μ - 1.1σ and μ + 0.8σ.
EXAMPLE 2.3.4-Show that 13.59% of the items lie between Z = 1 and Z = 2.
EXAMPLE 2.3.5-Show that half the population lies in the interval from μ - 0.6745σ to μ + 0.6745σ. The deviation 0.6745σ, formerly much used, is called the probable error of X. Ans. You will have to use interpolation. You are seeking a value of Z such that the area from 0 to Z is 0.2500. Z = 0.67 gives 0.2486 and Z = 0.68 gives 0.2517. Since 0.2500 - 0.2486 = 0.0014, and 0.2517 - 0.2486 = 0.0031, we need to go 14/31 of the distance from 0.67 to 0.68. Since 14/31 = 0.45, the interpolate is Z = 0.6745.
EXAMPLE 2.3.6-Show that 1% of the population lies outside the limits Z = ±2.575.
EXAMPLE 2.3.7-For the heights of men, with μ = 67.56 inches and σ = 2.57 inches, what percentage of the population has heights lying between 5 feet 5 inches and 5 feet 10 inches? Compute your Z's to two decimals only. Ans. 67%.
EXAMPLE 2.3.8-The specification for a manufactured component is that the pressure at a certain point must not exceed 30 pounds. A manufacturer who would like to enter this market finds that he can make components with a mean pressure μ = 28 lbs., but the pressure varies from one specimen to another with a standard deviation σ = 1.6 lbs. What proportion of his specimens will fail to meet the specification? Ans. 10.6%.
EXAMPLE 2.3.9-By quality control methods it may be possible to reduce σ in the previous example while keeping μ at 28 lbs. If the manufacturer wishes only 2% of his specimens to be rejected, what must σ be? Ans. 0.98 lbs.
2.4-Estimators of μ and σ. While μ and σ are seldom known, they may be estimated from random samples. To illustrate the estimation of the parameters, we turn to the data reported from a study. In 1936 the Council on Foods of the American Medical Association sampled the vitamin C content of commercially canned tomato juice by analyzing a specimen from each of the 17 brands that displayed the seal of the Council (1). The vitamin C concentrations in mg. per 100 gm. are as follows (slightly altered for easier use):

16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25
Estimation of μ. Assuming random sampling from a normal population, μ is estimated by an average called the mean of the sample or, more briefly, the sample mean. This is calculated by the familiar process of dividing the sum of the observations, X, by their number. Representing the sample mean by X̄,

X̄ = 340/17 = 20 mg. per 100 grams of juice

The symbol X̄ is often called "bar-X" or "X-bar." We say that this sample mean is an estimator of μ or that μ is estimated by it.
Estimation of σ. The simplest estimator of σ is based on the range of the sample observations, that is, the difference between the largest and smallest measurements. For the vitamin C data,

range = 29 - 13 = 16 mg./100 gm.

From the range, sigma is estimated by means of a multiplier which depends on the sample size. The multiplier is shown in the column headed "σ/Range" in table 2.4.1 (2, 3). For n = 17, halfway between 16 and 18, the multiplier is 0.279, so that

σ is estimated by (0.279)(16) = 4.46 mg./100 gm.
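Both point estimates are quickly verified by machine. A short Python sketch (ours), with the multiplier 0.279 read from table 2.4.1 as above:

```python
vitamin_c = [16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25]

mean = sum(vitamin_c) / len(vitamin_c)            # estimator of mu
sample_range = max(vitamin_c) - min(vitamin_c)    # 29 - 13 = 16
sigma_hat = 0.279 * sample_range                  # multiplier for n = 17 from table 2.4.1

print(mean, sample_range, round(sigma_hat, 2))    # 20.0 16 4.46
```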
Looking at table 2.4.1 you will notice that the multiplier decreases as n becomes larger. This is because the sample range tends to increase as the sample size increases, although the population σ remains unchanged. Clearly, if we start with a sample of size 2 and keep adding to it, the range must either stay constant or go up with each addition.
Quite easily, then, we have made a point estimate of each parameter of a normal population; these estimators constitute a summary of the information contained in the sample. The sample mean cannot be improved upon as an estimate of μ, but we shall learn to estimate σ more efficiently. Also we shall learn about interval estimates and tests of hypotheses. Before doing so, it is worthwhile to examine our sample in greater detail.
The first point to be clarified is this: What population was represented by the sample of 17 determinations of vitamin C? We raised this question tardily; it is the first one to be considered in analyzing any sampling. The report makes it clear that not all brands were sampled, only the seventeen allowed to display the seal of the Council. The dates of the packs were mostly August and September of 1936, about a year before the analyses were made. The council report states that the vitamin concentration "may be expected to vary according to the variety of the fruit, the conditions under which the crop has been grown, the degree of ripeness and other factors." About all that can be said, then, is that the sampled population consisted of those year-old containers still available to the 17 selected packers.

TABLE 2.4.1
RATIO OF σ TO RANGE IN SAMPLES OF n FROM THE NORMAL DISTRIBUTION. EFFICIENCY
OF RANGE AS ESTIMATOR OF σ. NUMBER OF OBSERVATIONS WITH
RANGE TO EQUAL 100 WITH s

  n    σ/Range   Relative Efficiency   Number per 100  |   n    σ/Range   Relative Efficiency   Number per 100

  2    0.886          1.000                 100         |  12    0.307          0.815                123
  3     .591          0.992                 101         |  14     .294           .781                128
  4     .486           .975                 103         |  16     .283           .753                133
  5     .430           .955                 105         |  18     .275           .726                138
  6     .395           .933                 107         |  20     .268           .700                143
  7     .370           .912                 110         |  30     .245           .604                166
  8     .351           .890                 112         |  40     .231           .536                186
  9     .337           .869                 115         |  50     .222           .490                204
 10     .325           .850                 118         |
2_5-The array and its graphical representation. Some of the more
indmate features of a sample are shown by arranging the observations in
order of size, from low to high. in an array. The array of vitamin contents
is like this:
13,15.16.16,17.18,19.20,20.21.21.22.22.23•.23.25.29
For a small sample the array Serves some Qfthe same purposes as the fre-
quency distribution of a large one. .
The range, from 13 to 29, is now obvious. Also, attention is attracted to the concentration of the measures near the center of the array and to their thinning out at the extremes. In this way the sample may reflect the distribution of the normal population from which it was drawn. But the smaller the sample, the more erratic its reflection may be.
In looking through the vitamin C contents of the several brands, one is struck by their variation. What are the causes of this variation? Different processes of manufacture, perhaps, and different sources of the fruit. Doubtless, also, the specimens examined, being themselves samples of their brands, differed from the brand means. Finally, the laboratory technique of evaluation is never perfectly accurate. Variation is the essence of statistical data.
Figure 2.5.1 is a graphical representation of the foregoing array of 17 vitamin determinations. A dot represents each item. The distance of the dot from the vertical line at the left, proportional to the concentration of ascorbic acid in a brand specimen, is read in milligrams per 100 grams on the horizontal scale.

FIG. 2.5.1-Graphical representation of an array. Vitamin C data.
The diagram brings out vividly not only the variation and the concentration in the sample, but also two other characteristics: (i) the rather symmetrical occurrence of the values above and below the mean, and (ii) the scarcity of both extremely small and extremely large vitamin C contents, the bulk of the items being near the middle of the set. These features recur with notable persistence in samples from normal distributions. For many variables associated with living organisms there are averages and ranges peculiar to each, reflecting the manner in which each seems to express itself most successfully. These norms persist despite the fact that individuals enjoy a considerable freedom in development. A large part of our thinking is built around ideas corresponding to such statistics. Each of the words, pig, daisy, man, raises an image which is quantitatively described by summary numbers. It is difficult to conceive of progress in thought until memories of individuals are collected into concepts like averages and ranges of distributions.

2.6-Algebraic notation. The items in any set may be represented by

X₁, X₂, ... Xₙ,

where the subscripts 1, 2, ... n, may specify position in the set of n items (not necessarily an array). The three dots accompanying these symbols are read "and so on." Matching the symbols with the values in section 2.4,

X₁ = 16, X₂ = 22, ..., X₁₇ = 25 mg./100 gm.
The sample mean is represented by X̄, so that

X̄ = (X₁ + X₂ + ... + Xₙ)/n

This is condensed into the form,

X̄ = (ΣX)/n,

where X stands for every item successively. The symbol ΣX is read "summation X" or "sum of the X's." Applying this formula to the vitamin C concentrations,

ΣX = 340, and X̄ = 340/17 = 20 mg./100 gm.
2.7-Deviations from sample mean. The individual variations of the items in a set of data may be well expressed by the deviations of these items from some centrally located number such as the sample mean. For example, the deviation-from-mean of the first X-value is

16 - 20 = -4 mg. per 100 gm.

That is, this specimen falls short of X̄ by 4 mg./100 gm. Of special interest is the whole set of deviations calculated from the array in section 2.5:

-7, -5, -4, -4, -3, -2, -1, 0, 0, 1, 1, 2, 2, 3, 3, 5, 9
These deviations are represented graphically in figure 2.5.1 by the dis-
tances of the dots from the vertical line drawn through the sample mean.
Deviations are almost as fundamental in our thinking as are averages.
"What a whale of a pig" is a metaphor expressing astonishment at the
deviation of an individual's size from the speaker's concept of the normal.
Gossip and news are concerned chiefly with deviations from accepted
standards of behavior. Curiously, interest is apt to center in departures
from norm, rather than in that background of averages against which the
departures achieve prominence. Statistically, freaks are freaks only
because of their large deviations.
Deviations are represented symbolically by lower case letters. That is:

x₁ = X₁ - X̄
x₂ = X₂ - X̄
. . .
xₙ = Xₙ - X̄
Just as X may represent any of the items in a set, or all of them in succession, so x represents deviations from the sample mean. In general,

x = X - X̄

It is easy to prove the algebraic result that the sum of a set of deviations from the mean is zero; that is, Σx = 0. Look at the set of deviations x₁ = X₁ - X̄, and so on (foot of p. 42). Instead of adding the column of values xᵢ, we can obtain the same result by adding the column of values Xᵢ and subtracting the sum of the column of values X̄. The sum of the column of values Xᵢ is the expression ΣX. Further, since there are n items in a column, the sum of the column of values X̄ is just nX̄. Thus we have the result

Σx = ΣX - nX̄

But the mean X̄ = ΣX/n, so that nX̄ = ΣX, and the right-hand side is zero. It follows from this theorem that the mean of the deviations is also zero.
This result is useful in proving several standard statistical formulas. When it is applied to a specific sample of data, there is a slight snag. If the sample mean X̄ does not come out exactly, we have to round it. As a result of this rounding, the numerical sum of the deviations will not be exactly zero. Consider a sample with the values 1, 7, 8. The mean is 16/3, which we might round to 5.3. The deviations are then -4.3, +1.7, and +2.7, adding to +0.1. Thus in practice the sum of the deviations is zero, apart from rounding errors.
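A short Python check (ours) makes the rounding effect concrete:

```python
sample = [1, 7, 8]
mean = round(sum(sample) / len(sample), 1)   # 16/3 rounded to 5.3
deviations = [x - mean for x in sample]      # -4.3, +1.7, +2.7
print(round(sum(deviations), 1))             # 0.1, not exactly zero
```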
EXAMPLE 2.7.1-The weights of 12 staminate hemp plants in early April at College Station, Texas (9), were approximately:

13, 11, 16, 5, 3, 18, 9, 9, 8, 6, 27, and 7 grams

Array the weights and represent them graphically. Calculate the sample mean, 11 grams, and the deviations therefrom. Verify the fact that Σx = 0. Show that σ is estimated by 7.4 grams.
EXAMPLE 2.7.2-The heights of 11 men are 64, 70, 65, 69, 68, 67, 68, 67, 66, 72, and 61 inches. Compute the sample mean and verify it by summing the deviations. Are the numbers of positive and negative deviations equal, or only their sums?
EXAMPLE 2.7.3-The weights of 11 forty-year-old men were 148, 154, 158, 160, 161, 162, 166, 170, 182, 195, and 236 pounds. Notice the fact that only three of the weights exceed the sample mean. Would you expect weights of men to be normally distributed?

EXAMPLE 2.7.4-In a sample of 48 observations you are told that the standard deviation has been computed and is 4.0 units. Glancing through the data, you notice that the
lowest observation is 39 and the highest 76. Does the reported standard deviation look
reasonable?
EXAMPLE 2.7.5-Ten patients troubled with sleeplessness each received a nightly dose of a sedative for one period, while in another period they received no sedative (4). The average hours of sleep per night for each patient during each two-week period are as follows:

Patient      1    2    3    4    5    6    7    8    9   10

Sedative    1.3  1.1  6.2  3.6  4.9  1.4  6.6  4.5  4.3  6.1
None        0.6  1.1  2.5  2.8  2.9  3.0  3.2  4.7  5.5  6.2

Calculate the 10 differences, (Sedative - None). Might these differences be a sample from a normal population of differences? How would you describe this population? (You might want to ask for more information.) Assuming that the differences are normally distributed, estimate μ and σ for the population of differences. Ans. +0.75 hours and 1.72 hours.
EXAMPLE 2.7.6-If you have two sets of data that are paired as in the preceding example, and if you have calculated the resulting set of differences, prove algebraically that the sample mean of the differences is equal to the difference between the sample means of the two sets. Verify this result for the data in example 2.7.5.
2.8-Another estimator of σ; the sample standard deviation. The range, dependent as it is on only the two extremes in a sample, usually has a more variable sampling distribution than an estimator based on the whole set of deviations-from-mean in a sample, not just the largest and smallest. What kind of average is appropriate to summarize these deviations, and to estimate σ with the least sampling variation?
Clearly, the sample mean of the deviations is useless as an estimator because it is always zero. But a natural suggestion is to ignore the signs, calculating the sample mean of the absolute values of the deviations. The resulting measure of variation, the mean absolute deviation, had a considerable vogue in times past. Now, however, we use another estimator, more efficient and more flexible.
The sample standard deviation. This estimator, denoted by s, is the most widely used in statistical work. The formula defining s is

s = √[Σ(X - X̄)²/(n - 1)] = √[Σx²/(n - 1)]

First, each deviation is squared. Next, the sum of squares, Σx², is divided by (n - 1), one less than the sample size. The result is the mean square or sample variance, s². Finally, the extraction of the square root recovers the original scale of measurement. For the vitamin C concentrations, the calculations are set out in the right-hand part of table 2.8.1. Since the sum of squares of the deviations is 254 and n is 17, we have

s² = 254/16 = 15.88
s = √15.88 = 3.98 mg./100 gm.
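The same calculation in a short Python sketch (our illustration), following the defining formula exactly:

```python
from math import sqrt

vitamin_c = [16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25]

n = len(vitamin_c)
mean = sum(vitamin_c) / n
ss = sum((x - mean) ** 2 for x in vitamin_c)   # sum of squared deviations: 254
variance = ss / (n - 1)                        # mean square, with n - 1 = 16 d.f.
s = sqrt(variance)
print(ss, round(variance, 2), round(s, 2))     # 254.0 15.88 3.98
```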


Before further discussion of s is given, its calculation should be fixed in mind by working a couple of examples. Table A 18 is a table of square roots. Hints on finding square roots are given on p. 541.
TABLE 2.8.1
CALCULATION OF THE SAMPLE STANDARD DEVIATION

Observation     Vitamin C Concentration     Deviation       Deviation
Number          Mg. Per 100 gm.             From Mean       Squared
                X                           x = X - X̄       x²

 1              16                          -4              16
 2              22                          +2               4
 3              21                          +1               1
 4              20                           0               0
 5              23                          +3               9
 6              21                          +1               1
 7              19                          -1               1
 8              15                          -5              25
 9              13                          -7              49
10              23                          +3               9
11              17                          -3               9
12              20                           0               0
13              29                          +9              81
14              18                          -2               4
15              22                          +2               4
16              16                          -4              16
17              25                          +5              25

Totals         340                         -26 +26         254

EXAMPLE 2.8.1-In five patients with pneumonia, treated with sodium penicillin G, the numbers of days required to bring the temperature down to normal were 1, 4, 5, 7, 3. Compute s for these data and compare it with the estimate based on the range. Ans. s = 2.24 days. Range estimate = 2.58 days.

EXAMPLE 2.8.2-Calculate s for the hemp plant weights in example 2.7.1. Ans. 6.7 grams. Compare with your first estimate of σ.
The appearance of the divisor (n - 1) instead of n in computing s² and s is puzzling at first sight. The reason cannot be explained fully at this stage, being related to the computation of s from data of more complex structure. The quantity (n - 1) is called the number of degrees of freedom in s. Later in the book we shall meet situations in which the number of degrees of freedom is neither n nor (n - 1), but some other quantity. If the practice of using the degrees of freedom as divisor is followed, there is the considerable advantage that the same statistical tables, needed in important applications, serve for a wide variety of types of data.
Division by (n - 1) has one standard property that is often cited. If random samples are drawn from any indefinitely large population (not just a normally distributed one) that has a finite value of σ, then the average value of s², taken over all random samples, is exactly equal to σ². Any estimate whose average value over all possible random samples is equal to the population parameter being estimated is called unbiased. Thus, s² is an unbiased estimate of σ². This property, which says that on the average the estimate gives the correct answer, seems a desirable one for an estimate to possess. The property, however, is not as fundamental as one might think, because s is not an unbiased estimate of σ. If we want s to be an unbiased estimate of σ in normal populations, we must use a divisor that is neither (n - 1) nor n.

2.9-Comparison of the two estimators of σ. You now have two estimators of σ, one of them easier to calculate than the other, but less efficient. You need to know what is meant by "less efficient" and what governs the choice of estimate. Suppose that we draw a large number of random samples of size 10 from a normal population. For each sample we can compute the estimate of σ obtained from the range, and the estimate s. Thus we can form two frequency distributions, one showing the distribution of the range estimate, the other showing the distribution of s. The distribution of s is found to be more closely grouped about σ; that is, s usually gives a more accurate estimate of σ. Going a step further, it can be shown that the range estimate, computed from normal samples of size 12, has roughly the same frequency distribution as that of s in samples of size 10. We say that in samples of size 10 the relative efficiency of the range estimator to s is about 10/12, or more accurately 0.850. The relative efficiencies and the relative sample sizes appear in the third and fourth columns of table 2.4.1 (p. 40). In making a choice we have to weigh the cost of more observations. If observations are costly, it is cheaper to compute s.
Actually, both estimators are extensively used. Note that the relative efficiency of the range estimator remains high up to samples of sizes 8 to 10. In many operations, σ is estimated in practice by combining the estimates from a substantial number of small samples. For instance, in controlling the quality of an industrial process, small samples of the manufactured product are taken out and tested frequently, say every 15 minutes or every hour. Samples of size 5 are often used, the range estimator being computed from each sample and plotted on a time-chart. The efficiency of a single range estimate in a sample of size 5 is 0.955, and the average of a series of ranges has the same efficiency.
The estimate from the range is an easy approximate check on the computation of s. In these days, electronic computing machines are used more and more for routine computations. Unless the investigator has learned how to program, one consequence is that the details of his computations are taken out of his hands. Errors in making the programmers understand what is wanted and errors in giving instructions to the machines are common. There is therefore an increasing need for quick approximate checks on all the standard statistical computations, which the investigator can apply when his results are handed to him. If a table of σ/Range is not at hand, two rough rules may help. For samples up to size 10, divide the range by √n to estimate σ. Remember also:
If n is near             Then σ is roughly estimated
this number              by dividing range by

      5                              2
     10                              3
     25                              4
    100                              5
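These rough rules are easy to encode as a quick plausibility check on a reported s. A Python sketch (ours; the cutoffs between the rules are our own reading of the table above):

```python
def rough_sigma(data):
    """Quick-and-dirty estimate of sigma from the range, for an eyeball check of s."""
    n = len(data)
    data_range = max(data) - min(data)
    if n <= 10:
        divisor = n ** 0.5   # divide the range by sqrt(n) for small samples
    elif n <= 25:
        divisor = 4          # n near 25
    else:
        divisor = 5          # n near 100
    return data_range / divisor

vitamin_c = [16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25]
print(rough_sigma(vitamin_c))   # 16/4 = 4.0, close to the computed s = 3.98
```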

The range estimator and s are both sensitive to gross errors, because a gross error is likely to produce a highest or lowest sample member that is entirely false.
EXAMPLE 2.9.1-In a sample of size 2, with measurements X₁ and X₂, show that s is |X₁ - X₂|/√2 = 0.707|X₁ - X₂|, and that the range estimator is 0.886|X₁ - X₂|, where the vertical lines denote the absolute value. The reason for the different multipliers is that the range estimator is constructed to be an unbiased estimator of σ, while s is not, as already mentioned.
EXAMPLE 2.9.2-The birth weights of 20 guinea pigs were: 30, 30, 26, 32, 30, 23, 29, 31, 36, 30, 25, 34, 32, 24, 28, 27, 38, 31, 34, 30 grams. Estimate σ in 3 ways: (i) by the rough approximation, one-fourth of the range (Ans. 3.8 gm.); (ii) by use of the fraction, 0.268, in table 2.4.1 (Ans. 4.0 gm.); (iii) by calculating s (Ans. 3.85 gm.). N.B.: Observe the time required to calculate s.
EXAMPLE 2.9.3-In the preceding example, how many birth weights would be required to yield the same precision if the range were used instead of s? Ans. About 29 weights.
EXAMPLE 2.9.4-Suppose you lined up according to height 16 freshmen, then measured the height of the shortest, 64 inches, and the tallest, 72 inches. Would you accept the midpoint of the range, (64 + 72)/2 = 68 inches, as a rough estimate of μ, and 8/3 = 2.7 inches as a quick-and-easy estimate of σ?
EXAMPLE 2.9.5-In a sample of 3, the values are, in increasing order, X₁, X₂, and X₃. The range estimate of σ is 0.591(X₃ - X₁). If you are ingenious at algebra, show that s always lies between (X₃ - X₁)/2 = 0.5(X₃ - X₁) and (X₃ - X₁)/√3 = 0.578(X₃ - X₁). Verify the two extreme cases from the samples 0, 3, 6, in which s = 0.5(X₃ - X₁), and 0, 0, 6, in which s = 0.578(X₃ - X₁).

2.10-Hints on the computation of s. Two results in algebra help to shorten the calculation of s. Both give quicker ways of finding Σx². If G is any number, there is an algebraic identity to the effect that

Σx² = Σ(X - X̄)² = Σ(X - G)² - (ΣX - nG)²/n

An equivalent alternative form is

Σx² = Σ(X - X̄)² = Σ(X - G)² - n(X̄ - G)²
These expressions are useful when s has to be computed without the aid of a calculating machine (a task probably confined mainly to students nowadays). Suppose the sample total is ΣX = 350 and n = 17. The mean X̄ is 350/17 = 20.59. If the X's are whole numbers, it is troublesome to take deviations from a number like 20.59, and still more so to square the numbers without a machine. The trick is to take G (sometimes called the guessed or working mean) equal to 20. Find the deviations of the X's from 20 and the sum of squares of these deviations, Σ(X - G)². To get Σx², you have only to subtract n times the square of the difference between X̄ and G, or, in this case, 17(0.59)² = 5.92.
Proof of the identity. We shall denote a typical value in the sample by Xᵢ, where the subscript i goes from 1 to n. Write

Xᵢ - G = (Xᵢ - X̄) + (X̄ - G)

Squaring both sides, we have

(Xᵢ - G)² = (Xᵢ - X̄)² + 2(Xᵢ - X̄)(X̄ - G) + (X̄ - G)²

We now add over the n members of the sample. In the middle term on the right, the term 2(X̄ - G) is a constant multiplier throughout this addition, since this term does not contain the subscript i that changes from one member of the sample to another. Hence

Σ2(Xᵢ - X̄)(X̄ - G) = 2(X̄ - G)Σ(Xᵢ - X̄) = 0,

since, as we have seen previously, the sum of the deviations from the sample mean is always zero. This gives

Σ(Xᵢ - G)² = Σ(Xᵢ - X̄)² + n(X̄ - G)²,

noting that the sum of the constant term (X̄ - G)² over the sample is n(X̄ - G)². Moving this term to the other side, we get

Σ(Xᵢ - G)² - n(X̄ - G)² = Σ(Xᵢ - X̄)²

This completes the proof.
Incidentally, the result shows that for any value of G, Σ(Xᵢ - X̄)² is always smaller than Σ(Xᵢ - G)², unless G = X̄. The sample mean has the property that the sum of squares of deviations from it is a minimum.
The second algebraic result, a particular case of the first, is used when a calculating machine is available. Put G = 0 in the first result in this section. We get

Σx² = Σ(X - X̄)² = ΣX² - (ΣX)²/n

This result enables us to find Σx² without computing any of the deviations. For a set of positive numbers Xᵢ, most calculating machines will compute the sum of squares, ΣX², and the sum, ΣX, simultaneously, without writing down any intermediate figures. To get Σx², we square the sum, dividing by n, to give (ΣX)²/n, and subtract this from the original sum of squares, ΣX². The computation will be illustrated for the 17 vitamin C concentrations. Earlier, as mentioned, these data were altered slightly to simplify the presentation. The actual determinations were as follows:

16, 22, 21, 20, 23, 22, 17, 15, 13, 22, 17, 18, 29, 17, 22, 16, 23

The only figures that need be written down are shown in table 2.10.1.
TABLE 2.10.1
COMPUTING THE SAMPLE MEAN AND SUM OF SQUARES OF DEVIATIONS
WITH A CALCULATING MACHINE

n = 17                               ΣX² = 6,773
ΣX = 333                             (ΣX)²/n = 6,522.88
X̄ = 19.6 mg. per 100 gm.             Σx² = 250.12

s² = 250.12/16 = 15.63
s = √15.63 = 3.95

When using this method, remember that any constant number can be subtracted from all the Xᵢ without changing s. Thus if your data are numbers like 1032, 1017, 1005, and so on, they can be read as 32, 17, 5, and so on, when following the method in table 2.10.1.
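In Python (our illustration) the shortcut amounts to a few lines, reproducing table 2.10.1:

```python
from math import sqrt

data = [16, 22, 21, 20, 23, 22, 17, 15, 13, 22, 17, 18, 29, 17, 22, 16, 23]

n = len(data)
sum_x = sum(data)                     # sum of X: 333
sum_x2 = sum(x * x for x in data)     # sum of X squared: 6,773
ssd = sum_x2 - sum_x ** 2 / n         # shortcut: 6,773 - 333**2/17 = 250.12
s = sqrt(ssd / (n - 1))
print(round(ssd, 2), round(s, 2))     # 250.12 3.95
```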
EXAMPLE 2.10.1-For those who need practice in using a guessed mean, here is a set of numbers for easy computation:

15, 12, 10, 10, 10, 8, 7, 7, 4, 4, 1

First calculate X̄ = 8 and s = 4 by finding deviations from the sample mean. Then try various guessed means, such as 5, 10, and 1. Continue until you convince yourself that the answers, X̄ = 8 and s = 4, can be reached regardless of the value chosen for G. Finally, try G = 0. Note: With a guessed mean, X̄ can be found without having to add the Xᵢ, by the relation

X̄ = G + [Σ(X - G)]/n,

where the quantity Σ(X - G) is the sum of your deviations from the guessed mean G.
EXAMPLE 2.10.2-For the ten patients in a previous example, the average differences in hours of sleep per night between sedative and no sedative were (in hours): 0.7, 0.0, 3.7, 0.8, 2.0, -1.6, 3.4, -0.2, -1.2, -0.1. With a calculating machine, compute s by the shortcut method in table 2.10.1. Ans. s = 1.79 hrs. The range method gave 1.72 hrs.

EXAMPLE 2.10.3-Without finding deviations from X̄ and without using a calculating machine, compute Σx² for the following measurements: 961, 953, 970, 958, 950, 951, 957. Ans. 286.9.

2.11-The standard deviation of sample means. With measurement data, as mentioned previously, the purpose of an investigation is often to estimate an average or total over a population (average selling price of houses in a town, total wheat crop in a region). If the data are a random sample from a population, the sample mean X̄ is used to estimate the corresponding average over the population. Further, if the number of items N in the population is known, the quantity NX̄ is an estimator of the population total of the X's. This brings up the question: How accurate is a sample mean as an estimator of the population mean?
As usual, a question of this type can be examined either experimentally or mathematically. With the experimental approach, we first find or construct a population that seems typical of the type of population encountered in our work. Suppose that we are particularly interested in samples of size 100. We draw a large number of random samples of size 100, computing the sample mean X̄ for each sample. In this way we form a frequency distribution of the sample means, or graph the frequencies in a histogram. Since the mean of the population is known, we can find out how often the sample mean is satisfactorily close to the population mean, and how often it gives a poor estimate.
Much mathematical work has been done on this problem and it has produced two of the most exciting and useful results in the whole of statistical theory. These results, which are part of every statistician's stock in trade, will be stated first. Some experimental verification will then be presented for illustration. The first result gives the mean and standard deviation of X̄ in repeated sampling; the second gives the shape of the frequency distribution of X̄.
Mean and standard deviation of X̄. If repeated random samples of size n are drawn from any population (not necessarily normal) that has mean μ and standard deviation σ, the frequency distribution of the sample means X̄ in these repeated samples has mean μ and standard deviation σ/√n.
This result says that under random sampling the sample mean X̄ is an unbiased estimator of μ: on the average, in repeated sampling, it will be neither too high nor too low. Further, the sample means have less variation about μ than the original observations. The larger the sample size, the smaller this variation becomes.
Students sometimes find it difficult to reach the point at which the phrase "the standard deviation of X̄" has a concrete meaning for them. Having been introduced to the idea of a standard deviation, it is not too hard to feel at home with a phrase like "the standard deviation of a man's height," because every day we see tall men and short men, and realize that this standard deviation is a measure of the extent to which heights vary from one man to another. But usually when we have a sample, we calculate a single mean. Where does the variation come from? It is the variation that would arise if we drew repeated samples from the population that we are studying and computed the mean of each sample. The experimental samplings presented in this chapter and in chapter 3 may make this concept more realistic.
The standard deviation of X̄, σ/√n, is often called, alternatively, the standard error of X̄. The terms "standard deviation" and "standard error" are synonymous. When we are studying the frequency distribution of an estimator like X̄, its standard deviation supplies information about the amount of error in X̄ when used to estimate μ. Hence, the term "standard error" is rather natural. Normally, we would not speak of the standard error of a man's height, because if a man is unusually tall, this does not imply that he has made a mistake in his height.
The quantity NX̄, often used to estimate a total over the population, is also an unbiased estimator under random sampling. Since N is simply a fixed number, the mean of NX̄ in repeated sampling is Nμ, which, by the definition of μ, is the correct population total. The standard error of NX̄ is Nσ/√n. Another frequently used result is that the sample total, ΣX = nX̄, has a standard deviation nσ/√n, or σ√n.
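A short numerical illustration of these standard errors (ours, using the figures of example 2.12.2 below):

```python
from math import sqrt

sigma, n, N, x_bar = 3.0, 36, 196, 40.0   # s.d., sample size, population size, sample mean

se_mean = sigma / sqrt(n)                 # standard error of the mean: 0.5 lb.
total_estimate = N * x_bar                # N times X-bar: 7,840 lbs.
se_total = N * se_mean                    # N * sigma / sqrt(n): 98 lbs.
print(se_mean, total_estimate, se_total)
```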

2.12-The frequency distribution of sample means. The second major result from statistical theory is that, whatever the shape of the frequency distribution of the original population of X's, the frequency distribution of X̄ in repeated random samples of size n tends to become normal as n increases. To put the result more specifically, recall that if we wish to express a variable X in standard measure, so that its mean is zero and its standard deviation is 1, we change the variable from X to (X - μ)/σ. For X̄, the corresponding expression in standard measure is

(X̄ - μ)/(σ/√n)

As n increases, the probability that X̄, expressed in standard measure, lies between any two limits L₁ and L₂ becomes more and more equal to the probability that the standard normal deviate Z lies between L₁ and L₂. By expressing X̄ in standard measure, table A 3 (the cumulative normal distribution) can be used to approximate the probability that X̄ itself lies between any two limits. This result, known as the Central Limit Theorem (5), explains why the normal distribution and results derived from it are so commonly used with sample means, even when the original population is not normal. Apart from the condition of random sampling, the theorem requires very few assumptions: it is sufficient that σ is finite and that the sample is a random sample from the population.
To the practical worker, a key question is: how large must n be in order to use the normal distribution for X̄? Unfortunately, no simple general answer is available. With variates like the heights of men, the original distribution is near enough normal so that normality may be assumed for most purposes. In this case a sample with n = 1 is large enough. There are also populations, at first sight quite different from the normal, in which n = 4 or 5 will do. At the other extreme, some populations require sample sizes well over 100 before the distribution of X̄ becomes at all near to the normal distribution.
As illustrations of the Central Limit Theorem, the results of two sampling experiments will be presented. In the first, the population is the population of random digits 0, 1, 2, ... 9 which we met in chapter 1. This is a discrete population. The variable X has ten possible values 0, 1, 2, ... 9, and has an equal probability 0.1 of taking any of these values. The frequency distribution of X is represented in the upper part of figure 2.12.1. Clearly, the distribution does not look like a normal distribution. Distributions of this type are sometimes called uniform, since every value is equally likely.

FIG. 2.12.1-Upper part: Theoretical probability distribution of the random digits from 0 to 9. Lower part: Histogram showing the distribution of 400 means of samples of size 5 drawn from the random digits. The curve is the normal distribution with mean μ = 4.5 and standard deviation σ/√n = 2.872/√5 = 1.284.

Four hundred random samples of size 5 were drawn from the table of random digits (p. 543), each sample being a group of five consecutive numbers in a column. The frequency distribution of the sample means appears in the lower half of figure 2.12.1. A normal distribution with mean μ and standard deviation σ/√5 is also shown. The agreement is surprisingly good, considering that the samples are only of size 5.
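This experiment is easy to repeat by simulation. A minimal Python sketch (ours), with a pseudo-random generator standing in for the book's table of random digits:

```python
import random

random.seed(1)                      # any seed will do
means = []
for _ in range(400):                # 400 samples of size 5
    sample = [random.randint(0, 9) for _ in range(5)]
    means.append(sum(sample) / 5)

grand_mean = sum(means) / len(means)
sd = (sum((m - grand_mean) ** 2 for m in means) / (len(means) - 1)) ** 0.5
print(round(grand_mean, 2), round(sd, 2))   # close to the theoretical 4.5 and 1.28
```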
Calculation of μ and σ. In fitting this normal distribution, the quantities μ and σ were the mean and standard deviation of the original population of random digits. Although the calculation of X̄ and s for a sample has been discussed, we have not explained how to calculate μ and σ for a population. In a discrete population, denote the distinct values of the measurement X by X₁, X₂, ... X_k. In the population of random digits, k = 10, and each value has an equal probability, one-tenth. In a more general discrete population, the value Xᵢ may appear with probability or relative frequency Pᵢ. We could, for example, have a population of digits in which a 0 is 20 times as frequent as a 1. Since the probabilities must add to 1, we have

Σ Pᵢ = 1   (i = 1 to k)

The expression on the left is read "the sum of the Pᵢ from i equals 1 to k." The population mean μ is defined as

μ = Σ PᵢXᵢ

Like X̄ in a sample, the quantity μ is the average or mean of the values of Xᵢ in the population, noting, however, that each Xᵢ is weighted by its relative frequency of occurrence.
For the random digits, every Pᵢ = 0.1. Thus

μ = (0.1)(0 + 1 + 2 + 3 + 4 + 5 + 6 + 7 + 8 + 9) = (0.1)(45) = 4.5

The population σ comes from the deviations Xᵢ - μ. With the random digits, the first deviation is 0 - 4.5 = -4.5, and the successive deviations are -3.5, -2.5, -1.5, -0.5, +0.5, +1.5, +2.5, +3.5, and +4.5. The population variance, σ², is defined as

σ² = Σ Pᵢ(Xᵢ - μ)²

Thus, σ² is the weighted average of the squared deviations of the values in the population from the population mean. Numerically,

σ² = (0.2){(4.5)² + (3.5)² + (2.5)² + (1.5)² + (0.5)²} = 8.25

This gives σ = √8.25 = 2.872, so that σ/√5 = 1.284.
There is a shortcut method of finding σ² without computing any deviations; it is similar to the corresponding shortcut formula for Σx². The formula is:

σ² = Σ PᵢXᵢ² - μ²
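Both definitions are easy to verify by machine for the random digits. A Python sketch (ours):

```python
digits = range(10)
p = 0.1                                                   # equal probability for each digit

mu = sum(p * x for x in digits)                           # population mean: 4.5
var_direct = sum(p * (x - mu) ** 2 for x in digits)       # weighted squared deviations
var_shortcut = sum(p * x * x for x in digits) - mu ** 2   # sum(P*X^2) - mu^2
print(mu, round(var_direct, 2), round(var_shortcut, 2))   # 4.5 8.25 8.25
```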

With the normal distribution, μ is, as above, the average of the values of X, and σ² is the average of the squared deviations from the population mean. Since the normal population is continuous, having an infinite number of values, formulas from the integral calculus are necessary in writing down these definitions.
As a student or classroom exercise, drawing samples of size 5 from the random digit tables is recommended as an easy way of seeing the Central Limit Theorem at work. The total of each sample is quickly obtained mentally. To avoid divisions by 5, work with sample totals instead of means. The sample total, 5X̄, has mean (5)(4.5) = 22.5 and standard deviation (5)(1.284) = 6.420 in repeated sampling. In forming the frequency distribution, put the totals 20, 21, 22, 23 in the central class, each class containing four consecutive totals. Although rather broad, this grouping is adequate unless, say, 500 samples have been drawn.
The second sampling experiment illustrates the case in which a large sample size must be drawn if X̄ is to be nearly normal. This happens with populations that are markedly skew, particularly if there are a few values very far from the mean. The population chosen consisted of the sizes (number of inhabitants) of U.S. cities having over 50,000 inhabitants in 1950 (6), excluding the four largest cities. All except one have sizes ranging between 50,000 and 1,000,000. The exception, the largest city in the population, contained 1,850,000 inhabitants. The frequency distribution is shown at the top of figure 2.12.2. Note how asymmetrical the distribution is, the smallest class having much the highest frequency. The city with 1,850,000 inhabitants is not shown on this histogram: it would appear about 4 inches to the right of the largest class.
A set of 500 random samples with n = 25 and another set with n = 100 were drawn. The frequency distributions of the sample means appear in the middle and lower parts of figure 2.12.2. With n = 25, the distribution has moved towards the normal shape but is still noticeably asymmetrical. There is some further improvement towards symmetry with n = 100, but a normal curve would still be a poor fit. Evidently, samples of 400-500 would be necessary to use the normal approximation with any assurance. Part of the trouble is caused by the 1,850,000 city: the means for n = 100 would be more nearly normal if this city had been excluded from the population. On the other hand, the situation would be worse if the four largest cities had been included.
Combining the theorems in this and the previous section, we now have the very useful result that in samples of reasonable size, X̄ is approximately normally distributed about μ, with standard deviation or standard error σ/√n.
FIG. 2.12.2-Top part: Frequency distribution of the populations of 228 U.S. cities having populations over 50,000 in 1950. Middle part: Frequency distribution of the means of 500 random samples of size 25. Bottom part: Frequency distribution of the means of 500 random samples of size 100.
EXAMPLE 2.12.1-A population of heights of men has a standard deviation σ = 2.6 inches. What is the standard error of the mean of a random sample of (i) 25 men, (ii) 100 men? Ans. (i) 0.52 in. (ii) 0.26 in.
EXAMPLE 2.12.2-In order to estimate the total weight of a batch of 196 bags that are to be shipped, each of a random sample of 36 bags is weighed, giving X̄ = 40 lbs. Assuming σ = 3 lbs., estimate the total weight of the 196 bags and give the standard error of your estimate. Ans. 7,840 lbs.; standard error, 98 lbs.
EXAMPLE 2.12.3-In estimating the mean height of a large group of boys with σ = 1.5 in., how large a sample must be taken if the standard error of the mean height is to be 0.2 in.? Ans. 56 boys.
EXAMPLE 2.12.4-If perfect dice are thrown repeatedly, the probability is 1/6 that each of the faces 1, 2, 3, 4, 5, 6 turns up. Compute μ and σ for this population. Ans. μ = 3.5, σ = 1.71.
EXAMPLE 2.12.5-If boys and girls are equally likely, the probabilities that a family of size two contains 0, 1, 2 boys are, respectively, 1/4, 1/2, and 1/4. Find μ and σ for this population. Ans. μ = 1, σ = 1/√2 = 0.71.
EXAMPLE 2.12.6-The following sampling experiment shows how the Central Limit Theorem performs with a population simulating what is called a u-shaped distribution. In the random digits table, score 0, 1, 2, 3 as 0; 4, 5 as 1; and 6, 7, 8, 9 as 2. In this population, the probabilities of scores of 0, 1, 2 are 0.4, 0.2, and 0.4, respectively. This is a discrete distribution in which the central ordinate, 0.2, is lower than the two outside ordinates, 0.4. Draw a number of samples of size 5, using the random digits table. Record the total score for each sample. The distribution of total scores will be found fairly similar to the bell-shaped normal curve. The theoretical distribution of the total scores is as follows:

Score    0 or 10    1 or 9    2 or 8    3 or 7    4 or 6      5

Prob.      .010       .026      .077      .115      .182     .179

That is, the probability of a 0 and that of a 10 are both 0.010.

2.13-Confidence intervals for μ when σ is known. Given a random sample of size n from a population, where n is large enough so that X̄ can be assumed normally distributed, we are now in a position to make an interval estimate of μ. For simplicity, we assume in this section that σ is known. This is not commonly so in practice. In some situations, however, previous populations similar to the one now being investigated all have about the same standard deviation, which is known from these previous results. Further, the value of σ can sometimes be found from theoretical considerations about the nature of the population.
We first show how to find a 95% confidence interval. In section 2.1 it was pointed out that if a variate X is drawn from a normal distribution, the probability is about 0.95 that X lies between μ - 2σ and μ + 2σ. More exactly, the limits corresponding to a probability 0.95 are μ - 1.96σ and μ + 1.96σ. Apply this result to X̄, remembering that in repeated sampling X̄ has a standard deviation σ/√n. Thus, unless an unlucky 5% chance has come off, X̄ will lie between μ - 1.96σ/√n and μ + 1.96σ/√n. Expressing this as a pair of inequalities, we write

μ - 1.96σ/√n ≤ X̄ ≤ μ + 1.96σ/√n
apart from a 5% chance. These inequalities can be rewritten so that they provide limits for μ when we know X̄. The left-hand inequality is equivalent to the statement that

μ ≤ X̄ + 1.96σ/√n

In the same way, the right-hand inequality implies that

μ ≥ X̄ - 1.96σ/√n

Putting the two together, we reach the statement that unless an unlucky 5% chance occurred in drawing the sample,

X̄ - 1.96σ/√n ≤ μ ≤ X̄ + 1.96σ/√n

This is the 95% confidence interval for μ.
Similarly, the 99% confidence interval for μ is

X̄ - 2.58σ/√n ≤ μ ≤ X̄ + 2.58σ/√n

because the probability is 0.99 that a normal deviate Z lies between the limits -2.58 and +2.58.
To find the confidence interval corresponding to any confidence probability P, read from the cumulative normal table (table A 3) a value Z_P, say, such that the area given in the table is P/2. Then the probability that a normal deviate lies between -Z_P and +Z_P will be P. The confidence interval is

X̄ - Z_Pσ/√n ≤ μ ≤ X̄ + Z_Pσ/√n
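The whole recipe can be sketched in a few lines of Python (ours; the normal table is inverted by bisection on the error function):

```python
from math import erf, sqrt

def z_for_confidence(p):
    """Find Z_P such that the central area under the standard normal curve is p."""
    lo, hi = 0.0, 10.0
    while hi - lo > 1e-10:              # bisection on the central probability
        mid = (lo + hi) / 2
        if erf(mid / sqrt(2)) < p:      # erf gives the area from -z to +z
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def confidence_interval(x_bar, sigma, n, p=0.95):
    half_width = z_for_confidence(p) * sigma / sqrt(n)
    return x_bar - half_width, x_bar + half_width

# Illustration: the vitamin C mean, with sigma taken as 4.0 (roughly the range estimate)
print(confidence_interval(20.0, 4.0, 17))   # about (18.1, 21.9)
```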


One-sided confidence limits. Sometimes we want to find only an upper limit or a lower limit for μ, but not both. A company making large batches of a chemical product might have, as part of its quality control program, a regulation that each batch be tested to ensure that it does not contain more than 25 parts per million of a certain impurity, apart from a 1 in 100 chance. The test consists of drawing out n amounts of the product from the batch, and determining the concentration of impurity in each amount. If the batch is to pass the test, the 99% upper confidence limit for μ must be not more than 25 parts per million. Similarly, certain roots of tropical trees are a source of a potent insecticide whose concentration varies considerably from root to root. The buyer of a large shipment of these roots wants a guarantee that the concentration of the active ingredient in the shipment exceeds some stated value. It may be agreed between buyer and seller that the shipment is acceptable if, say, the 95% lower confidence limit for the average concentration μ exceeds the desired minimum.
To find a one-sided or one-tailed limit with confidence probability 95%, we want a normal deviate Z such that the area beyond Z in one tail is 0.05. In table A 3, the area from 0 to Z will be 0.45, and the value of Z is 1.645. Apart from a 5% chance in drawing the sample,

X̄ ≤ μ + 1.645σ/√n

This gives, as the lower 95% confidence limit for μ,

μ ≥ X̄ - 1.645σ/√n

The upper limit is X̄ + 1.645σ/√n. For a 99% limit the value of Z is 2.326. For a one-sided limit with confidence probability P (expressed as a proportion), read table A 3 to find the Z that corresponds to probability (P - 0.5).
2.14-Size of sample. The question: How large a sample must I take? is frequently asked by investigators. The question is not easy to answer. But if the purpose of the investigation is to estimate the mean of a population from the results of a sample, the methods in the preceding sections are helpful.
First, the investigator must state how accurate he would like his sample estimate to be. Does he want it to be correct to within 1 unit, 5 units, or 10 units, on the scale on which he is measuring? In trying to answer this question, he thinks of the purposes to which the estimate will be put, and tries to envisage the consequences of having errors of different amounts in the estimate. If the estimate is to be made in order to guide a specific business or financial decision, calculations may indicate the level of accuracy necessary to make the estimate useful. In scientific research it is often harder to do this, and there may be an element of arbitrariness in the answer finally given.
By one means or another, the investigator states that he would like his estimate to be correct to within some limit ±L, say. Since the normal curve extends from minus infinity to plus infinity, we cannot guarantee that X̄ is certain to lie between the limits μ - L and μ + L. We can, however, make the probability that X̄ lies between these limits as large as we please. In practice, this probability is usually set at 95% or 99%. For the 95% probability, we know that there is a 95% chance that X̄ lies between the limits μ - 1.96σ/√n and μ + 1.96σ/√n. This gives the equation

1.96σ/√n = L,

which is solved for n.
The equation requires a knowledge of σ, although the sample has not yet been drawn. From previous work on this or similar populations, the investigator guesses a value of σ. Since this guess is likely to be somewhat in error, we might as well replace 1.96 by 2 for simplicity. This gives the formula

n = 4σ²/L²

The formula for 99% probability is n = 6.6σ²/L².
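The calculation in a Python sketch (ours), reproducing example 2.14.6 below:

```python
from math import ceil

def sample_size(sigma_guess, limit, z=2.0):
    """n such that z * sigma / sqrt(n) <= limit; z = 2 for ~95%, 2.58 for 99%."""
    return ceil((z * sigma_guess / limit) ** 2)

print(sample_size(60, 20))         # 36 houses, as in example 2.14.6
print(sample_size(60, 20, 2.58))   # about 60 for 99% probability
```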
To summarize, the investigator must supply: (i) an upper limit L to the amount of error that he can tolerate in the estimate, (ii) the desired probability that the estimate will lie within this limit of error, and (iii) an advance guess at the population standard deviation σ. The formula for n is then very simple.
EXAMPLE 2.14.1-Find (i) the 80010. (ii) the 90% confidence limits for,.,.. given X and
". An•. (i) X ± 1.28a/.jn.(ii) X ± 1.64<1/.jn.
EXAMPLE 2.14.2-The heights of a random sample of 16 men from a population with
(1 in. are measured. What is the confidence probability that X does not differ from JJ
= 2.6
by more than 1 in.? Ans. P = 0.876.
EXAMPLE 2.t4.3-For the insecticide roots, the buyer wants assurance that the
average content of the active ingredient is at least 8 100. per 100 Ibs., apart from a 1-in-lOO
chance. A sample of9 bundles of roots drawn from the batch gives, on analysis, X = 10.2
Ibs. active ingredient per 100 Ibs. If q = 3.31bs. per 100 Ibs .. find the lower 99% confidence
limit for Ji. Does the batch meet the specification? Ans. Lower limit = 7.6lbs. per 100 lbs.
No.
EXAMPLE 2.14.4-In the auditing of a firm's accounts receivable, 100 entries were
checked out of a ledger containing 1,000 entries. For these 100 entries, the auditor's check
showed that the stated total amount receivable exceeded the correct amount receivable by
$214. Calculate an upper 95% confidence limit for the amount by which the reported total
receivable in the whole ledger exceeds the correct amount. Assume σ = $1.30 in the popu-
lation of the bookkeeping errors. Ans. $2,354. Note: for an estimated population total,
the formula for a one-sided upper limit for Nμ is NX̄ + NZσ/√n. Note also that you are
given the sample total nX̄ = $214.
EXAMPLE 2.14.5-When measurements are rounded to the nearest whole number, it
can often be assumed that the error due to rounding is equally likely to lie anywhere between
-0.5 and +0.5. That is, rounding errors follow a uniform distribution between the limits
-0.5 and +0.5. From theory, this distribution has μ = 0, σ = 1/√12 = 0.29. If 100 inde-
pendent, rounded measurements are added, what is the probability that the error in the
total due to rounding does not exceed 5? Ans. P = 0.916.
EXAMPLE 2.14.6-In the part of a large city in which houses are rented, an economist
wishes to estimate the average monthly rent correct to within ±$20, apart from a 1-in-20
chance. If he guesses that σ is about $60, how many houses must he include in his sample?
Ans. n = 36.
EXAMPLE 2.14.7-Suppose that in the previous example the economist would like
99% probability that his estimate is correct to within $20. Further, he learns that in a recent
sample of 100 houses, the lowest rent was $30 and the highest was $260. Estimating σ from
these data, find the sample size needed. Ans. n = 36. This estimate is, of course, very rough.
EXAMPLE 2.14.8-Show that if we wish to cut the limit of error from L to L/2, the
sample size must be quadrupled. With the same L, if we wish 99% probability of being
within the limit rather than 95% probability, what percentage increase in sample size is
required? Ans. about 65% increase.

2.15-"Student's" t-distribution. In most applications in which
sample means are used to estimate population means, the value of σ
is not known. We can, however, obtain an estimate s of σ from the sample
data that give us the value of X̄. If the sample is of size n, the estimate s
is based on (n - 1) degrees of freedom. We require a distribution that will
enable us to compute confidence limits for μ, knowing s but not σ. Known
as "Student's" t-distribution, this result was discovered by W. S. Gosset
in 1908 (7) and perfected by R. A. Fisher in 1926 (8). This distribution
has revolutionized the statistics of small samples. In the next chapter you
will be asked to verify the distribution by the same kind of sampling
process you used for chi-square; indeed, it was by such sampling that
Gosset first learned about it.

FIG. 2.15.1-Distribution of t with 4 degrees of freedom. The shaded areas comprise
5% of the total area (2.5% in each tail). The distribution is more peaked in the center
and has higher tails than the normal.
The quantity t is given by the equation

    t = (X̄ - μ)/(s/√n)

That is, t is the deviation of the estimated mean from that of the popula-
tion, measured in terms of s/√n as the unit. We do not know μ, though
we may have some hypothesis about it. Without μ, t cannot be calcu-
lated; but its sampling distribution has been worked out.
The denominator, s/√n, is a useful quantity estimating σ/√n, the
standard error of X̄.
The distribution of t is laid out in table A 4, p. 549. In large samples
it is practically normal with μ = 0 and σ = 1. It is only for samples of less
than 30 that the distinction becomes obvious.
Like the normal, the t-distribution is symmetrical about the mean.
This allows the probability in the table to be stated as that of a larger
absolute value, sign ignored. For a sample of size 5, with 4 degrees of
freedom, figure 2.15.1 shows such values of t in the shaded areas; 2.5%
of them are in one tail and 2.5% in the other. Effectively, the table shows
the two halves of the figure superimposed, giving the sum of the shaded
areas (probabilities) in both.
EXAMPLE 2.15.1-In the vitamin C sampling of table 2.8.1, s_X̄ = 3.98/√17 = 0.965
mg./100 gm. Set up the hypothesis that μ = 17.954 mg./100 gm. Calculate t. Ans. 2.12.
EXAMPLE 2.15.2-For the vitamin C sample, degrees of freedom = 17 - 1 = 16, the
denominator of the fraction giving s². From table A 4, find the probability of a value of t
larger in absolute value than 2.12. Ans. 0.05. This means that, among random samples of
n = 17 from normal populations, 5% of them are expected to have t-values below -2.12 or
above 2.12.
EXAMPLE 2.15.3-If samples of n = 17 are randomly drawn from a normal population
and have t calculated for each, what is the probability that t will fall between -2.12 and
+2.12? Ans. 0.95.
EXAMPLE 2.15.4-If random samples of n = 17 are drawn from a normal population,
what is the probability of t greater than 2.12? Ans. 0.025.
EXAMPLE 2.15.5-What size of sample would have |t| > 2 in 5% of all random
samples from normal populations? Ans. 61. (Note the symbol for "absolute value," that is,
ignoring signs.)
EXAMPLE 2.15.6-Among very large samples (df = ∞), what value of t would be
exceeded in 2.5% of them? Ans. 1.96.

2.16-Confidence limits for μ based on the t-distribution. With σ
known, the 95% limits for μ were given by the relations

    X̄ - 1.96σ/√n ≤ μ ≤ X̄ + 1.96σ/√n

When σ is replaced by s, the only change needed is to replace the number
1.96 by a quantity which we call t_0.05. To find t_0.05, read table A 4 in the
column headed 0.050 and find the value of t for the number of degrees of
freedom in s. When the df are infinite, t_0.05 = 1.960. With 40 df, t_0.05
has increased to 2.021, with 20 df it has become 2.086, and it continues
to increase steadily as the number of df decline.
The inequalities giving the 95% confidence limits then become

    X̄ - t_0.05·s/√n ≤ μ ≤ X̄ + t_0.05·s/√n
As illustration, recall the vitamin C determinations in table 2.8.1; n = 17,
X̄ = 20 and s_X̄ = 0.965 mg./100 gm. To get the 95% confidence interval
(interval estimate):
1. Enter the table with df = 17 - 1 = 16 and in the column headed
0.05 take the entry, t_0.05 = 2.12.
2. Calculate the quantity,

    t_0.05·s_X̄ = (2.12)(0.965) = 2.05 mg./100 gm.

3. The confidence interval is from

    20 - 2.05 = 17.95 to 20 + 2.05 = 22.05 mg./100 gm.

If you say that μ lies inside the interval from 17.95 to 22.05 mg./100 gm.,
you will be right unless a 1-in-20 chance has occurred in the sampling.
The point and 95% interval estimate of μ may be summarized this
way: 20 ± 2.05 mg./100 gm.
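The same limits can be checked by machine. The sketch below (in Python,
assuming the scipy library is available) reads the 5% point of t for 16
degrees of freedom and reproduces the interval just computed:

    from scipy import stats

    n, xbar, s_xbar = 17, 20.0, 0.965        # vitamin C summary figures
    t05 = stats.t.ppf(0.975, df=n - 1)       # two-tailed 5% point of t; 2.12 for 16 df
    print(xbar - t05 * s_xbar, xbar + t05 * s_xbar)  # about 17.95 and 22.05 mg./100 gm.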
The proof of this result is similar to that given when σ is known.
Although μ is unknown, the drawing of a random sample creates a value of

    t = (X̄ - μ)/(s/√n)

that follows Student's t-distribution with (n - 1) df. Now the quantity
t_0.05 in table A 4 was computed so that the probability is 0.95 that a value
of t drawn at random lies between -t_0.05 and +t_0.05. Thus, there is a
95% chance that

    -t_0.05 ≤ (X̄ - μ)/(s/√n) ≤ +t_0.05

Multiply throughout by s/√n, and then add μ to each term in the in-
equalities. This gives, with 95% probability,

    μ - t_0.05·s/√n ≤ X̄ ≤ μ + t_0.05·s/√n

The remainder of the proof is exactly the same as for σ known. The limits
may be expressed more compactly as X̄ ± t_0.05·s_X̄. For a one-sided 95%
limit, use t_0.10 in place of t_0.05.

EXAMPLE 2.16.1-The yields of alfalfa from 10 plots were 0.8, 1.3, 1.5, 1.7, 1.7, 1.8,
2.0, 2.0, 2.0, and 2.2 tons per acre. Set 95% limits on the mean of the population of which
this is a random sample. Ans. 1.41 and 1.99 tons per acre.
EXAMPLE 2.16.2-In an investigation of growth in school children in private schools,
the sample mean height of 265 boys of age 13 1/2-14 1/2 years was 63.84 in. with standard
deviation s = 3.08 in. What is the 95% confidence interval for μ? Ans. 63.5 to 64.2 in.
EXAMPLE 2.16.3-In a check of a day's work for each of a sample of 16 women
engaged in tedious, repetitive work, the average number of minor errors per day was 5.6,
with a sample s.d. of 3.6. Find (i) a 90% confidence interval for the population mean
number of errors, (ii) a one-sided upper 90% limit to the population mean number of errors.
Ans. (i) 4.0 to 7.2, (ii) 6.8.
EXAMPLE 2.16.4-We have stated that the t-distribution differs clearly from the
normal distribution only for samples of size less than 30. For a given value of s_X̄, how much
wider is (i) the 95%, (ii) the 99% confidence interval when the sample size is 30 than when the
sample size is very large? Are there sample sizes for which the 95% and 99% intervals
become twice as wide, for the same s_X̄, as with very large samples? Ans. (i) 4.3% wider,
(ii) 7.0% wider, since s_X̄ has 29 df. For a sample of size 3 (2 df) the 95% interval is twice
as wide, and for a sample of size 4 the 99% interval is twice as wide. With small samples, s
is not a good estimate of σ, and the confidence limits widen to allow for the chance that the
sample s is far removed from the true σ.
2.17-Relative variation. Coefficient of variation. In describing the
amount of variation in a population, a measure often used is the coefficient
of variation, C = σ/μ. The sample estimate is s/X̄. The standard devi-
ation is expressed as a fraction, or sometimes as a percentage, of the mean.
The utility of this measure lies partly in the fact that in many series the
mean and standard deviation tend to change together. This is illustrated
by the mean stature and corresponding standard deviation of girls from
1 to 18 years of age shown graphically in figure 2.17.1. Until the twelfth
year the standard deviation increases at a somewhat greater rate, relative
to its mean, than does stature, causing the coefficient of variation to rise,
but by the seventeenth year and thereafter C is back to where it started.
Without serious discrepancy one may fix in mind the figure, C = 3.75%,
as the relative standard deviation of adult human stature, male as well as
female. More precisely, the coefficient rises rather steadily from infancy
through puberty, falls sharply during a brief period of uniformity, then
takes on its permanent value near 3.75%.
A knowledge of relative variation is valuable in evaluating experi-
ments. After the statistics of an experiment are summarized, one may
judge of its success partly by looking at C. In corn variety trials, for exam-
ple, although mean yield and standard deviation vary with location and
season, yet the coefficient of variation is often between 5% and 15%.
Values outside this interval cause the investigator to wonder if an error
has been made in calculation, or if some unusual circumstances throw
doubt on the validity of the experiment. Similarly, each sampler knows
what values of C may be expected in his own data, and is suspicious of
any great deviation. If another worker with the same type of measure-
ment reports C values much smaller than one's own, it is worthwhile to
try to discover why, since the reason may suggest ways of improving one's
precision.

FIG. 2.17.1-Graph of 3 time series: stature, standard deviation, and coefficient of varia-
tion of girls from 1 to 18 years of age. See reference (1).
Other uses of the coefficient of variation are numerous but less prev-
alent. Since C is the ratio of two averages having the same unit of mea-
surement it is itself independent of the unit employed. Thus, C is the same
whether inches, feet, or centimeters are used to measure height. Also,
the coefficient of variation of the yield of hay is comparable to that of the
yield of corn. Experimental animals have characteristic coefficients of
variation, and these may be compared despite the diversity of the variables
measured. Such information is often useful in guessing a value of σ for
the estimation of sample size as in section 2.14.
Like many other ratios, the coefficient of variation is so convenient
that some people overlook the information contained in the original data.
Try to imagine how limited you would be in interpreting the stature-of-
girls coefficients if they were not accompanied by X̄ and s. You would
not know whether an increase in C is due to a rising s or a falling X̄, nor
whether the saw-tooth appearance of the C-curve results from irregulari-
ties in one or both of the others, unless indeed you could supply the facts
from your own fund of knowledge. The coefficient is informative and use-
ful in the presence of X̄ and s, but abstracted from them it may be mis-
leading.
EXAMPLE 2.17.1-In experiments involving chlorophyll determinations in pineapple
plants (10), the question was raised as to the method that would give the most consistent
results. Three bases of measurement were tried, each involving 12-leaf samples, with the
statistics reported below. From the coefficients of variation, it was decided that the methods
were equally reliable, and the most convenient one could be chosen with no sacrifice of pre-
cision.

STATISTICS OF CHLOROPHYLL DETERMINATIONS OF 12-LEAF SAMPLES FROM PINEAPPLE
PLANTS, USING THREE BASES OF MEASUREMENT

                                            100-gram    100-gram    100-sq. cm.
Statistic                                   Wet Basis   Dry Basis   Basis

Sample Mean (milligrams)                      61.4        337         13.71
Sample Standard Deviation (milligrams)         5.22        31.2        1.20
Coefficient of Variation (per cent)            8.5          9.3         8.8
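As a check on the table, the three coefficients can be recomputed directly
from the means and standard deviations; a small Python sketch, purely
illustrative:

    means = {"wet basis": 61.4, "dry basis": 337.0, "per sq. cm.": 13.71}
    sds   = {"wet basis": 5.22, "dry basis": 31.2,  "per sq. cm.": 1.20}
    for basis in means:
        # C = s / xbar, expressed in per cent
        print(basis, round(100.0 * sds[basis] / means[basis], 1))  # 8.5, 9.3, 8.8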

EXAMPLE 2.17.2-In a certain laboratory there is a colony of rats in which the coeffi-
cient of variation of the weights of males between 56 and 84 days of age is close to 13%.
Estimate the sample standard deviation of the weights of a lot of these rats whose sample
mean weight is 200 grams. Ans. 26 grams.
EXAMPLE 2.17.3-If C is the coefficient of variation in a population, show that the
coefficient of variation of the mean of a random sample of size n is C/√n in repeated sampling.
Does the same result hold for the sample total? Ans. Yes.

EXAMPLE 2.17.4-If the coefficient of variation of the gain in weight of a certain
animal over a month is 10%, what would you expect the coefficient of variation of the gain
over a four-month period to be? Ans. The answer is complicated, and cannot be given
fully at this stage. If σ and μ were the same during each of the four months, and if the
gains were independent from month to month, the answer would be C/√4 = C/2, by the
result in the preceding example. But animals sometimes grow by spurts, so that the gains in
successive periods may not be independent, and our formula for the standard deviation of
a sample does not apply in this case. The answer is likely to lie between C and C/2. The
point will be clarified when we study correlation.
REFERENCES
1. COUNCIL ON FOODS. JAMA, 110:651 (1938).
2. E. S. PEARSON. Biometrika, 24:416 (1932).
3. L. H. C. TIPPETT. Biometrika, 17:386 (1925).
4. A. R. CUSHNY and A. R. PEEBLES. Amer. J. Physiol., 32:501 (1905).
5. A. M. MOOD and F. A. GRAYBILL. Introduction to the Theory of Statistics, 2nd ed.
McGraw-Hill, New York (1963).
6. Statistical Abstract of the United States, U.S. GPO, Washington, D.C. (1959).
7. "STUDENT." Biometrika, 6:1 (1908).
8. R. A. FISHER. Metron, 5:90 (1926).
9. P. J. TALLEY. Plant Physiol., 9:737 (1934).
10. R. K. TAM and O. C. MAGISTAD. Plant Physiol., 10:161 (1935).
CHAPTER THREE

Experimental sampling from a normal population

3.1-Introduction. In chapter 1 the facts about confidence intervals
for a proportion were verified through experimental sampling. This
same device illustrated the theoretical distribution of chi-square that forms
the basis of the test of a null hypothesis about the population proportion.
In chapter 2 the results of two experimental samplings were presented to
show that the distribution of means of random samples tends to approxi-
mate the normal distribution with standard deviation σ/√n, as predicted
by the Central Limit Theorem.
In this chapter we present further experimental samplings from a
population simulating the normal, with instructions so that the reader
can perform his own samplings. The purposes are as follows:
(1) To provide additional verification of the result that the sample
means are normally distributed with S.D. = σ/√n.
(2) To investigate the sampling distribution of s², regarded as an
estimate of σ², and of s, regarded as an estimate of σ. Thus far we have
not been much concerned with the question: How good an estimate of
σ² is s²? The frequency distribution of s² in normal samples has, however,
been worked out and tabulated. Apart from a multiplier, it is an extended
form of the chi-square distribution which we met in chapter 1.
(3) To illustrate the sampling distribution of t with 9 degrees of
freedom, by comparing the values of t found in the experimental sampling
with the theoretical distribution.
(4) To verify confidence interval statements based on the t-distribu-
tion.
The population that we have devised to simulate a normal population
departs from it in two respects: it is limited in size and range instead of
being infinite, and has a discontinuous variate instead of the continuous
one implied in the theory. The effects of these departures will scarcely be
noticed, because they are small in comparison with sampling variation.

3.2-A finite population simulating the normal. In table 3.2.1 are
the weight gains of a hundred swine, slightly modified from experimental
data so as to form a distribution which is approximately normal with
TABLE 3.2.1
ARRAY OF GAINS IN WEIGHT (POUNDS) OF 100 SWINE DURING A PERIOD OF 20 DAYS
The gains approximate a normal distribution with
μ = 30 pounds and σ = 10 pounds

Item         Item         Item         Item
Number Gain  Number Gain  Number Gain  Number Gain

00      3    25     24    50     30    75     37
01      7    26     24    51     30    76     37
02     11    27     24    52     30    77     38
03     12    28     25    53     30    78     38
04     13    29     25    54     30    79     39
05     14    30     25    55     31    80     39
06     15    31     26    56     31    81     39
07     16    32     26    57     31    82     40
08     17    33     26    58     31    83     40
09     17    34     26    59     32    84     41
10     18    35     27    60     32    85     41
11     18    36     27    61     33    86     41
12     18    37     27    62     33    87     42
13     19    38     28    63     33    88     42
14     19    39     28    64     33    89     42
15     19    40     28    65     33    90     43
16     20    41     29    66     34    91     43
17     20    42     29    67     34    92     44
18     21    43     29    68     34    93     45
19     21    44     29    69     35    94     46
20     21    45     30    70     35    95     47
21     22    46     30    71     35    96     48
22     22    47     30    72     36    97     49
23     23    48     30    73     36    98     53
24     23    49     30    74     36    99     57

μ = 30 pounds and σ = 10 pounds. The items are numbered from 00 to
99 in order that they may be identified easily with corresponding numbers
taken from the table of random digits. The salient features of this kind
of distribution may be discerned in figure 3.2.1. The gains, clustering at
the midpoint of the array, thin out symmetrically, slowly at first, then more
and more rapidly: two-thirds of the gains lie in the interval 30 ± 10
pounds, that is, in an interval of two standard deviations centered on the
mean. In a real population, indefinitely great in number of individuals,
greater extremes doubtless would exist, but that need cause us little con-
cern.
The relation of the histogram to the array is clear. After the class
bounds are decided upon, it is necessary merely to count the dots lying
between the vertical lines, then make the height of the rectangle propor-
tional to their number. The central value, or class mark, of each interval
is indicated on the horizontal scale of gains.
In table 3.2.2 is the frequency distribution which is graphically repre-
sented in figure 3.2.1. Only the class marks are entered in the first row.
The class intervals are from 2.5 to 7.5, etc.

FIG. 3.2.1-Upper part: Graphical representation of array of 100 normally distributed
gains in weight. Lower part: Histogram of same gains. The altitude of a rectangle in the
histogram is proportional to the number of dots in the array which lie between the vertical
sides.
TABLE 3.2.2
FREQUENCY DISTRIBUTION OF GAINS IN WEIGHT OF 100 SWINE
(A finite population approximating the normal)

Class mark (pounds)   5  10  15  20  25  30  35  40  45  50  55

Frequency             2   2   6  13  15  23  16  13   6   2   2

3.3-Random samples from a normal distribution. An easy way to
draw random samples from the table of pig gains is to take numbers con-
secutively from the table of random numbers, table A 1, then match them
with the gains by means of the integers, 00 to 99, in table 3.2.1. To avoid
duplicating the samples of others in class work, start at some randomly
selected point in the table of random numbers instead of at the beginning,
then proceed upward, downward, or crosswise. Suppose you have hit
upon the digit, 8, in row 71, column 29. This, with the following digit, 3,
specifies pig number 83 in table 3.2.1, a pig whose gain is 40 pounds.
Hence, 40 pounds is the first number of the sample. Moving upward
among the random numbers you read the integers 09, 75, 90, etc., and
record the corresponding gains from the table, 17, 37, and 43 pounds.
Continuing, you get as many gains and as many samples as you wish.
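A reader with a computer can let it play the part of the random number
table. The sketch below (in Python; illustrative, not part of the original
exercise) rebuilds an equivalent population from the frequency distribu-
tion of table 3.2.2 and draws samples with replacement:

    import random

    # class marks and frequencies of table 3.2.2; mu = 30 and sigma = 10
    # are closely reproduced
    marks = [5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
    freqs = [2,  2,  6, 13, 15, 23, 16, 13,  6,  2,  2]
    population = [m for m, f in zip(marks, freqs) for _ in range(f)]

    def draw_sample(n=10):
        # with replacement: a gain may be drawn as often as its item
        # number turns up among the random digits
        return [random.choice(population) for _ in range(n)]

    print(draw_sample())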
Samples of 10 are suggested. For our present purposes all the sam-
ples must be of the same size because the distributions of their statistics
TABLE 3.3.1
FOUR SAMPLES OF 10 ITEMS DRAWN AT RANDOM FROM THE PIG GAINS OF TABLE 3.2.1,
EACH FOLLOWED BY STATISTICS TO BE EXPLAINED IN SECTIONS 3.4-3.8

Item Number                        Sample Number
and Formulas                 1          2          3          4

 1                          33         32         39         17
 2                          53         31         34         22
 3                          34         11         33         20
 4                          29         30         33         19
 5                          39         19         33          3
 6                          57         44         39         21
 7                          12         24         36          3
 8                          24         53         32         25
 9                          39         19         32         40
10                          36         30         30         21

X̄                         35.6       29.3       34.1       19.1
s²                        169.1      151.6        9.0      112.3
s                          13.0       12.3        3.0       10.6
s_X̄ = s/√n                 4.11       3.89       0.95       3.35
t = (X̄ - 30)/s_X̄           1.36      -0.18       4.32      -3.25
t_0.05·s_X̄                  9.3        8.8        2.2        7.6
X̄ ± t_0.05·s_X̄         26.3-44.9  20.5-38.1  31.9-36.3  11.5-26.7
change with n. It is well to record the items in columns, leaving a half
dozen lines below each for subsequent computations. For your guidance,
four samples are listed in table 3.3.1. The computations below them will
be explained as we go along. Draw as many of the samples as you think
you can process within the time at your command. If several are working
together, the results of each can be made available to all. Keep the records
carefully, because you will need them again and again.
Each pig gain may be drawn as often as its number appears in the
table of random digits; it is not withdrawn from circulation after being
taken once. Thus, the sampling is always from the same population, and
the probability of drawing any particular item is constant throughout
the process.
EXAMPLE 3.3.1-Determine the range in each of your samples of n = 10. The
mean of the ranges estimates σ/0.325 (table 2.4.1); that is, 10/0.325 = 30.8. How close is
your estimate?

3.4-The distribution of sample means. First add the items in each
sample, then put down the sample mean, X̄ (division is by 10). While
every mean is an estimator of μ = 30 pounds, there is yet great variation
among them. Make an array of the means of all your samples. If there
are enough of them, group them into a frequency distribution like table
3.4.1.
Our laboratory means ranged from 19 to 39 pounds, perhaps to the
novice a disconcerting variability. To assess the meaning of this, try
to imagine doing an experiment resulting in one of these more divergent
mean gains instead of the population value, 30 pounds. Having no infor-
mation about the population except that furnished by the sample, you
would be considerably misled. There is no way to avoid this hazard.
One of the objects of the experimental samplings is to acquaint you with
the risks involved in all conclusions based on small portions of the aggre-
gate. The investigator seldom knows the parameters of the sampled
population; he knows only the sample estimates. He learns to view his
experimental data in the light of his experience of sampling error. His
judgments must involve not only the facts of his sample but all the related
information which he and others have accumulated.
The more optimistic draw satisfaction from the large number of
means near the center of the distribution. If this were not characteristic,
sampling would not be so useful and popular. The improbability of
getting poor estimates produces a sense of security in making inferences.
Fitting the normal distribution. In constructing table 3.4.1, one-pound
class intervals were used. Since all the means come out exactly to one
decimal place, the class limits were taken as 19.5-20.4, 20.5-21.4, and
so on.
From theory, the distribution of sample means should be very close
to normal, with mean μ = 30 pounds and standard deviation σ_X̄ = 10/√10
= 3.162 pounds. The theoretical frequencies appear in the right-hand
TABLE 3.4.1
FREQUENCY DISTRIBUTION OF 511 MEANS OF SAMPLES OF 10 DRAWN FROM
THE PIG GAINS IN TABLE 3.2.1

Class Limits       Observed     Theoretical
(Pounds)           Frequency    Frequency

Less than 19.5         1            0.20
19.5-20.4              1            0.46
20.5-21.4              0            1.12
21.5-22.4              7            2.56
22.5-23.4              5            5.47
23.5-24.4             10           10.48
24.5-25.4             19           18.09
25.5-26.4             30           28.46
26.5-27.4             41           40.52
27.5-28.4             48           52.12
28.5-29.4             66           60.76
29.5-30.4             72           64.18
30.5-31.4             56           61.32
31.5-32.4             46           53.25
32.5-33.4             45           41.65
33.5-34.4             22           29.59
34.5-35.4             24           19.11
35.5-36.4             12           11.09
36.5-37.4              5            5.88
37.5-38.4              0            2.76
Over 38.5              1            1.94

Total                511          511.01

column of table 3.4.1. To indicate how these are computed, let us check
the frequency 28.46 for the class whose limits are 25.5-26.4. First we
must take note of the fact that our computed means are discrete, since
they change by intervals of 0.1, whereas the normal distribution is con-
tinuous. No computed mean in our samples can have a value of, say,
25.469, although the normal distribution allows such values. This dis-
crepancy is handled by regarding any discrete mean as a grouping of all
continuous values to which it is nearest. Thus, the observed mean of
25.5 represents all continuous values lying between 25.45 and 25.55.
Similarly, the observed mean 26.4 represents the continuous values be-
tween 26.35 and 26.45. Hence for the class whose discrete limits are
25.5 and 26.4, we take the true class limits as 25.45 and 26.45. When
fitting a continuous theoretical distribution to an observed frequency
distribution, the true class limits must always be found in this way.
In order to use the normal table, we express the true limits in standard
measure. For X̄ = 25.45, μ = 30, σ_X̄ = 3.162, we have

    Z_1 = (X̄ - μ)/σ_X̄ = (25.45 - 30)/3.162 = -1.439

For X̄ = 26.45, we find Z_2 = -1.123. From table A 3 (p. 548) we read
the area of the normal curve between -1.123 and -1.439. By symmetry,
this is also the area between 1.123 and 1.439. Linear interpolation in the
table is required. The area from 0 to 1.43 is 0.4236 and from 0 to 1.44 is
0.4251. Hence, by linear interpolation, the area from 0 to 1.439 is

    (0.9)(0.4251) + (0.1)(0.4236) = 0.4250

Similarly, the area from 0 to 1.123 is 0.3693 so that the required area is
0.0557. Finally, since there are 511 means in the frequency distribution,
the theoretical frequency in this class is (511)(0.0557) = 28.46.
To summarize, the steps in fitting a normal distribution are: (i) Find
the true class limits. (ii) Express each limit in standard measure, getting
a series of values Z_1, Z_2, Z_3, .... (iii) From table A 3, read the areas
from 0 to Z_1, 0 to Z_2, 0 to Z_3, .... (iv) The theoretical probabilities in
the classes are the areas from -∞ to Z_1, from Z_1 to Z_2, from Z_2 to Z_3,
and so on, ending with the area from Z_h to +∞, where Z_h is the lower
limit of the highest class. The area from -∞ to Z_1 is 0.5 - (area from
0 to Z_1), and the area from Z_h to +∞ is 0.5 - (area from 0 to Z_h). The
intermediate areas are all found by subtraction as in the numerical illus-
tration. The only exception is the area that straddles the mean, say from
Z_k to Z_{k+1}. Here, Z_k will be negative and Z_{k+1} positive. In this case we
add the area from 0 to Z_k and that from 0 to Z_{k+1}. (v) Finally, multiply
each area by the total observed frequency.
If you have used the same class limits as in table 3.4.1 but have drawn
a different number of samples, say 200, multiply the theoretical frequencies
in table 3.4.1 by 200/511 to obtain your comparable theoretical fre-
quencies. If you used two-pound classes, as is advisable with a smaller
number of samples, add the theoretical frequencies in table 3.4.1 in ap-
propriate pairs and multiply by the relative sample sizes.
It is clear from table 3.4.1 that the observed frequencies are a good
fit to the theoretical frequencies.
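The fitting of a single class can also be verified by machine. The Python
sketch below (assuming the scipy library for the normal areas) repro-
duces the expected frequency of the class 25.5-26.4:

    from scipy.stats import norm

    mu, sd, N = 30.0, 3.162, 511
    lo, hi = 25.45, 26.45                    # true class limits
    area = norm.cdf(hi, mu, sd) - norm.cdf(lo, mu, sd)
    print(N * area)                          # close to 28.46, as in table 3.4.1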
3.5-Sampling distributions of s² and s. For each sample, calculate
s² by the shortcut formula,

    s² = {ΣX² - (ΣX)²/10}/9

Four values of s² are shown in table 3.3.1. Three of them overestimate
σ² = 100, while the fourth is notably small. Examine any of your samples
with unusual s² to learn what peculiarities of the sample are responsible.
The freakish sample 3 in the table has a range of only 39 - 30 = 9 pounds,
with not a single member less than μ. This sample gave the smallest s²
that appeared in our set of 511 values.
The distribution of s² in our 511 samples is displayed in table 3.5.1.
Notice its skewness, with bunching below the mean and a long tail above,
resembling the chi-square distribution of chapter 1, though less extreme.
Despite this, the mean of the values of s² is 101.5, closely approximating
the population variance, 100, and verifying the fact that s² is an unbiased
estimator of σ².
TABLE 3.5.1
OBSERVED AND THEORETICAL DISTRIBUTIONS OF 511 MEAN SQUARES s² OF NORMAL
SAMPLES WITH n = 10

Class mark     20    40    60    80   100   120   140   160   180   200   220   240   260   280   300   320   340

Observed       12    ..    92    ..    ..    73    42    29    16    11    ..     2     1     0     1     1     1

Theoretical  12.1  50.8  84.8  94.7  84.5  65.2    ..  29.6  18.4  10.1   6.1    ..    ..    ..    ..    ..    ..

Our distribution of s, shown in table 3.5.2, has a slight skewness
(not as large as that of s²) as well as a small bias, with mean 9.8 pounds,
slightly less than σ = 10 pounds. Even in samples as small as 10 the bias is
unimportant in a single estimate s.
TABLE 3.5.2
FREQUENCY DISTRIBUTION OF 511 SAMPLE STANDARD DEVIATIONS CORRESPONDING TO
THE MEAN SQUARES OF TABLE 3.5.1

Class mark    3   4   5   6   7   8   9  10  11  12  13  14  15  16  17  18

Frequency     1   2   9  18  58  77  80  71  79  44  41  17   8   3   2   1

The theoretical distribution of s². We have already mentioned that the
distribution of s² in normal samples is closely related to the chi-square
distribution. First, we give a general definition of the chi-square distribu-
tion. If Z_1, Z_2, ... Z_f are independently drawn random normal deviates,
the quantity

    χ² = Z_1² + Z_2² + ... + Z_f²

follows the chi-square distribution with f degrees of freedom. Thus, chi-
square with f degrees of freedom is defined as the distribution followed
by the sum of squares of f independent normal deviates. The form of this
distribution was worked out mathematically. It could, alternatively, be
examined by experimental sampling. By expressing the 100 gains in
table 3.2.1 in standard measure, we would have a set of normal deviates
from which we could draw samples of size f, computing χ² as defined
above for each sample. For more accurate work, there are tables of
random normal deviates (1), (2) that provide a basis for such samplings.
Table A 5 (p. 550) presents the percentage points of the χ² distribution. It
will be much used at various points in this book.
A second result from theory is that if s² is a mean square with f de-
grees of freedom, computed from a normal population that has variance
σ², then the quantity fs²/σ² follows the chi-square distribution with f de-
grees of freedom. This is an exact mathematical result. Since our sample
variances have (n - 1) df, the relation is

    χ² = (n - 1)s²/σ²
We cannot present a proof of this result, but a little algebra makes the
relation between s² and χ² clearer. Remember that (n - 1)s² is the sum
of squares of deviations, Σ(X - X̄)². Introduce μ as a working mean.
From the identity for working means (section 2.10) we have

    (n - 1)s²/σ² = (X_1 - μ)²/σ² + (X_2 - μ)²/σ² + ... + (X_n - μ)²/σ² - n(X̄ - μ)²/σ²

Now, the quantities (X_1 - μ)/σ, (X_2 - μ)/σ, ... (X_n - μ)/σ, are all in stan-
dard measure: in other words, they are random normal deviates. And
the quantity √n(X̄ - μ)/σ is another normal deviate, since the standard
deviation of X̄ is σ/√n. Hence we may write

    (n - 1)s²/σ² = Z_1² + Z_2² + ... + Z_n² - Z_{n+1}²

Thus, (n - 1)s²/σ² is the sum of squares of n normal deviates, minus the
square of one normal deviate, whereas χ², with (n - 1) df, is the sum of
the squares of (n - 1) normal deviates. It is not difficult to show mathe-
matically that these two distributions are the same in this case.
The theoretical frequencies for our 511 values of s² appear in the
last line of table 3.5.1. Again, the agreement with the observed frequen-
cies is good. For fitting this distribution, table A 5 is not very convenient.
We used the table in reference (3), which gives, for specified values of χ²,
the probability of exceeding the specified value.
From the definition of the chi-square distribution, we see that chi-
square with 1 degree of freedom is the distribution followed by the square
of a single normal deviate. Later (chapter 8) we shall show that the chi-
square test criterion which we encountered in chapter 1 when testing a
proportion is approximately distributed as the square of a normal
deviate.
Like the normal distribution, the theoretical distribution of chi-
square is continuous. Unlike the normal, χ², being a sum of squares,
cannot take negative values, so that the distribution extends from 0 to +∞,
whereas the normal, of course, extends from -∞ to +∞. An important
result from theory is that the mean value of χ² with f degrees of freedom is
exactly f. Since s² = χ²σ²/f, a consequence of this result is that the mean
value of s², in its theoretical distribution, is exactly σ². This verifies the
result mentioned in chapter 2 when we stated that s² is an unbiased
estimator of σ². The property that s² is unbiased does not require
normality, but only that the sample be a random sample.
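The relation χ² = (n - 1)s²/σ² is easily verified by experimental sampling
on a computer. A minimal sketch in Python, using the numpy library in
place of the pig-gain population:

    import numpy as np

    rng = np.random.default_rng(1)
    samples = rng.normal(30.0, 10.0, size=(5000, 10))   # 5,000 samples of n = 10
    s2 = samples.var(axis=1, ddof=1)                    # mean squares, 9 df each
    print(s2.mean())                # near 100: s^2 is an unbiased estimate of sigma^2
    print((9 * s2 / 100.0).mean())  # near 9, the mean of chi-square with 9 df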

3.6-Interval estimates of σ². With continuous populations, our at-
tention thus far has centered on the problem of estimating the population
mean from a sample. In studying the precision of measuring instruments
and in studying variability in populations, we face the problem of estimat-
ing the population variance σ² from a sample. If the population is
normal, the χ² table can be used to compute a confidence interval for σ²
from a sample value s².
The entries in the chi-square table (p. 550) are the values of χ² that
are exceeded with the probabilities stated at the heads of the columns.
For a 95% confidence interval, the relevant quantities are χ²_0.975, the
value of the chi-square exceeded with probability 0.975, and χ²_0.025, the
value of chi-square exceeded with probability 0.025. Hence, the prob-
ability that a value of χ² drawn at random lies between these two limits is
0.975 - 0.025 = 0.95. Since χ² = fs²/σ², the probability is 95% that
when our sample was drawn,

    χ²_0.975 ≤ fs²/σ² ≤ χ²_0.025

Multiplying through by σ², we have

    σ²χ²_0.975 ≤ fs² ≤ σ²χ²_0.025

The reader may verify that these inequalities are equivalent to the fol-
lowing,

    fs²/χ²_0.025 ≤ σ² ≤ fs²/χ²_0.975

This is the general formula for 95% confidence limits. With s² computed
from a sample of size n, we have f = (n - 1), and fs² is the sum of squares
of deviations, Σx². The simplest form for computing is, therefore,

    Σx²/χ²_0.025 ≤ σ² ≤ Σx²/χ²_0.975

As an illustration we shall set confidence limits on σ² for the popula-
tion of vitamin C concentrations sampled in section 2.4. For these data,
Σx² = 254, df = 16, s² = 15.88. From table A 5, χ²_0.975 = 6.91 and
χ²_0.025 = 28.8. Substituting,

    254/28.8 ≤ σ² ≤ 254/6.91,

that is,

    8.82 ≤ σ² ≤ 36.76,

gives the confidence interval for σ². Unless a 1-in-20 chance has occurred
in the sampling, σ² lies between 8.82 and 36.76. To obtain confidence
limits for σ, take the square roots of these limits. The limits for σ are 2.97
and 6.06 mg./100 gm. Note that s = 3.98 is not in the middle of the
interval, since the distribution of s is skew.
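The same limits follow from the percentage points of χ²; a Python sketch
(with the scipy library assumed) for the vitamin C example:

    from scipy.stats import chi2

    ss, df = 254.0, 16                  # sum of squares of deviations, and its df
    lower = ss / chi2.ppf(0.975, df)    # divide by the chi-square exceeded with prob. 0.025
    upper = ss / chi2.ppf(0.025, df)    # divide by the chi-square exceeded with prob. 0.975
    print(lower, upper)                 # about 8.8 and 36.8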
Large samples are necessary if σ is to be estimated accurately. For
illustration, assume that by an accurate estimate we mean one that is
known, with confidence probability 95%, to be correct to within ±10%.
If our estimate s is 100, the confidence limits for σ should be 90 and 110.
Consider a sample of size 101, giving 100 df in s². From the last line of
table A 5, with s² = 10,000, the 95% limits for σ² are 7,720 and 13,470,
so that those for σ are 87.9 and 116. Thus, even a sample of 101 does not
produce limits that are within 10% of the estimated value. For a sample
of size 30, with s = 100, the limits are 80 and 134. The estimate could be
in error by more than 20%.
The frequency distribution of s² is sensitive to non-normality in the
original population, and can be badly distorted by any gross errors that
occur in the sample. This effect of non-normality is discussed further in
section 3.15.
3.7-Test of a null hypothesis value of σ². Situations in which it is
necessary to test whether a sample value of s² is consistent with a postu-
lated population value of σ² are not too frequent in practice. This prob-
lem does arise, however, in some applications in which σ² has been ob-
tained from a very large sample and may be assumed known. In others,
in genetics for example, a value of σ² may be predicted from a theory that
is to be tested. The following examples indicate how the test is made.
Let the null hypothesis value of σ² be σ_0². Usually, the tests wanted
are one-tailed tests. When the alternative is σ² > σ_0², compute

    χ² = fs²/σ_0² = Σx²/σ_0²

This value is significant, at the 5% level, if it exceeds χ²_0.050 with f degrees
of freedom. Suppose that an investigator has used for years a stock of
inbred rats whose weights have σ_0 = 26 grams. He considers switching
to a cheaper source of supply of rats, except that he suspects that the new
rats will show greater variability. An experiment on 20 new rats gave
Σx² = 23,000, s = 35 grams, in line with his suspicions. As a check he
tests the null hypothesis: σ = 26 grams, against the alternative: σ > 26
grams.

    χ² = 23,000/(26)² = 34.02, df = 19

In table A 5, χ²_0.050 is 30.14, so that the null hypothesis is rejected.


To test H_A: σ² < σ_0², reject at the 5% level if χ² < χ²_0.950. To
illustrate, a standard method of performing an intricate chemical analysis
gives σ_0 = 4.9 parts per 1,000 for the content of some chemical con-
stituent. A refinement on the analysis, which may improve the precision
and cannot make it worse, gave s = 4.1, based on 49 df. We have

    χ² = (49)(4.1)²/(4.9)² = 34.3

Table A 5 gives χ²_0.950 = 34.76 for f = 50 and 26.51 for f = 40. Inter-
polating linearly, we find χ²_0.950 = 33.9 for f = 49.
Formally, the null hypothesis would not be rejected, though the sig-
nificance probability is very close to 5%.
If H_A is the two-sided alternative σ² ≠ σ_0², the region of rejection is
χ² < χ²_0.975 and χ² > χ²_0.025.
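In Python (scipy assumed), the rat-weight test above runs as follows;
the sketch also gives the exact significance probability, which the printed
table cannot:

    from scipy.stats import chi2

    ss, sigma0, df = 23000.0, 26.0, 19
    x2 = ss / sigma0**2                 # 34.02
    print(x2 > chi2.ppf(0.95, df))      # True: 34.02 exceeds the 5% point 30.14
    print(chi2.sf(x2, df))              # significance probability, about 0.02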
EXAMPLE 3.7.1-For the fitted normal distribution in table 3.4.1, verify the theoretical
frequencies (i) 1.94 for the class "Over 38.5" and (ii) 64.18 for the class "29.5-30.4."
EXAMPLE 3.7.2-If half the standard deviations in table 3.5.2 were expected to be
less than σ = 10 pounds, as would be true if s were symmetrically distributed about σ, cal-
culate χ² = 4.89, with 1 df, for the sample. The fact that χ² is significant is evidence against
a symmetrical distribution in the population.
EXAMPLE 3.7.3-In a sample of 61 patients, the amount of an anesthetic required to
produce anesthesia suitable for surgery was found to have a standard deviation (from patient
to patient) of s = 10.2 mg. Compute 90% confidence limits for σ. Ans. 8.9 and 12.0 mg.
Use χ²_0.950 and χ²_0.050.
EXAMPLE 3.7.4-With routine equipment like light bulbs, which wear out after a
time, the standard deviation of the length of life is an important factor in determining whether
it is cheaper to replace all the pieces at fixed intervals or to replace each piece individually
when it breaks down. For a certain gadget, an industrial statistician has calculated that it
will pay to replace at fixed intervals if σ < 6 days. A sample of 71 pieces gives s = 4.2 days.
Examine this question (i) by finding the upper 95% limit for σ from s, (ii) by testing the null
hypothesis σ = σ_0 = 6 days against the alternative σ < 6 days. Ans. (i) The upper 95%
limit is 5.0. (ii) H_0 is rejected at the 5% level. Notice that the two procedures are equivalent;
if the upper confidence limit had been 6.0 days, the chi-square value would be at the 5%
significance level.
EXAMPLE 3.7.5-For df greater than 100, which are not shown in table A 5, an ap-
proximation due to R. A. Fisher is that √(2χ²) is normally distributed with mean √(2f - 1)
and standard deviation 1. Check this approximation by finding the value that it gives for
χ²_0.025 when f = 100, the correct value being 129.56. Ans. 129.1.

3.8-The distribution of t. Returning to our experimental samples,
we are ready to examine the t-distribution for 9 degrees of freedom.
Since X̄ and s_X̄ have already been calculated for each of your samples of
10, the sample value of t may now be got by putting μ = 30, the formula
being

    t = (X̄ - 30)/s_X̄

Here, t will be positive or negative according as X̄ is greater or less than
30 pounds. In the present sampling the two signs are equally likely, so
you may expect about half of each. On account of this symmetry the
mean of all your t should be near zero.
The four samples in table 3.3.1 were selected to illustrate the manner
in which large, small, and intermediate values of t arise in sampling. A
small deviation, (X̄ - μ), or a large sample standard error tend to make t
small. Some striking combinations are put in the table, and you can
doubtless find others among your samples.
TABLE 3.8.1
SAMPLE AND THEORETICAL DISTRIBUTIONS OF t. SAMPLES OF 10,
DEGREES OF FREEDOM, 9

                                              Theoretical   Cumulative Percentage
  Interval of t       Sample     Percentage   Percentage    Frequency
From       To         Frequency  Frequency    Frequency     One Tail   Both Tails

-∞       -3.250            3        0.6          0.5         100.0
-3.250   -2.821            4        0.8          0.5          99.5
-2.821   -2.262            5        1.0          1.5          99.0
-2.262   -1.833           16        3.1          2.5          97.5
-1.833   -1.383           31        6.1          5.0          95.0
-1.383   -1.100           25        4.9          5.0          90.0
-1.100   -0.703           52       10.2         10.0          85.0
-0.703    0.0            132       25.8         25.0          75.0
 0.0      0.703          126       24.7         25.0          50.0       100.0
 0.703    1.100           41        8.0         10.0          25.0        50.0
 1.100    1.383           32        6.3          5.0          15.0        30.0
 1.383    1.833           18        3.5          5.0          10.0        20.0
 1.833    2.262           13        2.5          2.5           5.0        10.0
 2.262    2.821            8        1.6          1.5           2.5         5.0
 2.821    3.250            2        0.4          0.5           1.0         2.0
 3.250    +∞               3        0.6          0.5           0.5         1.0

Total                    511      100.0        100.0

The distribution of the laboratory sample of t is displayed in table
3.8.1. The class intervals in the present table are unequal, adjusted so as
to bring into prominence certain useful probabilities in the tails of the
distribution. The theoretical percentage frequencies are recorded for
comparison with those of the sample. The agreement is remarkably good.
In the last two columns are the cumulative percentage frequencies which
make the table convenient for confidence statements and tests of hy-
potheses. Examination of the table reveals that 2.5% of all t-values in
samples of 10 theoretically fall beyond 2.262, while another 2.5% of values
are smaller than -2.262. Combining these two tails of the distribution,
as shown in the last column, 5% of all t in samples of 10 lie further from
the center than |2.262|, which is therefore the 5% level of t. Make a dis-
tribution of your own sample t to be compared with the theoretical
distributions in the table.
Our t-table, table A 4, is a two-tailed table because most applications
of the t-distribution call for two-sided confidence limits and two-tailed
tests of significance. If you need a table that gives the probability for
specified values of t instead of t for specified probabilities, see (4).
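The theoretical tail percentages of table 3.8.1 can be read from a machine
routine instead of the printed table; a brief Python check (scipy assumed):

    from scipy.stats import t

    print(2 * t.sf(2.262, df=9))   # about 0.05: 5% of samples of 10 give |t| > 2.262
    print(2 * t.sf(1.833, df=9))   # about 0.10
    print(t.sf(2.262, df=9))       # about 0.025 in the upper tail alone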

3.9-The interval estimate of μ; the confidence interval. The theory of
the confidence interval may now be verified from your sampling. Each
sample specifies an interval, X̄ ± t_0.05·s_X̄, said to cover μ. In each of your
samples, substitute the estimators, X̄ and s_X̄, together with t_0.05 = 2.262,
the 0.05 level for 9 df. Finally, if you say, for any particular sample, that
the interval includes μ you will be either right or wrong; which it is may be
determined readily because you know that μ = 30 pounds. The theory
will be verified if about 95% of your statements are right and about 5%
wrong.
Table 3.3.1 (p. 69) gives the steps in computing confidence limits for
four samples. The intervals given by these four samples are, respectively,

    26.3 to 44.9
    20.5 to 38.1
    31.9 to 36.3
    11.5 to 26.7

Sample 1 warrants the statement that μ lies between 26.3 and 44.9
pounds, and we know that this interval does contain μ, as does likewise
the interval from sample 2. On the contrary, samples 3 and 4 illustrate
cases leading to false statements, one because of an unusually divergent
sample mean, the other because of a small sample standard deviation.
Sample 3 is particularly misleading: not only does it miss the mark, but
the narrow confidence interval suggests that we have an unusually ac-
curate estimate. Of the 511 laboratory samples, 486 resulted in correct
statements about μ; that is, 95.1% of the statements were true. The per-
centage of false statements, 4.9%, closely approximated the theoretical
5%. Always bear in mind the condition involved in every confidence
statement at the 5% level: it is right unless a 1-in-20 chance has occurred
in the sampling.
Practical applications of this theory are by people doing experiments
and other samplings without knowledge of the population parameters.
When they make confidence statements, they do not know whether they
are right or wrong; they know only the probability selected.
EXAMPLE 3.9.1-Using the sample frequencies of table 3.8.1, test the hypothesis
(known to be true) that the t-distribution is symmetrical in the sense that half of the popu-
lation frequency is greater than zero. Ans. χ² = 1.22.

EXAMPLE 3.9.2-From table 3.8.1, note that 3 + 4 + 5 + 8 + 2 + 3 = 25 samples
have |t| > 2.262. Test the hypothesis that 5% of the population values are greater than
|2.262|. Ans. χ² = 0.0124.

EXAMPLE 3.9.3-In table 3.8.1, accumulate the sample frequencies in both tails and
compare their percentage values with those in the last column of the table.

EXAMPLE 3.9.4-During the fall of 1943, approximately one in each 1,000 city
families of Iowa (cities are defined as having 2,500 inhabitants or more) was visited to learn
the number of quarts of food canned. The average for 300 families was 165 quarts with
standard deviation, 153 quarts. Calculate the 95% confidence limits. Ans. 165 ± 17 quarts.
EXAMPLE 3.9.5-The 1940 census reported 312,000 dwelling units (roughly the same
as families) in Iowa cities. From the statistics of the foregoing example, estimate the num-
ber of quarts of food canned in Iowa cities in 1943. Ans. 51,500,000 quarts with 95% con-
fidence limits, 46,200,000 and 56,800,000 quarts.

3.10-Use of frequency distributions for computing X̄ and s. In this
chapter we have used frequency distributions formed by grouping the
sample data into classes to give a picture of the way in which a variable is
distributed in a population. A frequency distribution also provides a
shortcut method of computing X̄ and s from a large sample. For this
calculation, at least 12 classes are advisable, and for highly accurate work,
at least 20 classes. The reason will be indicated presently.
After forming the classes and counting the frequency in each class,
write down the class mark (center of the class) for each class. Normally,
the class mark is found by noting the lower and the upper limits of the
class, and taking the average of these two values. For instance, with data
that are originally recorded to whole numbers, the class limits might be
0-9, 10-19, and so on. The class marks are 4.5, 14.5, and so on. Note
that the marks are not 5, 15, etc., as we might hastily conclude.
The assumptions made in the shortcut computation are that the
class mark is very close to the actual mean of the items in the class, and
that these items are approximately evenly distributed throughout the
class. These assumptions are likely to hold well in the high-frequency
classes near the middle of the distribution. Caution is necessary if there
are natural groupings in the scale of measurement. An instance was ob-
served where the number of seed compartments in tomatoes was the
variable, its values being confined to whole numbers and halves. How-
ever, halves occurred very infrequently. At first, the class intervals were
chosen to extend from 2 up to but not including 3, etc., the class marks
being written down as 2 1/2, 3 1/2, etc. Actually, the class means were
almost at the lower boundaries, 2, 3, etc. This systematic error led to an
overestimate of almost half a seed compartment in the mean. In this
situation the actual class means should be computed and used as the class
marks (see exercise 3.11.3).
The same problem can arise in the extreme classes in a frequency
distribution. To revert to the example with intervals 0-9, 10-19, etc. and
class marks taken as 4.5, 14.5, etc., we might notice that the lowest class
contained six 0's, one 2, and one 6, so that the class mean is actually 1.0,
whereas the class mark is 4.5. For accurate work the class mark for this
class is taken as 1.0.
In the shortcut computation of X̄ and s, each item in the sample is
replaced by the class mark for the class in which it lies. All values be-
tween 10 and 19 in the previous example are replaced by 14.5. The process
is exactly the same as that of rounding to the nearest whole number, or
the nearest 100. This rounding introduces an additional error into the
data. The argument for having a relatively large number of classes is to
keep this error small.

The remainder of this section discusses how much accuracy is lost
owing to this rounding error. Let X represent any item in the sample and
let X' be the corresponding class mark or rounded value. Then we may
write

    X' = X + e

where e is the rounding error. If I is the width of the class interval, the
values e are assumed to be roughly evenly distributed over the range from
-I/2 to +I/2. An important result from theory is that the variance of
the sum of two independent variables is the sum of their variances. This
gives

    σ_X'² = σ_X² + σ_e²

If e is uniformly distributed between -I/2 and +I/2, it is known from
theory that its variance is I²/12. Hence,

    σ_X'² = σ_X² + I²/12 = σ² + I²/12,

since σ_X² is the original population variance σ².
Consequently, when a value X is replaced by the corresponding class
mark X', the variance is increased by I²/12 due to the rounding. The rela-
tive increase in variance is I²/12σ². We would like this increase to be
small.
Suppose that there are 12 classes in the frequency distribution. If
the distribution is not far from normal, nearly all the frequency lies within
a distance ±3σ from μ. Since these classes cover a range of 6σ, I will
be roughly 6σ/12 = σ/2. Thus the relative increase in the variance of
X due to grouping is about 1/48, or 2%. A further analysis, not presented
here, shows that the computed s² has a variance about 4% larger than
that of the original s² (5). For ordinary work these small losses in ac-
curacy to save time in computation are tolerable. For accurate work, the
advice commonly given is that I should not exceed σ/4. This requires
about 24 classes to cover the frequency distribution when the sample is
large.
With a discrete variable, there is often no rounding and no loss of
accuracy in using a frequency distribution to compute the sample mean
and variance. For instance, in a study of accidents per week, the number
of accidents might range only from 0 to 5. The six classes 0, 1, 2, 3, 4, 5
give a complete representation of the sample data without any rounding.

3.11-Computation of X̄ and s in large samples: example. The data
in table 3.11.1 come from a sample of 533 weights of swine, arranged in
22 classes. The steps in the calculation of X̄ and s are given under the table.
A further simplification comes from coding the class marks, as shown
in the third column. Place the 0 on the coded scale at or near the class
mark that has the highest frequency. We chose this origin at G = 170
pounds. The classes above this class are coded as 1, 2, 3, etc.; those
TABLE 3.11.1
FREQUENCY DISTRIBUTION OF LIVE WEIGHTS OF 533 SWINE. COMPUTATION OF MEAN
AND STANDARD DEVIATION. I = 10 POUNDS, G = 170 POUNDS

Class Mark,    Frequency   Code Numbers   Code Numbers   Sum of Squares
Pounds             f             U              fU             fU²

  80               1            -9             -9               81
  90               0            -8              0                0
 100               0            -7              0                0
 110               7            -6            -42              252
 120              18            -5            -90              450
 130              21            -4            -84              336
 140              22            -3            -66              198
 150              44            -2            -88              176
 160              67            -1            -67               67
 170              76             0              0                0
 180              55             1             55               55
 190              57             2            114              228
 200              47             3            141              423
 210              33             4            132              528
 220              30             5            150              750
 230              23             6            138              828
 240              11             7             77              539
 250               5             8             40              320
 260               5             9             45              405
 270               4            10             40              400
 280               5            11             55              605
 290               2            12             24              288

n = 533                                  ΣfU = 565      ΣfU² = 6,929

    ΣfU = 565                            ΣfU² = 6,929
    (ΣfU)²/n = (565)²/533 = 598.92
    Ū = ΣfU/n = 565/533 = 1.0600         Σu² = 6,929 - 598.92 = 6,330.08
    IŪ = 10.6 pounds                     s_U² = Σu²/(n - 1) = 11.8986
    X̄ = G + IŪ                           s_U = 3.45
      = 170 + 10.6                       s = s_X = I·s_U = (10)(3.45)
      = 180.6 pounds                       = 34.5 pounds

below as -1, -2, -3, etc. It is important to know the relation between
your original and your coded class marks. If X (dropping the prime) is an
original class mark and U is its coded value, this relation is

    X = G + IU

where I is the width of the class interval (10 pounds in this example). To
verify the rule, when U is -5, what is X? We have, X = 170 + (10)(-5)
= 120, as appears in column 1.
In the computations we first find the sample mean and variance of
U, namely Ū and s_U. From the above relation we get

    X̄ = G + IŪ
and s = s_X = I·s_U

With these relations the steps given under table 3.11.1 are easily fol-
lowed. With a computing machine the individual values fU² need not be
written down. Their sum can be found by taking the sum of products of
the column U with the column fU. The individual values fU are required;
pay attention to their signs when adding them.
Note that s is 3.45 times the class interval I, so that the loss of ac-
curacy due to the use of class marks is trivial.
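The whole computation of table 3.11.1 takes only a few lines on a com-
puter. The Python sketch below repeats the coded calculation from the
frequencies of the table and recovers X̄ = 180.6 and s = 34.5 pounds:

    import math

    marks = list(range(80, 300, 10))     # class marks 80, 90, ..., 290
    freqs = [1, 0, 0, 7, 18, 21, 22, 44, 67, 76, 55,
             57, 47, 33, 30, 23, 11, 5, 5, 4, 5, 2]
    G, I = 170, 10
    U = [(m - G) // I for m in marks]    # coded class marks
    n = sum(freqs)                                            # 533
    sum_fU  = sum(f * u for f, u in zip(freqs, U))            # 565
    sum_fU2 = sum(f * u * u for f, u in zip(freqs, U))        # 6,929
    xbar = G + I * sum_fU / n                                 # 180.6 pounds
    s = I * math.sqrt((sum_fU2 - sum_fU**2 / n) / (n - 1))    # 34.5 pounds
    print(round(xbar, 1), round(s, 1))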

Sheppard's Correction. From the theory presented in the previous sec-
tion, a consequence is that s², as computed in table 3.11.1, is an estimate
of σ² + I²/12, rather than of σ² itself. A correction introduced by
W. F. Sheppard (6) is to subtract I²/12 from the value of s², in order to
obtain a more nearly unbiased estimate of σ². In this example, with
s² = 1,189.86, the correction amounts to only 100/12, or 8.33. The cor-
rected value of s is 34.4 as against our computed 34.5. The correction is
seldom substantial. The corrected value should not be used in a test of
significance (7).

EXAMPLE 3.11.1-The data show the frequency distribution of the heights of 8,585
men, arranged in ten 2-in. classes. The number of classes is too small for accurate work, but
gives an easy exercise. Compute X̄ and s, using a convenient coding. Ans. X̄ = 67.53 in.,
s = 2.62 in.

Class Mark (in.)   Frequency      Class Mark (in.)   Frequency

      58                6               68             2,559
      60               55               70             1,709
      62              252               72               594
      64            1,063               74               111
      66            2,213               76                23

EXAMPLE 3.11.2-Apply Sheppard's correction and report the corrected s. Ans.
2.56 in.

EXAMPLE 3.11.3-This baby example illustrates how the accuracy of the shortcut
method improves when the class marks are the means of the items in the classes. The original
data consist of the fourteen values: 0, 0, 10, 12, 14, 16, 20, 22, 24, 25, 29, 32, 34, 49. (i) Com-
pute X̄ and s directly from these data. (ii) Form a frequency distribution with classes 0-9,
10-19, 20-29, 30-39, and 40-49. Compute X̄ and s from the conventional class marks,
4.5, 14.5, 24.5, 34.5, and 44.5. (iii) In the same frequency distribution, find the actual means
of the items in each class, and use these means as the class marks. (Coding doesn't help here.)
Ans. (i) X̄ = 20.5, s = 13.4. (ii) X̄ = 21.6, s = 11.4, both quite inaccurate. (iii) X̄ = 20.5,
s = 13.1. Despite the rounding errors that contribute to this s, it is smaller than the original
s in (i). This is an effect of sampling error in this small sample.
EXAMPLE 3.11.4-The yields in grams of 1,499 rows of wheat are recorded by Wiebe
(9). They have been tabulated as follows:

Class Mark  Frequency | Class Mark  Frequency | Class Mark  Frequency

    400         16    |     625        140    |     850          7
    425         41    |     650        122    |     875          4
    450         99    |     675         94    |     900          4
    475         97    |     700         64    |     925          2
    500        118    |     725         49    |     950          3
    525        138    |     750         31    |     975          1
    550        146    |     775         26    |   1,000          1
    575        136    |     800         20    |
    600        127    |     825         13    |   Total      1,499

Compute X̄ = 587.74 grams and s = 100.55 grams. Are there enough classes in this
distribution?

3.12-Tests of normality. Since many of the standard statistical techniques
are based on the assumption of normality, methods for judging the
normality of a set of data are of interest. In this and in the following
sections, three tests will be illustrated from the frequency distribution of
means of samples of 100 drawn from the population of city sizes in section
2.12 (p. 51). The histogram of this frequency distribution, shown in the
bottom part of figure 2.12.2, p. 55, gave the impression that a normal
distribution would not be a good fit. We can now verify this impression
in a quantitative manner.
In the first test, often called the χ² goodness of fit test, the data are
grouped into classes to form a frequency distribution and the sample
mean X̄ and standard deviation s are computed. From these values, a
normal distribution is fitted and the expected frequencies in each class
are obtained as described in section 3.4 (p. 70). Table 3.12.1 presents the
observed frequencies fᵢ and the expected frequencies Fᵢ.
For each class, compute and record the quantity

(fᵢ - Fᵢ)²/Fᵢ = (Obs. - Exp.)²/Exp.

The test criterion is

χ² = Σ(fᵢ - Fᵢ)²/Fᵢ

summed over the classes. If the data actually come from a normal distribution,
this quantity follows approximately the theoretical χ² distribution
with (k - 3) d.f., where k is the number of classes used in computing
χ². If the data come from some other distribution, the observed fᵢ
will tend to agree poorly with the values of Fᵢ that are expected on the
assumption of normality, and the computed χ² becomes large. Consequently,
large values of χ² cause rejection of the hypothesis of normality.
TABLE 3.12.1
CALCULATION OF THE GOODNESS OF FIT χ² FOR THE DISTRIBUTION OF MEANS OF
SAMPLES OF 100 CITY SIZES

                         Frequencies
Class Limits          Obs.        Exp.
 (1,000's)             fᵢ          Fᵢ        (fᵢ - Fᵢ)²/Fᵢ

Under 130               9         20.30          6.29
 130-139               35         30.80          0.57
 140-149               68         55.70          2.72
 150-159               94         80.65          2.21
 160-169               90         93.55          0.13
 170-179               76         87.00          1.39
 180-189               62         64.80          0.12
 190-199               28         38.70          2.96
 200-209               27         18.55          3.85
 210-219                4          7.10          1.35
 220-229                5          2.20 }
 230-239                1          0.50 }        6.04
 240-                   1          0.15 }

Total                 500        500.00         27.63

χ² = 27.63,  d.f. = 11 - 3 = 8,  P < 0.005

The theorem that this quantity follows the theoretical distribution of
χ² when the null hypothesis holds and that the degrees of freedom are
(k - 3) requires advanced methods of proof. The subtracted number 3 in
the d.f. may be thought of as the number of ways in which the observed
and expected frequencies have been forced to agree in the process of
fitting the normal distribution. The numbers fᵢ and Fᵢ both add to 500,
and the sets agree in the values of X̄ and s that they give.
The theorem also requires that the expected numbers not be too
small. Small expectations are likely to occur only in the extreme classes.
A working rule (10) is that the two extreme expectations may each be as
low as 1, provided that most of the other expected values exceed 5. In
table 3.12.1, small expectations occur in the three highest classes. In this
event, classes are combined to give an expectation of at least one. The
three highest classes give a combined fᵢ of 7 and Fᵢ of 2.85. The contribution
to χ² is (4.15)²/2.85 = 6.04.
For these data, k = 11 after combination, so that χ² = 27.63 has 8 d.f.
Reference to table A 5 shows that the hypothesis of normality is rejected
at the 0.5% level, the most extreme level given in this table.
The χ² test may be described as a non-specific test, in that the test
criterion is directed against no particular type of departure from normality.
Examples occur in which the data are noticeably skew, although
the χ² test does not reject the null hypothesis. An alternative test that is
designed to detect skewness is often used as a supplement to the χ² test.
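For those who wish to repeat the test by machine, a minimal Python sketch follows (an illustrative addition; the observed and expected frequencies are those of table 3.12.1 after combining the three highest classes, and the scipy library is assumed to be available for the tail probability):

    # Chi-square goodness of fit test for normality (table 3.12.1).
    from scipy.stats import chi2

    obs = [9, 35, 68, 94, 90, 76, 62, 28, 27, 4, 7]           # f_i, k = 11 classes
    exp = [20.30, 30.80, 55.70, 80.65, 93.55, 87.00,
           64.80, 38.70, 18.55, 7.10, 2.85]                   # F_i, sum = 500

    x2 = sum((f - F) ** 2 / F for f, F in zip(obs, exp))      # 27.63
    df = len(obs) - 3    # 3 constraints: the total and the fitted mean and s
    p = chi2.sf(x2, df)  # P < 0.005: the hypothesis of normality is rejected
    print(round(x2, 2), df, p)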
3.13-A test of skewness. A measure of the amount of skewness in a
population is given by the average value of (X - μ)³, taken over the
population. This quantity is called the third moment about the mean. If
low values of X are bunched close to the mean μ but high values extend
far above the mean, this measure will be positive, since the large positive
contributions (X - μ)³ when X exceeds μ will predominate over the
smaller negative contributions (X - μ)³ obtained when X is less than μ.
Populations with negative skewness, in which the lower tail is the extended
one, are also encountered. To render this measure independent
of the scale on which the data are recorded, it is divided by σ³. The resulting
coefficient of skewness is denoted sometimes by √β₁ and sometimes
by γ₁.
The sample estimate of this coefficient is denoted by √b₁ or g₁. We
compute

m₃ = Σ(X - X̄)³/n
m₂ = Σ(X - X̄)²/n

and take

√b₁ = g₁ = m₃/(m₂√m₂)

Note that in computing m₂, the sample variance, we have divided by n
instead of our customary (n - 1). This makes subsequent calculations
slightly easier.
The calculations are illustrated for the means of city sizes in table
3.13.1. Coding is worthwhile. Since √b₁ is dimensionless, the whole
calculation can be done in the coded scale, with no need to decode. Having
chosen coded values U, write down their squares and cubes (paying
attention to signs). The U⁴ values are not needed in this section. Form
the sums of products with the f's as indicated, and divide each sum by n
to give the quantities h₁, h₂, h₃. Carry two extra decimal places in the
h's. The moments m₂ and m₃ are then obtained from the algebraic
identities given under the table. Finally, we obtain √b₁ = 0.4707.
If the sample comes from a normal population, √b₁ is approximately
normally distributed with mean zero and S.D. √(6/n), or in this case
√(6/500) = 0.110. Since √b₁ is over 4 times its S.D., the positive skewness
is confirmed. The assumption that √b₁ is normally distributed is accurate
enough for this test if n exceeds 150. For sample sizes between 25
and 200, the one-tailed 5% and 1% significance levels of √b₁, computed
from a more accurate approximation, are given in table A 6.
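Since the whole computation stays in the coded scale, it is short enough to program directly. A minimal Python sketch follows (an illustrative addition reproducing the h's and moments of table 3.13.1):

    # Test of skewness from a coded frequency distribution (table 3.13.1).
    from math import sqrt

    U = list(range(-4, 9))                                    # coded class marks
    f = [9, 35, 68, 94, 90, 76, 62, 28, 27, 4, 5, 1, 1]       # frequencies
    n = sum(f)                                                # 500

    h1 = sum(fi * u for fi, u in zip(f, U)) / n               # +0.172
    h2 = sum(fi * u ** 2 for fi, u in zip(f, U)) / n          # 4.452
    h3 = sum(fi * u ** 3 for fi, u in zip(f, U)) / n          # +6.664

    m2 = h2 - h1 ** 2                                         # 4.4224
    m3 = h3 - 3 * h1 * h2 + 2 * h1 ** 3                       # 4.3770
    g1 = m3 / (m2 * sqrt(m2))                                 # 0.4707

    se = sqrt(6 / n)                # 0.110; g1/se is over 4: skewness confirmed
    print(g1, g1 / se)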

3.14-Tests for kurtosis. A further type of departure from normality
is called kurtosis. In a population, a measure of kurtosis is the average
value of (X - μ)⁴, divided by σ⁴. For the normal distribution, this ratio
has the value 3. If the ratio exceeds 3, there is usually an excess of values
near the mean and far from it, with a corresponding depletion of the flanks
of the distribution curve. This is the manner in which the t-distribution
TABLE 3.13.1
COMPUTATIONS FOR TESTS OF SKEWNESS AND KURTOSIS

Lower Class
   Limit         f        U        U²        U³        U⁴

   120-          9       -4        16       -64       256
   130-         35       -3         9       -27        81
   140-         68       -2         4        -8        16
   150-         94       -1         1        -1         1
   160-         90        0         0         0         0
   170-         76        1         1         1         1
   180-         62        2         4         8        16
   190-         28        3         9        27        81
   200-         27        4        16        64       256
   210-          4        5        25       125       625
   220-          5        6        36       216     1,296
   230-          1        7        49       343     2,401
   240-          1        8        64       512     4,096

n = 500                        Test of skewness

ΣfU  =    +86        h₁ = ΣfU/n  = +0.172
ΣfU² =  2,226        h₂ = ΣfU²/n =  4.452
ΣfU³ = +3,332        h₃ = ΣfU³/n = +6.664
m₂ = h₂ - h₁² = 4.4224
m₃ = h₃ - 3h₁h₂ + 2h₁³ = 4.3770
√b₁ = m₃/(m₂√m₂) = 4.3770/(4.4224)√4.4224 = 0.4707

                               Test of kurtosis
ΣfU⁴ = 32,046        h₄ = ΣfU⁴/n = 64.092
m₄ = h₄ - 4h₁h₃ + 6h₁²h₂ - 3h₁⁴ = 60.2948
b₂ = m₄/m₂² = 60.2948/(4.4224)² = 3.083

departs from the normal. Ratios less than 3 result from curves that have
a flatter top than the normal.
A sample estimate of the amount of kurtosis is given by

g₂ = b₂ - 3 = (m₄/m₂²) - 3,

where

m₄ = Σ(X - X̄)⁴/n

is the fourth moment of the sample about its mean. Notice that the normal
distribution value 3 has been subtracted, with the result that peaked
distributions show positive kurtosis and flat-topped distributions show
negative kurtosis.
The shortcut computation of m₄ and b₂ from the coded values U is
shown under table 3.13.1. For this sample, g₂ = b₂ - 3 has the value
+0.083. In very large samples from the normal distribution, g₂ is normally
distributed with mean 0 and S.D. √(24/n) = 0.219, since n is
500. The sample value of g₂ is much smaller than its standard error, so
that the amount of kurtosis in the population appears to be trivial.
Unfortunately, the distribution of g₂ does not approach the normal
closely until the sample size is over 1,000. For sample sizes between 200
and 1,000, table A 6 contains better approximations to the 5% and 1%
significance levels. Since the distribution of g₂ is skew, the two tails are
shown separately. For n = 500, the upper 5% value of g₂ is +0.37, much
greater than the value 0.083 found in this sample.
For sample sizes less than 200, no tables of the significance levels of
g₂ are at present available. R. C. Geary (11) developed an alternative
test criterion for kurtosis,

a = (mean deviation)/(standard deviation) = Σ|X - X̄|/(n√m₂)

and tabulated its significance levels for sample sizes down to n = 11. If
X is a normal deviate, the value of a when computed for the whole population
is 0.7979. Positive kurtosis produces higher values, and negative
kurtosis lower values, of a. When applied to the same data, a and g₂
usually agree well in their verdicts. The advantages of a are that tables
are available for smaller sample sizes and that a is easier to compute.
An identity simplifies the calculation of the numerator of a. This will
be illustrated for the coded scale in table 3.13.1. Let

Σ′ = sum of all observations that exceed Ū
n′ = number of observations that exceed Ū

Then

Σ|U - Ū| = 2(Σ′ - n′Ū)

Since Ū = 0.172, all observations in the classes with U = 1 or more
exceed Ū. This gives Σ′ = 457, n′ = 204. Hence,

Σ|U - Ū| = 2{457 - (204)(0.172)} = 843.82

Since m₂ = 4.4224, we have

a = (843.82)/{(500)√4.4224} = 0.802

This is little greater than the value 0.7979 for the normal distribution,
in agreement with the result given by g₂. For n = 500 the upper 5% level
of a is about 0.814.
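The kurtosis computations may be sketched in the same style (an illustrative Python addition; it continues from the coded data of table 3.13.1):

    # Tests of kurtosis: g2 from the fourth moment, and Geary's a.
    from math import sqrt

    U = list(range(-4, 9))
    f = [9, 35, 68, 94, 90, 76, 62, 28, 27, 4, 5, 1, 1]
    n = sum(f)                                                # 500

    h1 = sum(fi * u for fi, u in zip(f, U)) / n               # +0.172
    h2 = sum(fi * u ** 2 for fi, u in zip(f, U)) / n          # 4.452
    h3 = sum(fi * u ** 3 for fi, u in zip(f, U)) / n          # +6.664
    h4 = sum(fi * u ** 4 for fi, u in zip(f, U)) / n          # 64.092

    m2 = h2 - h1 ** 2                                         # 4.4224
    m4 = h4 - 4 * h1 * h3 + 6 * h1 ** 2 * h2 - 3 * h1 ** 4    # 60.2948
    g2 = m4 / m2 ** 2 - 3                                     # +0.083
    se_g2 = sqrt(24 / n)                                      # 0.219

    # Geary's a: mean deviation over standard deviation. The direct sum
    # below equals the identity 2(sum' - n'*Ubar) used in the text.
    mean_dev = sum(fi * abs(u - h1) for fi, u in zip(f, U)) / n   # 843.82/500
    a = mean_dev / sqrt(m2)                                   # 0.802
    print(g2, g2 / se_g2, a)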
3.15-Effects of skewness and kurtosis. In samples from non-normal
populations, the quantities g₁ and g₂ are useful as estimates of the corresponding
population values γ₁ and γ₂, which characterize the common
types of non-normality. K. Pearson produced a family of theoretical non-normal
curves intended to simulate the shapes of frequency distributions
having any specified values of γ₁ and γ₂, provided that the non-normality
was not too extreme.
The quantities γ₁ and γ₂ have also been useful in studying the distributions
of X̄ and s² when the original population is non-normal. Two
results will be quoted. For the distribution of X̄ in random samples of
size n,
γ₁(X̄) = γ₁/√n          γ₂(X̄) = γ₂/n

Thus, in the distribution of X̄, the measures of skewness and kurtosis
both go to zero when the sample size increases, as would be expected from
the Central Limit Theorem. Since the kurtosis is damped much faster
than the skewness, it is not surprising that in our sample means g₁ was
substantial but g₂ small.
Secondly, the exact variance of s² with f degrees of freedom is known
to be

V(s²) = (2σ⁴/f){1 + (f/(f + 1))(γ₂/2)}

The factor outside the brackets is the variance of s² in samples from a
normal population. The term inside the brackets is the factor by which
the normal variance is multiplied when the population is non-normal.
For example, if the measure of kurtosis, γ₂, is 1, the variance of s² is
about 1.5 times as large as it is in a normal population. With γ₂ = 2, the
variance of s² is about twice as large as in a normal population. These
results show that the distribution of s² is sensitive to amounts of kurtosis
that may pass unnoticed in handling the data.
EXAMPLE 3.15.1-In table 3.2.2, compute g₁ = -0.0139 and g₂ = 0.0460, showing
that the distribution is practically normal in these respects.
EXAMPLE 3.15.2-In table 3.5.2 is the sampling distribution of 511 standard deviations.
Calculate g₁ = 0.3074 with standard error 0.108. As expected, this indicates that
the distribution is positively skew.
EXAMPLE 3.15.3-The 511 values of t discussed in section 3.8 were distributed as follows:

Class Mark    f  | Class Mark    f  | Class Mark    f  | Class Mark    f

  -3.13       3  |   -1.13      29  |    0.87      31  |    2.87       2
  -2.88       5  |   -0.88      35  |    1.12      23  |    3.12       1
  -2.63       1  |   -0.63      38  |    1.37      17  |    3.37       2
  -2.38       3  |   -0.38      40  |    1.62      11  |    3.62       0
  -2.13       6  |   -0.13      52  |    1.87       7  |    3.87       0
  -1.88      12  |    0.12      57  |    2.12      10  |    4.12       0
  -1.63      21  |    0.37      43  |    2.37       6  |    4.37       1
  -1.38      16  |    0.62      37  |    2.62       2  |  Total      511
The highly significant value of g₂ = 0.5340 shows that the frequencies near the mode and
in the tails are greater than in the normal distribution, those in the flanks being less. This
was expected. But g₁ = 0.1356 is non-significant, which is also expected because the theoretical
distribution of t is symmetrical.

REFERENCES
1. RAND CORPORATION. A Million Random Digits With 100,000 Normal Deviates. Free
   Press, Glencoe, Ill. (1955).
2. P. C. MAHALANOBIS, et al. Sankhya, 1:1 (1934).
3. E. S. PEARSON and H. O. HARTLEY. Biometrika Tables for Statisticians, Vol. I. Cambridge
   University Press (1954).
4. N. V. SMIRNOV. Tables for the Distribution and Density Functions of t-distribution.
   Pergamon Press, New York (1961).
5. R. A. FISHER. Phil. Trans., A, 222:309 (1921).
6. W. F. SHEPPARD. Proc. Lond. Math. Soc., 29:353 (1898).
7. R. A. FISHER. Statistical Methods for Research Workers, 13th ed. Oliver and Boyd,
   Edinburgh (1958).
8. E. W. LINDSTROM. Amer. Nat., 49:311 (1935).
9. G. A. WIEBE. J. Agric. Res., 50:331 (1935).
10. W. G. COCHRAN. Biometrics, 10:420 (1954).
11. R. C. GEARY. Biometrika, 28:295 (1936).
CHAPTER FOUR

The comparison of two samples

4.1-Estimates and tests of differences. Investigations are often designed
to discover and evaluate differences between effects rather than the
effects themselves. It is the difference between the amounts learned under
two methods of teaching, the difference between the lengths of life of two
types of glassware, or the difference between the degrees of relief reported
from two pain-relieving drugs that is wanted. In this chapter we consider
the simplest investigation of this type, in which two groups or two procedures
are compared. In experimentation, these procedures are often
called the treatments. Such a study may be conducted in two ways.

Paired samples. Pairs of similar individuals or things are selected. One
treatment is applied to one member of each pair, the other treatment to
the second member. The members of a pair may be two students of
similar ability; two patients of the same age and sex who have just undergone
the same type of operation; or two male mice from the same litter.
A common application occurs in self-pairing, in which a single individual
is measured on two occasions. For example, the blood pressure of a subject
might be measured before and after heavy exercise. For any pair, the
difference between the measurements given by the two members is an
estimate of the difference in the effects of the two treatments or procedures.
With only a single pair it is impossible to say whether the difference
in behavior is to be attributed to the difference in treatment, to the natural
variability of the individuals, or partly to both. There must be a number
of pairs. The data to be analyzed consist of a sample of n differences in
measurement.

Independent samples. This case, which is commoner, arises whenever we
wish to compare the means of two populations and have drawn a sample
from each quite independently. We might have a sample of men aged
50-55 and one of men aged 30-35, in order to compare the amounts
spent on life insurance. Or we might have a sample of high school seniors
from rural schools and one from urban schools, in order to compare
their knowledge of current affairs as judged by a special examination on
this subject. Independent samples are widely used in experimentation
when no suitable basis for pairing exists, as, for example, in comparing
the lengths of life of two types of drinking glass under the ordinary condi-
tions of restaurant use.
4.2-A simulated paired experiment. Eight pairs of deviates were
drawn from a table of random normal deviates. The first
member of each pair represents the result produced by a Standard procedure,
while the second member is the result produced by a New procedure
that is being compared with the Standard. The eight differences,
New - St., are shown in the column headed Case I in table 4.2.1.

TABLE 4.2.1
A SIMULATED PAIRED EXPERIMENT

             CASE I           CASE II          CASE III
Pair      New-St. (D₁)     New-St. (D₂)     New-St. (D₃)

  1          +3.2             +13.2             +4.2
  2          -1.7             + 8.3             -0.7
  3          +0.8             +10.8             +1.8
  4          -0.3             + 9.7             +0.7
  5          +0.5             +10.5             +1.5
  6          +1.2             +11.2             +2.2
  7          -1.1             + 8.9             -0.1
  8          -0.4             + 9.6             +0.6

Mean (D̄)    +0.28            +10.28            +1.28
sD           1.527             1.527             1.527
sD̄           0.540             0.540             0.540

Since the results for the New and Standard procedures were drawn
from the same normal population, Case I simulates a situation in which
there is no difference in effect between the two procedures. The observed
differences represent the natural variability that is always present in experiments.
It is obvious on inspection that the eight differences do not
indicate any superiority of the New procedure. Four of the differences are
+ and 4 are -, and the mean difference is small.
The results in Case II were obtained from those in Case I by adding
+10 to every figure, to represent a situation in which the New procedure
is actually 10 units better than the Standard. On looking at the data, most
investigators would reach the judgment that the superiority of the New
procedure is definitely established, and would probably conclude that
the average advantage in favor of it is not far from 10 units.
Case III is more puzzling. We added +1 to every figure in Case I,
so that the New procedure gives a small gain over the Standard. The New
procedure wins 6 times out of the 8 trials, and some workers might conclude
that the results confirm the superiority of the New procedure.
Others might disagree. They might point out that it is not too unusual
for a fair coin to show heads in 6 tosses out of 8, and that the individual
results range from an advantage of 0.7 units for the Standard to an advantage
of 4.2 units for the New procedure. They would argue that the
results are inconclusive. We shall see what verdicts are suggested by the
statistical analyses in these three cases.
The data also illustrate the assumptions made in the analysis of a
paired trial. The differences Dᵢ in the individual pairs are assumed to be
distributed about a mean μD, which represents the average difference in
the effects of the two treatments over the population of which these pairs
are a random sample. The deviations Dᵢ - μD may be due to various
causes, in particular to inherent differences between the members of the
pair and to any errors of measurement to which the measuring instruments
are subject. Another source of this variation is that a treatment may
actually have different effects on different members of the population. A
lotion for the relief of muscular pains may be more successful with some
types of pain than with others. The adage "One man's meat is another
man's poison" expresses this variability in extreme form. For many applications
it is important to study the extent to which the effect of a treatment
varies from one member of the population to another. This requires
a more elaborate analysis, and usually a more complex experiment,
than we are discussing at present. In the simple paired trial we compare
only the average effects of the two treatments or procedures over the
population.
In the analysis, the deviations Dᵢ - μD are assumed to be normally
and independently distributed with population mean zero. The consequences
of failures in these assumptions are discussed in chapter 11.
When these assumptions hold, the sample mean difference D̄ is
normally distributed about μD with standard deviation, or standard error,
σD/√n, where σD is the S.D. of the population of differences. The value
of σD is seldom known, but the sample furnishes an estimate

sD = √{Σ(Dᵢ - D̄)²/(n - 1)} = √{(ΣDᵢ² - (ΣDᵢ)²/n)/(n - 1)}

Hence, sD̄ = sD/√n is an estimate of σD̄, based on (n - 1) d.f.
The important consequence of these results is that the quantity

t = (D̄ - μD)/sD̄

follows Student's t-distribution with (n - 1) d.f., where n is the number of
pairs. The t-distribution may be used to test the null hypothesis that
μD = 0, or to compute a confidence interval for μD.

Test of significance. The test will be applied first to the doubtful Case
III. The values of sD and sD̄ are shown at the foot of table 4.2.1. Note
that these are exactly the same in all three cases, since the addition of a
constant to all the Dᵢ does not affect the deviations (Dᵢ - D̄). For
Case III we have

t = D̄/sD̄ = 1.28/0.540 = 2.370

With 7 d.f., table A 4 shows that the 5% level of t in a two-tailed test is
2.365. The observed mean difference just reaches the 5% level, so that
the data point to a superiority of the new treatment.
In Case II, t = 10.28/0.540 = 19.04. This value lies far beyond even
the 0.1% level (5.405) in table A 4. We might report: "P < 0.001."
In Case I, t = 0.28/0.540 = 0.519. From table A 4, an absolute
value of t = 0.711 is exceeded 50% of the time in sampling from a population
with μD = 0. The test provides no evidence on which to reject the
null hypothesis in Case I. To sum up, the tests confirm the judgment of
the preliminary inspection in all three cases.
Confidence interval. From the formula given in section 2.16, the 95%
confidence interval for μD is

D̄ ± t₀.₀₅ sD̄ = D̄ ± (2.365)(0.540) = D̄ ± 1.28

In the simulated example the limits are as follows:

Case I:   -1.00 to 1.56
Case II:   9.00 to 11.56
Case III:  0.00 to 2.56

As always happens, the 95% confidence limits agree with the verdict given
by the 5% tests of significance. Either technique may be used.
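The whole analysis takes only a few lines by machine. A minimal Python sketch (an illustrative addition) for Case III follows; Cases I and II are obtained by subtracting 1 from, or adding 9 to, every difference:

    # Paired t-test and 95% confidence interval (table 4.2.1, Case III).
    from math import sqrt

    D = [4.2, -0.7, 1.8, 0.7, 1.5, 2.2, -0.1, 0.6]   # New - St.
    n = len(D)
    D_bar = sum(D) / n                                         # 1.28
    s_D = sqrt(sum((d - D_bar) ** 2 for d in D) / (n - 1))     # 1.527
    se = s_D / sqrt(n)                                         # 0.540

    t = D_bar / se               # 2.36 (the text's 2.370 uses rounded values)
    t_05 = 2.365                 # two-tailed 5% point for 7 d.f. (table A 4)
    print(t, D_bar - t_05 * se, D_bar + t_05 * se)   # limits about 0.00, 2.56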
4.3-Example of a paired experiment. The preceding examples illustrate
the assumptions and formulas used in the analysis of a paired set of
data, but do not bring out the purpose of the pairing. Youden and Beale
(1) wished to find out if two preparations of a virus would produce different
effects on tobacco plants. The method employed was to rub half a
leaf of a tobacco plant with cheesecloth soaked in one preparation of the
virus extract, then to rub the second half similarly with the second extract.
The measurement of potency was the number of local lesions appearing
on the half leaf; these lesions appear as small dark rings that are easily
counted. The data in table 4.3.1 are taken from leaf number 2 on each
of 8 plants. The steps in the analysis are exactly the same as in the preceding.
We have, however, presented the deviations of the differences from
their mean, dᵢ = Dᵢ - D̄, and obtained the sum of squares of deviations
directly instead of by the shortcut formula.
For a test of the null hypothesis that the two preparations produce on
the average the same number of lesions, we compute

t = D̄/sD̄ = 4/1.52 = 2.63,     d.f. = n - 1 = 7

From table A 4, the significance probability is about 0.04, and the null
hypothesis is rejected. We conclude that in the population the second
preparation produces fewer lesions than the first. From this result we
TABLE 4.3.1
NUMBER OF LESIONS ON HALVES OF EIGHT TOBACCO LEAVES*

            Prepara-   Prepara-                             Squared
             tion 1     tion 2    Difference   Deviation   Deviation
Pair No.       X₁         X₂      D = X₁ - X₂  d = D - D̄       d²

   1           31         18          13            9           81
   2           20         17           3           -1            1
   3           18         14           4            0            0
   4           17         11           6            2            4
   5            9         10          -1           -5           25
   6            8          7           1           -3            9
   7           10          5           5            1            1
   8            7          6           1           -3            9

 Total        120         88          32            0          130

 Mean          15         11        D̄ = 4      sD² = Σd²/(n - 1) = 130/7 = 18.57

sD̄² = 18.57/8 = 2.32,     sD̄ = 1.52 lesions
* Slightly changed to make calculation easier.

would expect that both the 95% confidence limits for μD will be positive.
Since t₀.₀₅ sD̄ = (2.365)(1.52) = 3.59, the 95% limits are +0.4 and +7.6
lesions per leaf.
In this experiment the leaf constitutes the pair. This choice was
made as a result of earlier studies in which a single preparation was rubbed
on a large number of leaves, the lesions found on each half-leaf being
counted. In a new type of work, a preliminary study of this kind can be
highly useful. Since every half-leaf was treated in the same way, the variations
found in the numbers of lesions per half-leaf represent the natural
variability of the experimental material. From the data, the investigator
can estimate the population standard deviation, from which he can in
turn estimate the size of sample needed to ensure a specified degree of precision
in the sample averages. He can also look for a good method of
forming pairs. Such a study is sometimes called a uniformity trial, because
the treatment is uniform, although a variability trial might be a
better name.
Youden and Beale found that the two halves of the same leaf were
good partners, since they tended to give similar numbers of lesions. An
indication of this fact is evident in table 4.3.1, where the pairs are arranged
in descending order of total numbers of lesions per leaf. Notice that with
two minor exceptions, this descending order shows up in each preparation.
If one member of a pair is high, so is the other; if one is low, so is the other.
The numbers on the two halves of a leaf are said to be positively correlated.
Because of this correlation, the differences between the two halves tend
to be mostly small, and therefore less likely to mask or conceal an imposed
difference due to a difference in treatments.
EXAMPLE 4.3.1-L. C. Grove (2) determined the sample mean numbers of florets
produced by seven pairs of plots of Excellence gladiolus, one plot of each pair planted
with high (first-year) corms, the other with low (second-year or older) corms. (A corm is
an underground propagating stem.) The plot means were as follows:

Corm                           Florets

High     11.2   13.3   12.8   13.7   12.2   11.9   12.1
Low      14.6   12.6   15.0   15.6   12.7   12.0   13.1

Calculate the sample mean difference. Ans. 1.2 florets. In the population of such differences,
test the null hypothesis μD = 0. Ans. P = 0.06, approximately.
EXAMPLE 4.3.2-Samples of blood were taken from each of 8 patients. In each sample,
the serum albumen content of the blood was determined by each of two laboratory
methods A and B. The objective was to discover whether there was a consistent difference
in the amount of serum albumen found by the two methods. The 8 differences (A - B) were
as follows: 0.6, 0.7, 0.8, 0.9, 0.3, 0.5, -0.5, 1.3, the units being gm. per 100 ml. Compute
t to test the null hypothesis (H₀) that the population mean of these differences is zero, and
report the approximate value of your significance probability. What is the conclusion?
Ans. t = 2.511, with 7 d.f. P between 0.05 and 0.025. Method A has a systematic tendency to
give higher values.
EXAMPLE 4.3.3-Mitchell, Burroughs, and Beadles (3) computed the biological
values of proteins from raw peanuts (P) and roasted peanuts (R) as determined in an experiment
with 10 pairs of rats. The pairs of data P, R are as follows: 61, 55; 60, 54; 56, 47;
63, 59; 56, 51; 63, 61; 59, 57; 56, 54; 44, 63; 61, 58. Compute the sample mean difference,
2.0, and the sample standard deviation of the differences, 7.72 units. Since t = 0.82,
over 40% of similar samples from a population with μD = 0 would be expected to have
larger t-values.
Note: 9 of the 10 differences, P - R, are positive. One would like some information
about the next-to-the-last pair 44, 63. The first member seems abnormal. While unusual
individuals like this do occur in the most carefully conducted trials, their appearance demands
immediate investigation. Doubtless an error in recording or computation was
searched for but not found. What should be done about such aberrant observations is a
moot question; their occurrence detracts from one's confidence in the experiment.
EXAMPLE 4.3.4-A man starting work in a new town has two routes A and B by which
he may drive home. He conducts an experiment to find out which route is quicker. Since
traffic is unusually heavy on Mondays and Fridays but does not seem to vary much from
week to week, he selects the day of the week as the basis for pairing. The test lasts four weeks.
On the first Monday, he tosses a coin to decide whether to drive by route A or B. On the
second Monday, he drives by the other route. On the third Monday, he again tosses a coin,
using the other route on the fourth Monday, and similarly for the other days of the week.
The times taken, in minutes, were as follows:

     M1    M2    Tu1   Tu2   W1    W2    Th1   Th2   F1    F2

A    28.7  26.2  24.8  25.3  25.1  23.9  26.1  25.8  30.3  31.4
B    25.4  25.8  24.9  25.0  23.9  23.3  26.6  24.8  28.8  30.3

(i) Treating the data as consisting of 10 pairs, test whether there seems to be any real difference
in average driving times between A and B. (ii) Compute 95% confidence limits for the
population mean difference. What would you regard as the population in this trial? (iii)
By eye inspection of the results, does the pairing look effective? (iv) Suppose that on the
last Friday (F2) there had been a fire on route B, so that the time taken to get home was 48
minutes. Would you recommend rejecting this pair from the analysis? Give your reason.
Ans. (i) t = 2.651, with 9 d.f. P about 0.03. Method B seems definitely quicker. (ii) 0.12 to
1.63 mins. There really isn't much difference. (iii) Highly effective.

4.4-Conditions for pairing. The objective of pairing is to increase
the precision of the comparison of the two procedures. Identical twins
are natural pairs. Litter mates of the same sex are often paired successfully,
because they usually behave more nearly alike than do animals less
closely related. If the measurement at the end of the experiment is the
subject's ability to perform some task (e.g., to do well in an exam), subjects
similar in natural ability and previous training for this task should
be paired. Often the subjects are tested at the beginning of the trial to
provide information for forming pairs. Similarly, in experiments that
compare two methods of treating sick persons, patients whose prognosis
appears about the same at the beginning of the trial should be paired if
feasible.
The variable on which we pair should predict accurately the performance
of the subjects on the measurement by which the effects of the
treatments are to be judged. Little will be gained by pairing students on
their I.Q.'s if I.Q. is not closely related to ability to perform the particular
task that is being measured in the experiment.
Self-pairing is highly effective when an individual's performance is
consistent on different occasions, but yet exhibits wide variation when
comparisons are made from one individual to another. If two methods
of conducting a chemical extraction are being compared, the pair is likely
to be a sample of the original raw material which is thoroughly mixed
and divided into two parts.
Environmental variation often calls for pairing. Two treatments
should be laid down side by side in the field or on the greenhouse bench
in order to avoid the effects of unnecessary differences in soil, moisture,
temperature, etc. Two plots or pots next to each other usually respond
more nearly alike than do those at a distance. As a final illustration,
sometimes the measuring process is lengthy and at least partly subjective,
as in certain psychiatric studies. If several judges must be used to make
the measurements for comparing two treatments A and B, each scoring a
different group of patients, an obvious precaution is to ensure that each
judge scores as many A patients as B patients. Even if the patients were
not originally paired, they could be paired for assignment to judges.
Before an experiment has been conducted, it is of course not possible
to foretell how effective a proposed pairing will be in increasing precision.
However, from the results of a paired experiment, its precision may be
compared with that of the corresponding unpaired experiment (section
4.11).

4.5-Tests of other null hypotheses about μ. The null hypothesis
μD = 0 is not the only one that is useful, and the alternative may be μD > 0
instead of μD ≠ 0. Illustrations are found in a Boone County survey of
corn borer effects. On 14 farms, the effect of spraying was evaluated by
measuring the corn yield from both sprayed and unsprayed strips in each
field. The data are recorded in table 4.5.1. The sample mean difference
is 4.7 bu./acre with sD = 6.48 bu./acre and sD̄ = 6.48/√14 = 1.73 bu./acre.
A one-tailed t-test. It had already been established that the spray, at
the concentration used, could not decrease yield. If there is a decrease, as
in the first field, it must be attributed to causes other than the spray, or to
sampling variation. Consequently if μD is not zero then it must be greater
TABLE 4.5.1
YIELDS OF CORN (BUSHELS PER ACRE) IN SPRAYED AND UNSPRAYED STRIPS OF 14 FIELDS
Boone County, Iowa, 1950

Sprayed      64.3   78.1   93.0   80.7   89.0   79.9   90.6   102.4
Unsprayed    70.0   74.4   86.6   79.2   84.7   75.1   87.3    98.8
Difference   -5.7    3.7    6.4    1.5    4.3    4.8    3.3     3.6

Sprayed      70.7  106.1  107.4   74.0   72.6   69.5
Unsprayed    70.2  101.1   83.4   65.2   68.1   68.4
Difference    0.5    5.0   24.0    8.8    4.5    1.1

than zero. The objective of this experiment was to test H₀: μD = 0 with
Hₐ: μD > 0. As before,

t = (4.7 - 0)/1.73 = 2.72,     d.f. = 13

To make a one-tailed test with table A 4, locate the sample value of t
and use half of the probability indicated.
Applying this rule to the t = 2.72 above, P is slightly less than 0.02/2;
the null hypothesis is rejected at P < 0.01. Evidently spraying did decrease
corn borer damage, resulting in increased yields in Boone County in 1950.
Test of a non-zero μ. This same Boone County experiment may be
cited to illustrate the use of a null hypothesis different from μD = 0. This
experiment might have had as its objective the test of the null hypothesis,
"The cost of spraying is equal to the gain from increased yield." To
evaluate costs, the fee of commercial sprayers was $3 per acre and the
1950 crop was sold at about $1.50 per bushel. So 2 bushels per acre would
pay for the spraying. This test would be H₀: μD = 2 bu./acre, Hₐ: μD ≠ 2
bu./acre, resulting in

t = (4.7 - 2.0)/1.73 = 1.56,     d.f. = 13

The two-tailed probability is about P = 0.15, and the null hypothesis
would presumably not be rejected. The verdict of the test is inconclusive:
it provides no strong evidence that the farmers will either gain or lose by
spraying.
One-tailed test of a non-zero μ. It is possible that H₀: μD = 2 bu./acre
might be tested with Hₐ: μD > 2 bu./acre; that is, the alternative hypothesis
might be put in the form of a slogan, "It pays to spray." If this were done,
t = 1.56 would be associated with P = 0.15/2 = 0.075, not significant.
But the implication of this one-sided test is that H₀ would be accepted
no matter how far the sample mean might fall short of 2 bu./acre. It is
the two-tailed test which is appropriate here.
This point is stressed for the reason that some people use the one-sided
test because, as a man said, "I am not interested in the other alternative."
A one-tailed test of H₀: μD = μ₀ against Hₐ: μD > μ₀ should be
used only if we know enough about the nature of the process being studied
to be certain that μD could not be less than μ₀.
In considering the profitability of spraying, it is more informative to
treat the statistical problem as one of estimation than as one of testing
hypotheses. Since the mean difference in yield between sprayed and unsprayed
strips is 4.7 bu. per acre, the sample estimate of the profit per acre
due to spraying is 2.7 bu. We can compute confidence limits for the
average profit per acre over a population of fields of which this is a random
sample. For 90% limits we add and subtract t₀.₁₀ sD̄ = (1.771)(1.73) = 3.1
bu. Thus if the farmers are willing to take a 1-in-10 chance that the sample
estimate was not exceptionally poor, they learn that the average profit per
acre lies somewhere between -0.4 bu. and +5.8 bu. These limits are
unfortunately rather wide for a practical decision; a larger sample size
would be necessary to narrow the limits. They do indicate, however, that
although there is the possibility of a small loss, there is also the possibility
of a substantial profit. The 95% limits, -1.0 bu. and +6.4 bu., tell much
the same story.
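Both tests on the Boone County differences may be sketched as follows (an illustrative Python addition; the scipy library is assumed for the t tail probabilities):

    # One-tailed test of mu_D = 0 and two-tailed test of the break-even
    # value mu_D = 2 bu./acre (table 4.5.1).
    from math import sqrt
    from scipy.stats import t as t_dist

    D = [-5.7, 3.7, 6.4, 1.5, 4.3, 4.8, 3.3, 3.6,
         0.5, 5.0, 24.0, 8.8, 4.5, 1.1]
    n = len(D)
    D_bar = sum(D) / n                                                # 4.7
    se = sqrt(sum((d - D_bar) ** 2 for d in D) / (n - 1)) / sqrt(n)   # 1.73

    t0 = D_bar / se                          # 2.72, d.f. = 13
    p1 = t_dist.sf(t0, n - 1)                # one-tailed, about 0.009
    t2 = (D_bar - 2.0) / se                  # 1.56
    p2 = 2 * t_dist.sf(abs(t2), n - 1)       # two-tailed, about 0.14
    print(t0, p1, t2, p2)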
EXAMPLE 4.5.1-In an investigation of the effect of feeding 10 mcg. of vitamin B₁₂
per pound of ration to growing swine (4), 8 lots (each with 6 pigs) were fed in pairs. The
pairs were distinguished by being fed different levels of aureomycin, an antibiotic which
did not interact with the vitamin; that is, the differences were not affected by the aureomycin.
The average daily gains (to about 200 lbs. live weight) are summarized as follows:

                                Pairs of Lots
Ration            1      2      3      4      5      6      7      8

With B₁₂         1.60   1.68   1.75   1.64   1.75   1.79   1.78   1.77
Without B₁₂      1.56   1.52   1.52   1.49   1.59   1.56   1.60   1.56

Difference, D    0.04   0.16   0.23   0.15   0.16   0.23   0.18   0.21

For the differences, calculate the statistics D̄ = 0.170 lb./day and sD̄ = 0.0217 lb./day.

EXAMPLE 4.5.2-It is known that the addition of small amounts of the vitamin cannot
decrease the rate of growth. While it is fairly obvious that D̄ will be found significantly
different from zero, the differences being all positive and, with one exception, fairly consistent,
you may be interested in evaluating t. Ans. 7.83, far beyond the 0.01 level in the
table. The appropriate alternative hypothesis is μD > 0.
EXAMPLE 4.5.3-The effect of B₁₂ seems to be a stimulation of the metabolic processes,
including appetite. The pigs eat more and grow faster. In the experiment above, the cost
of the additional amount of feed eaten, including that of the vitamin, corresponded to about
0.130 lb./day of gain. Test the hypothesis that the profit derived from feeding B₁₂ is zero.
Ans. t = 1.84, P = 0.11 (two-sided alternative).

4.6-Comparison of the means of two independent samples. When no
pairing has been employed, we have two independent samples with means
X̄₁, X̄₂, which are estimates of their respective population means μ₁, μ₂.
Tests of significance and confidence intervals concerning the population
difference μ₁ - μ₂ are again based on the t-distribution, where t now has
the value

t = {(X̄₁ - X̄₂) - (μ₁ - μ₂)}/s_{X̄₁-X̄₂}

It is assumed that X̄₁ and X̄₂ are normally distributed and are independent.
By theory, their difference is also normally distributed, so that the
numerator of t is normal with mean zero.
The denominator of t is a sample estimate of the standard error of
(X̄₁ - X̄₂). The background for this estimate is given in the next two sections.
First, we need an important new result for the population variance
of a difference between any two variables X₁ and X₂:

σ²_{X₁-X₂} = σ²_{X₁} + σ²_{X₂}

The variance of a difference is the sum of the variances. This result holds
for any two variables, whether normal or not, provided they are independently
distributed.

4.7-The variance of a difference. A population variance is defined
(section 2.12) as the average, over the population, of the squared deviations
from the population mean. Thus we may write

σ²_{X₁-X₂} = Avg. of {(X₁ - X₂) - (μ₁ - μ₂)}²

But

(X₁ - X₂) - (μ₁ - μ₂) = (X₁ - μ₁) - (X₂ - μ₂)

Hence, on squaring and expanding,

{(X₁ - X₂) - (μ₁ - μ₂)}² = (X₁ - μ₁)² + (X₂ - μ₂)² - 2(X₁ - μ₁)(X₂ - μ₂)

Now average over all pairs of values X₁, X₂ that can be drawn from
their respective populations. By the definition of a population variance,

Avg. of (X₁ - μ₁)² = σ²_{X₁}          Avg. of (X₂ - μ₂)² = σ²_{X₂}

This leads to the general result

σ²_{X₁-X₂} = σ²_{X₁} + σ²_{X₂} - 2 Avg. of (X₁ - μ₁)(X₂ - μ₂)     (4.7.1)

At this point we use the fact that X₁ and X₂ are independently drawn.
Because of this independence, any specific value of X₁ will appear with
all the values of X₂ that can be drawn from its population. Hence, for
this specific value of X₁,

Avg. of (X₁ - μ₁)(X₂ - μ₂) = (X₁ - μ₁){Avg. of (X₂ - μ₂)} = 0

since μ₂ is the mean or average of all the values of X₂. It follows that the
overall average of the cross-product term (X₁ - μ₁)(X₂ - μ₂) is zero, so
that

σ²_{X₁-X₂} = σ²_{X₁} + σ²_{X₂}     (4.7.2)

Apply this result to two means X̄₁, X̄₂, drawn from populations with
variance σ². With samples of size n, each mean has variance σ²/n. This
gives

σ²_{X̄₁-X̄₂} = 2σ²/n

The variance of a difference is twice the variance of an individual mean.
If σ is known, the preceding results provide the material for tests and
confidence intervals concerning μ₁ - μ₂. To illustrate, from the table of
pig gains (table 3.2.1) which we used to simulate a normal distribution
with σ = 10 pounds, the first two samples drawn gave X̄₁ = 35.6 and
X̄₂ = 29.3 pounds, with n = 10. Since the standard error of X̄₁ - X̄₂ is
√2·σ/√n, the quantity

Z = √n{(X̄₁ - X̄₂) - (μ₁ - μ₂)}/(√2·σ)

is a normal deviate. To test the null hypothesis that μ₁ = μ₂ we compute

Z = √n(X̄₁ - X̄₂)/(√2·σ) = √10(6.3)/{√2(10)} = 19.92/14.14 = 1.41

From table A 3 a larger value of Z, ignoring sign, occurs in about 16% of
the trials. As we would expect, the difference is not significant. The 95%
confidence limits for (μ₁ - μ₂) are

(X̄₁ - X̄₂) ± (1.96)√2·σ/√n
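A small sampling experiment verifies the result numerically (an illustrative Python addition, not part of the original demonstration):

    # Check that two independent means of n = 10 from a population with
    # sigma = 10 have Var(X1bar - X2bar) close to 2*sigma^2/n = 20.
    import random
    random.seed(1)

    sigma, n, trials = 10.0, 10, 20000
    diffs = []
    for _ in range(trials):
        x1 = [random.gauss(0, sigma) for _ in range(n)]
        x2 = [random.gauss(0, sigma) for _ in range(n)]
        diffs.append(sum(x1) / n - sum(x2) / n)

    m = sum(diffs) / trials
    var_d = sum((d - m) ** 2 for d in diffs) / (trials - 1)
    print(var_d, 2 * sigma ** 2 / n)     # both close to 20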
4.8-A pooled estimate of variance. In most applications the value
of σ² is not known. However, each sample furnishes an estimate of σ²:
call these estimates s₁² and s₂². With samples of the same size n, the best
combined estimate is their pooled average s² = (s₁² + s₂²)/2.
Since s₁² = Σx₁²/(n - 1) and s₂² = Σx₂²/(n - 1), where, as usual,
x₁ = X₁ - X̄₁ and x₂ = X₂ - X̄₂, we may write

s² = (Σx₁² + Σx₂²)/{2(n - 1)}

This formula is recommended for routine computing since it is quicker
and extends easily to samples of unequal sizes.
The number of degrees of freedom in the pooled s² is 2(n - 1), the
sum of the d.f. in s₁² and s₂². This leads to the result that

t = √n{(X̄₁ - X̄₂) - (μ₁ - μ₂)}/(√2·s)

follows Student's t-distribution with 2(n - 1) d.f.
The preceding analysis requires one additional assumption, namely
that σ is the same in the two populations. The situations in which this
assumption is suspect, and the comparison of X̄₁ and X̄₂ when the assumption
does not hold, are discussed in section 4.14.
It is now time to apply these methods to a real experiment.
It is now time to apply these methods to a real experiment.
4.9-An experiment comparing two groups of equal size. Breneman
(5) compared the 15-day mean comb weights of two lots of male chicks,
one receiving sex hormone A (testosterone), the other C (dehydroandrosterone).
Day-old chicks, 11 in number, were assigned at random
to each of the treatments. To distinguish between the two lots, which
were caged together, the heads of the chicks were stained red and purple,
respectively. The individual comb weights are recorded in table 4.9.1.
The calculations for the test of significance are given at the foot of
the table. Note that in the Hormone A sample the correction term
(ΣX)²/n is (1,067)²/11 = 103,499. Note also the method recommended
for computing the pooled s². With 20 d.f., the value of t is significant at
the 1% level. Hormone A gives higher average comb weights than
hormone C. The two sums of squares of deviations, 8,472 and 7,748,
make the assumption of equal σ² appear reasonable.
The 95% confidence limits for (μ₁ - μ₂) are

X̄₁ - X̄₂ ± t₀.₀₅ s_{X̄₁-X̄₂}

or, in this example,

41 - (2.086)(12.1) = 16 mg., and 41 + (2.086)(12.1) = 66 mg.
TABLE 4.9.1
TESTING THE DIFFERENCE BETWEEN THE MEANS OF TWO INDEPENDENT SAMPLES

             Weight of Comb (mgs.)
          Hormone A       Hormone C
              57              89
             120              30
             101              82
             137              50
             119              39
             117              22
             104              57
              73              32
              53              96
              68              31
             118              88

Totals     1,067             616
n             11              11
X̄             97              56
ΣX²      111,971          42,244
(ΣX)²/n  103,499          34,496
Σx²        8,472           7,748
d.f.          10              10

s² = (8,472 + 7,748)/(10 + 10) = 811,     d.f. = 20

s_{X̄₁-X̄₂} = √(2s²/n) = √{2(811)/11} = 12.14 mg.

t = (X̄₁ - X̄₂)/s_{X̄₁-X̄₂} = 41/12.14 = 3.38

EXAMPLE 4.9.1-Lots of 10 bees were fed two concentrations of syrup, 20% and
65%, at a feeder half a mile from the hive (6). Upon arrival at the hive their honey sacs
were removed and the concentration of the fluid measured. In every case there was a decrease
from the feeder concentration. The decreases were: from the 20% syrup, 0.7, 0.5, 0.4,
0.7, 0.5, 0.4, 0.7, 0.4, 0.2, and 0.5; from the 65% syrup, 1.7, 2.8, 2.2, 1.4, 1.3, 2.1, 0.8, 3.4,
1.9, and 1.4%. Here, every observation in the second sample is larger than any in the first,
so that rather obviously μ₁ < μ₂. Show that t = 5.6 if μ₁ - μ₂ = 0. There is little doubt
that, under the experimental conditions imposed, the concentration during flight decreases
more with the 65% syrup. But how about equality of variances? See sections 4.14 and
4.15 for further discussion.

EXAMPLE 4.9.2-Four determinations of the pH of Shelby loam were made with
each of two types of glass electrode (7). With a modified quinhydrone electrode, the readings
were 5.78, 5.74, 5.84, and 5.80; while with a modified Ag/AgCl electrode, they were
5.82, 5.87, 5.96, and 5.89. With the hypothesis that μ₁ - μ₂ = 0, calculate t = 2.66. Note:
if you subtract 5.74 from every observation, the calculations are simpler.

EXAMPLE 4.9.3-In experiments to measure the effectiveness of carbon tetrachloride
as a worm-killer, each of 10 rats received an injection of 500 larvae of the worm, Nippostrongylus
muris. Eight days later 5 of the rats, chosen at random, each received 0.126 cc.
of a solution of carbon tetrachloride, and two days later the rats were killed and the numbers
of adult worms counted. These numbers were 378, 275, 412, 265, and 286 for the control
rats and 123, 143, 192, 40, and 259 for the rats treated with CCl₄. Find the significance
probability for the difference in mean numbers of worms, and compute 95% confidence
limits for this difference. Ans. t = 3.64 with 8 d.f. P close to 0.01. Confidence limits are
63 and 280.
EXAMPLE 4.9.4-Fifteen kernels of mature Iodent corn were tested for crushing
resistance. Measured in pounds, the resistances were: 50, 36, 34, 45, 56, 42, 53, 25, 65,
33, 40, 42, 39, 43, 42. Another batch of 15 kernels was tested after being harvested in the
dough stage: 43, 44, 51, 40, 29, 49, 39, 59, 43, 48, 67, 44, 46, 54, 64. Test the significance
of the difference between the two means. Ans. t = 1.38.
EXAMPLE 4.9.5-In reading reports of researches it is sometimes desirable to supply
a test of significance which was not considered necessary by the author. As an example,
Smith (8) gave the sample mean yields and their standard errors for two crosses of maize
as 8.84 ± 0.39 and 7.00 ± 0.18 grams. Each mean was the average of five replications.
Determine if the mean difference is significant. Ans. t = 4.29, d.f. = 8, P < 0.5%. To do
this in the quickest way, satisfy yourself that the estimate of the variance of the difference
between the two means is the sum of the squares of 0.39 and 0.18, namely 0.1845.
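The pooled analysis of table 4.9.1 may be programmed in a few lines (an illustrative Python addition reproducing the values of the table):

    # Pooled two-sample t-test, equal group sizes (comb-weight data).
    from math import sqrt

    A = [57, 120, 101, 137, 119, 117, 104, 73, 53, 68, 118]   # hormone A
    C = [89, 30, 82, 50, 39, 22, 57, 32, 96, 31, 88]          # hormone C
    n = len(A)                                                # 11 per group

    xA, xC = sum(A) / n, sum(C) / n                           # 97 and 56
    ssA = sum((x - xA) ** 2 for x in A)                       # 8,472
    ssC = sum((x - xC) ** 2 for x in C)                       # 7,748
    s2 = (ssA + ssC) / (2 * (n - 1))                          # 811, 20 d.f.

    se_diff = sqrt(2 * s2 / n)                                # 12.14 mg.
    t = (xA - xC) / se_diff                                   # 3.38
    t_05 = 2.086                                              # 5% point, 20 d.f.
    print(t, 41 - t_05 * se_diff, 41 + t_05 * se_diff)        # 16 and 66 mg.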

4.10-Groups of unequal sizes. Unequal numbers are common in
comparisons made from survey data as, for example, comparing the mean
incomes of men of similar ages who have master's and bachelor's degrees,
or the severity of injury suffered in auto accidents by drivers wearing seat
belts and drivers not wearing seat belts. In planned experiments, equal
numbers are preferable, being simpler to analyze and more efficient, but
equality is sometimes impossible or inconvenient to attain. Two lots of
chicks from two batches of eggs treated differently nearly always differ in
the number of birds hatched. Occasionally, when a new treatment is in
short supply, an experiment with unequal numbers is set up deliberately.
Unequal numbers occur also in experiments because of accidents and
losses during the course of the trial. In such cases the investigator should
always consider whether any loss represents a failure of the treatment
rather than an accident that is not to be blamed on the treatment. Needless
to say, such situations require careful judgment.
The statistical analysis for groups of unequal sizes follows almost
exactly the same pattern as that for groups of equal sizes. As before, we
assume that the variance is the same in both populations unless otherwise
indicated. With samples of sizes n₁, n₂, their means X̄₁ and X̄₂ have variances
σ²/n₁ and σ²/n₂. The variance of the difference is then

σ²_{X̄₁-X̄₂} = σ²/n₁ + σ²/n₂ = σ²(n₁ + n₂)/(n₁n₂)

In order to form a pooled estimate of σ², we follow the rule given for
equal-sized samples. Add the sums of squares of deviations in the numerators
of s₁² and s₂², and divide by the sum of their degrees of freedom.
These degrees of freedom are (n₁ - 1) and (n₂ - 1), so that the denominator
of the pooled s² is (n₁ + n₂ - 2). This quantity is also the
number of d.f. in the pooled s². The procedure will be clear from the
example in table 4.10.1. Note how closely the calculations follow those
given in table 4.9.1 for samples of equal sizes.
TABLE 4.10.1
ANALYSIS FOR TWO SAMPLES OF UNEQUAL SIZES. GAINS IN WEIGHT OF TWO LOTS
OF FEMALE RATS (28-84 DAYS OLD) UNDER TWO DIETS

                  Gains (gms.)
          High Protein    Low Protein

              134              70
              146             118
              104             101
              119              85
              124             107
              161             132
              107              94
               83
              113
              129
               97
              123

Totals      1,440             707
n              12               7
Means         120             101
ΣX²       177,832          73,959
(ΣX)²/n   172,800          71,407
Σx²         5,032           2,552
d.f.           11               6

Pooled s² = (5,032 + 2,552)/(11 + 6) = 446.12

s_{X̄₁-X̄₂} = √{s²(n₁ + n₂)/(n₁n₂)} = √{(446.12)(19)/84} = 10.04 gms.

t = 19/10.04 = 1.89,     P about 0.08

The high protein diet showed a slightly greater mean gain. Since P
is about 0.08, however, a difference as large as the observed one would
occur about 1 in 12 times by chance, so that the observed difference cannot
be regarded as established by the usual standards in tests of significance.
For evidence about homogeneity of variance in the two populations,
observe that s₁² = 5,032/11 = 457 and s₂² = 2,552/6 = 425.
If the investigator is more interested in estimates than in tests, he may
prefer the confidence interval. He reports an observed difference of 19
gms. in favor of the high protein diet, with 95% confidence limits -2.2
and 40.2 gms.
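The same computation with unequal sizes may be sketched as follows (an illustrative Python addition reproducing table 4.10.1):

    # Pooled two-sample t-test, unequal group sizes (rat-diet gains).
    from math import sqrt

    high = [134, 146, 104, 119, 124, 161, 107, 83, 113, 129, 97, 123]
    low = [70, 118, 101, 85, 107, 132, 94]
    n1, n2 = len(high), len(low)                              # 12 and 7

    x1, x2 = sum(high) / n1, sum(low) / n2                    # 120 and 101
    ss1 = sum((x - x1) ** 2 for x in high)                    # 5,032
    ss2 = sum((x - x2) ** 2 for x in low)                     # 2,552
    s2 = (ss1 + ss2) / (n1 + n2 - 2)                          # 446.12, 17 d.f.

    se_diff = sqrt(s2 * (1 / n1 + 1 / n2))                    # 10.04 gms.
    t = (x1 - x2) / se_diff                                   # 1.89, P about 0.08
    print(t)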
EXAMPLE 4.10.1-The following are the rates of diffusion of carbon dioxide through
two soils of different porosity (9). Through a fine soil (f): 20, 31, 18, 23, 23, 28, 23, 26, 27,
26, 12, 17, 25; through a coarse soil (c): 19, 30, 32, 28, 15, 26, 35, 18, 25, 27, 35, 34. Show
that pooled s² = 35.83, s_{X̄₁-X̄₂} = 2.40, d.f. = 23, and t = 1.67. The difference, therefore,
is not significant.
EXAMPLE 4.10.2-The total nitrogen content of the blood plasma of normal albino
rats was measured at 37 and 180 days of age (10). The results are expressed as gms. per 100
cc. of plasma. At age 37 days, 9 rats had 0.98, 0.83, 0.99, 0.86, 0.90, 0.81, 0.94, 0.92, and
0.87; at age 180 days, 8 rats had 1.20, 1.18, 1.33, 1.21, 1.20, 1.07, 1.13, and 1.12 gms. per 100
cc. Since significance is obvious, set a 95% confidence interval on the population mean
difference. Ans. 0.21 to 0.35 gms./100 cc.
EXAMPLE 4.10.3-Sometimes, especially in comparisons made from surveys, the two
samples are large. Time is saved by forming frequency distributions and computing the
means and variances as in section 3.11. The following data from an experiment serve as an
illustration. The objective was to compare the effectiveness of two antibiotics, A and B, for
treating patients with lobar pneumonia. The numbers of patients were 59 and 43. The data
are the numbers of days needed to bring the patient's temperature down to normal.

No. of Days       1    2    3    4    5    6    7    8    9   10   Total

No. of       A   17    8    5    9    7    1    2    1    2    7     59
Patients     B   15    8    8    5    3    1    0    0    0    3     43

What are your conclusions about the relative effectiveness of the two antibiotics in bringing
down the fever? Ans. The difference of about 1 day in favor of B has a P value between 0.05
and 0.025. Note that although these are frequency distributions, the only real grouping
is in the 10-day groups, which actually represented "at least 10" and were arbitrarily rounded
to 10. Since the distributions are very skew, the analysis leans heavily on the Central Limit
Theorem. Do the variances given by the two drugs appear to differ?
EXAMPLE 4.10.4-Show that if the two samples are of sizes 6 and 12, the S.D. of the
difference in means is the same as when the samples are both of size 8. Are the d.f. in the
pooled s² the same?
EXAMPLE 4.10.5-Show that the pooled s² is a weighted mean of s₁² and s₂² in which
each is weighted by its number of d.f.

4.11-Paired versus independent groups. The formula for the variance
of a difference throws more light on the circumstances in which
pairing is effective. Quoting formula (4.7.1),

σ²_{X₁-X₂} = σ²_{X₁} + σ²_{X₂} - 2 Avg. of (X₁ - μ₁)(X₂ - μ₂)

When pairing, we try to choose pairs such that if X₁ is high, so is X₂.
Thus, if (X₁ - μ₁) is positive, so is (X₂ - μ₂), and their product
(X₁ - μ₁)(X₂ - μ₂) is positive. Similarly, in successful pairing, when
(X₁ - μ₁) is negative, (X₂ - μ₂) will usually also be negative. Their
product (X₁ - μ₁)(X₂ - μ₂) is again positive. For paired samples, then,
the average of this product is positive. This helps, because it makes the
variance of (X₁ - X₂) less than the sum of their variances, sometimes very
much less. The average value of the product over the population is
called the covariance of X₁ and X₂, and is studied in chapter 7. The result
for the variance of a difference may now be written

σ²_{X₁-X₂} = σ²_{X₁} + σ²_{X₂} - 2 Cov(X₁, X₂)
Pairing is not always effective, because X₁ and X₂ may be poorly
correlated. Fortunately, it is possible from the results of a paired experiment
to estimate what the standard error of (X̄₁ - X̄₂) would have been
if the experiment had been conducted as two independent groups. By
this calculation the investigator can appraise the success of his pairing,
which guides him in deciding whether the pairing is worth continuing in
future experiments.
With paired samples of size n, the standard error of the mean difference
D̄ = X̄₁ - X̄₂ is σD/√n, where σD is the standard deviation of the
population of paired differences (section 4.3). For an experiment with
two independent groups, the standard error of X̄₁ - X̄₂ is √2·σ/√n, where
σ is the standard deviation of the original population from which we drew
the sample of size 2n (section 4.7). Omitting the √n, the quantities that
we want to compare are σD and √2·σ. Usually, the comparison is made in
terms of variances: we compare σD² with 2σ².
From the statistical analysis of the paired experiment, we have an
unbiased estimate sD² of σD². The problem is to obtain an estimate of
2σ². One possibility is to analyze the results of the paired experiment by
the method of section 4.9 for two independent samples, using the pooled
s² as an estimate of σ². This procedure gives a good approximation when
n is large, but is slightly wrong, because the two samples from which s²
was computed were not independent. An unbiased estimate of 2σ² is
given by the formula

2σ̂² = 2s² - (2s² - sD²)/(2n - 1)

(The 'hat' [ˆ] placed above a population parameter is often used in mathematical
statistics to denote an estimate of that parameter.)
Let us apply this method to the paired experiment on virus lesions
(table 4.3.1, p. 95), which gave sD² = 18.57. You may verify that the
pooled s² is 45.714, giving 2s² = 91.43. Hence, an unbiased estimate of
2σ² is

2σ̂² = 91.43 - (91.43 - 18.57)/15 = 86.57

The pairing has given a much smaller variance of the mean difference,
18.57/n versus 86.57/n. What does this imply in practical terms? With
independent samples, the sample size would have to be increased from 8
pairs to 8(86.57)/(18.57), or about 37 pairs, in order to give the same
variance of the mean difference as does the paired experiment. The saving
in amount of work due to pairing is large in this case.
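The appraisal is easily programmed (an illustrative Python addition applying the formula above to the virus-lesion data of table 4.3.1):

    # Estimate of 2*sigma^2 from a paired experiment, and the number of
    # pairs an unpaired experiment would need for equal precision.
    X1 = [31, 20, 18, 17, 9, 8, 10, 7]      # preparation 1
    X2 = [18, 17, 14, 11, 10, 7, 5, 6]      # preparation 2
    n = len(X1)

    D = [a - b for a, b in zip(X1, X2)]
    D_bar = sum(D) / n
    s2_D = sum((d - D_bar) ** 2 for d in D) / (n - 1)         # 18.57

    # Pooled s^2 treating the two columns as if independent (slightly
    # biased here because the samples are in fact paired):
    m1, m2 = sum(X1) / n, sum(X2) / n
    s2 = (sum((x - m1) ** 2 for x in X1)
          + sum((x - m2) ** 2 for x in X2)) / (2 * (n - 1))   # 45.714
    est = 2 * s2 - (2 * s2 - s2_D) / (2 * n - 1)              # 86.57
    print(s2_D, est, n * est / s2_D)                          # about 37 pairs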
The computation overlooks one point. In the paired experiment, s_D² has 7 df, whereas the pooled s² would have 14 df for error. The t-value used in tests of significance or in computing confidence limits would be slightly smaller with independent samples than with paired samples. Several writers (11), (12), (13) have discussed the allowance that should be made for this difference in number of df. We suggest a rule given by Fisher (12): multiply the estimated variance by (f + 3)/(f + 1), where f is the df that the experimental plan provides. Thus we compare

(18.57)(10)/8 = 23.2  with  (86.57)(17)/(15) = 98.1
D. R. Cox (13) suggests the multiplier (f + 1)²/f². This gives almost the same results, imposing a slightly higher penalty when f is small.
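Readers with a computer can reproduce this appraisal in a few lines. The sketch below is ours, not the authors'; the individual lesion counts are assumed from table 4.3.1 (they agree with the pooled s² = 45.714 and s_D² = 18.57 quoted above), and Fisher's (f + 3)/(f + 1) allowance is applied at the end.

    # A sketch, not from the text, of the pairing appraisal described above.
    from statistics import variance

    x1 = [31, 20, 18, 17, 9, 8, 10, 7]    # preparation 1 (assumed values)
    x2 = [18, 17, 14, 11, 10, 7, 5, 6]    # preparation 2 (assumed values)
    n = len(x1)

    sd2 = variance([a - b for a, b in zip(x1, x2)])   # s_D^2 = 18.57, 7 df
    pooled = (variance(x1) + variance(x2)) / 2        # pooled s^2 = 45.71, 14 df
    est2sig2 = 2 * pooled - (2 * pooled - sd2) / (2 * n - 1)  # unbiased 2*sigma^2

    # Fisher's (f + 3)/(f + 1) allowance for the difference in error df:
    adj_paired = sd2 * (7 + 3) / (7 + 1)              # 23.2
    adj_indep = est2sig2 * (14 + 3) / (14 + 1)        # 98.1
    print(round(sd2, 2), round(est2sig2, 2), round(adj_paired, 1), round(adj_indep, 1))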
From a single experiment a comparison like the above is not very
precise, particularly if n is small. The results of several paired experiments in which the same criterion for pairing was employed give a more accurate picture of the success of the pairing. If the criterion has no correlation with the response variable, there is a small loss in accuracy from pairing due to the adjustment for df. There may even be a substantial loss in accuracy if the criterion is badly chosen so that members of a pair are negatively correlated.
When analyzing the results of a comparison of two procedures, the investigator must know whether his samples are paired or independent and must use the appropriate analysis. Sometimes a worker with paired data forgets this when it comes to analysis, and carries out the statistical analysis as if the two samples were independent. This is a serious mistake if the pairing has been effective. In the virus lesions example, he would be using 2s²/n or 91.43/8 = 11.43 as the variance of D̄ instead of 18.57/8 = 2.32. The mistake throws away all the advantage of the pairing. Differences that are actually significant may be found non-significant, and confidence intervals will be too wide.
Analysis of independent samples as if they were paired seems to be rare in practice. If the members of each sample are in essentially random order, so that the pairs are a random selection, the computed s_D² may be shown to be an unbiased estimate of 2σ². Thus the analysis still provides an unbiased estimate of the variance of (X̄1 − X̄2) and a valid t-test. There is a slight loss in sensitivity, since t-tests are based on (n − 1) df instead of 2(n − 1) df.
As regards assumptions, pairing has the advantage that its t-test does not require σ1 = σ2. "Random" pairing of independent samples has been suggested as a means of obtaining tests and confidence limits when the investigator knows that σ1 and σ2 are unequal.
Artificial pairing of the results, by arranging each sample in descending order and pairing the top two, the next two, and so on, produces a great under-estimation of the true variance of D̄. This effect may be illustrated by the first two random samples of pig gains from table 3.3.1 (p. 69). The population variance σ² is 100, giving 2σ² = 200. In table 4.11.1 this method of artificial pairing has been employed.
Instead of the correct value of 200 for 2σ² we get an estimate s_D² of only 8.0. Since s_D̄ = √(8.0/10) = 0.894, the t-value for testing D̄ is t = 6.3/0.894 = 7.04, with 9 df. This gives a P value of much less than 0.1%, although the two samples were drawn from the same population.
TABLE 4.11.1
TWO SAMPLES OF 10 PIG GAINS ARRANGED IN DESCENDING ORDER, TO ILLUSTRATE
THE ERRONEOUS CONCLUSIONS FROM ARTIFICIAL PAIRING

Sample 1   57   53   39   39   36   34   33   29   24   12    Mean = 35.6
Sample 2   53   44   32   31   30   30   24   19   19   11    Mean = 29.3
Diff.       4    9    7    8    6    4    9   10    5    1    Mean =  6.3

ΣD² − (ΣD)²/10 = 469 − (63)²/10 = 72.1,   s_D² = 72.1/9 = 8.0
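A quick simulation makes the same point more generally. The sketch below is an assumption of ours, not part of the text: it repeatedly draws two independent samples of 10 from a normal population with σ² = 100, sorts each, and computes s_D² from the artificially paired differences.

    # Simulation (not from the text) of artificial pairing as in table 4.11.1.
    import random
    from statistics import variance, mean

    random.seed(1)                      # arbitrary seed for reproducibility
    estimates = []
    for _ in range(1000):
        s1 = sorted(random.gauss(30, 10) for _ in range(10))
        s2 = sorted(random.gauss(30, 10) for _ in range(10))
        estimates.append(variance([a - b for a, b in zip(s1, s2)]))

    print(mean(estimates))              # far below the correct 2*sigma^2 = 200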

EXAMPLE 4.11.1-In planning experiments to test the effects of two pain-deadeners on the ability of young men to tolerate pain from a narrow beam of light directed at the arm, each subject was first rated several times as to the amount of heat energy that he bore without complaining of discomfort. The subjects were then paired according to these initial scores. In a later experiment the amounts of energy received at the point at which the subject complained were as follows, A and B denoting the treatments:

Pair    1    2    3    4    5    6    7    8    9    Sums
A      15    2    4    1    5    7    1    0   −3      32
B       6    7    3    4    3    2    3    0   −6      22

To simplify calculations, 30 was subtracted from each original score. Show that for appraising the effectiveness of the pairing, comparable variances are 22.5 for the paired experiment and 44.6 for independent groups (after allowing for the difference in df). The preliminary work in rating the subjects reduced the number of subjects needed by almost one-half.
EXAMPLE 4.11.2-In a previous experiment comparing two routes A and B for driving home from an office (example 4.3.4), pairing was by days of the week. The times taken (−13 min.) for the ten pairs were as follows:

A      5.7   3.2   1.8   2.3   2.1   0.9   3.1   2.8   7.3   8.4
B      2.4   2.8   1.9   2.0   0.9   0.3   3.6   1.8   5.8   7.3

Diff.  3.3   0.4  −0.1   0.3   1.2   0.6  −0.5   1.0   1.5   1.1

Show that if the ten nights on which route A was used had been drawn at random from the twenty nights available, the variance of the mean difference would have been about 8 times as high as with this pairing.
EXAMPLE 4.11.3-If pairing has not reduced the variance, so that σ_D² = 2σ², show that allowance for the error df by Fisher's rule makes pairing 15% less effective than independent groups when n = 5 and 9% less effective when n = 10. In small experiments, pairing is inadvisable unless a sizeable reduction in variance is expected.

4.12-Precautions against bias: randomization. With either independent or paired samples, the analysis assumes that the difference (X̄1 − X̄2) is an unbiased estimate of the population mean difference between the two treatments. Unless precautions are taken when conducting an experiment, (X̄1 − X̄2) may be subject to a bias of unknown amount that makes the conclusion false. Corner (14) describes an example in which, when picking rabbits out of a hutch, one worker tended to pick large rabbits, another to pick small rabbits, although neither was
aware of his personal bias. If the rabbits for treatment A are picked out
first, a bias will be introduced if the final response depends on the weight
of the rabbit. If the animals receiving treatment A are kept in one cage
and those having B in another, temperature, draftiness, or sources of in-
fection in one cage may affect all the animals receiving A differently
from those receiving B. When the application of the treatment or the
measurement of response takes considerable time, unsuspected time trends
may be present, producing bias if all replicates of treatment A are pro-
cessed first. The investigator must be constantly on guard against such
sources of bias.
One helpful device, now commonly used, is randomization. When
pairs have been formed, the decision as to which member of a pair re-
ceives treatment A is made by tossing a coin or by using a table of random
numbers. If the random number drawn is odd, the first member of the
pair will receive treatment A. With 10 pairs, we draw 10 random digits from table A 1, say 9, 8, 0, 1, 8, 3, 6, 8, 0, 3. In pairs 1, 4, 6, and 10, treatment A is given to the first member of the pair and B to the second member. In the remaining pairs, the first member receives B.
With independent samples, random numbers are used to divide the 2n subjects into two groups of n. Number the subjects in any order from 1 to 2n. Proceed down a column of random numbers, allotting the subject to A if the number is odd, to B if even, continuing until n A's or n B's have been allotted. With 14 subjects and the same random numbers as above, subjects 1, 4, 6, and 10 receive A and subjects 2, 3, 5, 7, 8, and 9 receive B. This uses up the first ten digits and assigns only four A's and six B's, so that more random numbers must be drawn. The next two in the column are 1, 8. Subject 11 gets A and subject 12 gets B. Since seven B's have been assigned we stop, giving A to subjects 13 and 14.
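Both allocations are easy to mechanize. The sketch below is a rough Python analogue (our assumption, not from the text) that uses the computer's random digits in place of a printed table such as table A 1.

    # Sketch (not from the text): randomization by odd/even random digits.
    import random

    random.seed(7)   # arbitrary seed so the run can be repeated

    # Paired samples: an odd digit gives treatment A to the first member.
    digits = [random.randrange(10) for _ in range(10)]
    first_member = ['A' if d % 2 == 1 else 'B' for d in digits]

    # Independent samples: odd -> A, even -> B, until one group of n is full;
    # the remaining subjects then all receive the other treatment.
    n = 7
    labels = []
    while labels.count('A') < n and labels.count('B') < n:
        labels.append('A' if random.randrange(10) % 2 == 1 else 'B')
    short = 'A' if labels.count('A') < n else 'B'
    labels += [short] * (2 * n - len(labels))

    print(first_member)
    print(labels)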
Randomization gives each treatment an equal chance of being allotted to any subject that happens to give an unusually good or unusually poor response, exactly as assumed in the theory of probability on which the statistical analysis is based. Randomization does not guarantee to balance out the natural differences between the members of a pair exactly. With n pairs, there is a small probability, 1/2^(n−1), that one treatment will be assigned to the superior member in every pair. With 10 pairs this probability is about 0.002. If the experimenter can predict which is likely to be the superior member in each pair, he should try a more sophisticated design (chapter 11) that utilizes this information more effectively than randomization. Randomization serves primarily to protect against sources of bias that are unsuspected. Randomization can be used not merely in the allocation of treatments to subjects, but at any later stage in which it may be a safeguard against bias, as discussed in (11), (13).
Both independent and paired samples are much used in comparisons made from surveys. The problem of avoiding misleading conclusions is formidable with survey data (15). Suppose we tried to learn something about the value of completing a high school education by comparing, some years later, the incomes, job satisfaction, and general well-being of a group of boys who completed high school with a group from the same schools who started but did not finish. Obviously, significant differences found between the sample means may be due to factors other than the completion of high school in itself: differences in the natural ability and personal characteristics of the boys, in the parents' economic level and number of useful contacts, and so on. Pairing the subjects on their school performance and parents' economic level helps, but no randomization within pairs is possible, and a significant mean difference may still be due to extraneous factors whose influence has been overlooked.
Remember that a significant t-value is evidence that the population means differ. Popular accounts are sometimes written as if a significant t implies that every member of population 1 is superior to every member of population 2: "The oldest child in the family achieves more in science or in business." In fact, the two populations may largely overlap even though t is significant.

4.13-Sample size in comparative experiments. In planning an experiment to compare two treatments, the following method is often used to estimate the size of sample needed. The investigator first decides on a value δ which represents the size of difference between the true effects of the treatments that he regards as important. If the true difference is as large as δ, he would like the experiment to have a high probability of showing a statistically significant difference between the treatment means. Probabilities of 0.80 and 0.90 are common. A higher probability, say 0.95 or 0.99, can be set, but the sample size required to meet these severer specifications is often too expensive.
This way of stating the aims in planning the sample size is particularly appropriate when (i) the treatments are a standard treatment and a new treatment that the experimenter hopes will be better than the standard, and (ii) he intends to discard the new treatment if the experiment does not show it to be significantly superior to the standard. In these circumstances he does not mind dropping the new treatment if it is at most only slightly better than the standard, but he does not want to drop it, on the evidence of the experiment, if it is substantially superior. The value of δ measures his idea of a substantial true difference.
In order to make the calculation the experimenter supplies:
1. the value of δ,
2. the desired probability P′ of obtaining a significant result if the true difference is δ,
3. the significance level α of the test, which may be either one-tailed or two-tailed.
Consider paired samples. Assume at first that σ_D is known and that the test is two-tailed. In our specification, the observed mean difference D̄ = X̄1 − X̄2 is normally distributed about δ with standard deviation σ_D/√n. This distribution is shown in figure 4.13.1, which forms the basis of our explanation. We have assumed δ > 0.

z. CT /..[fi
O
8
8 - Z2(1_ p', CTO /.fii
Fki. 4. 13. 1- Frequcncy distributIOn of the mean difference D between t\lo'O treatments.

In order to be statistically significant, D̄ must exceed Z_α σ_D/√n, where Z_α is the normal deviate corresponding to the two-tailed significance level α. (For α = 0.01, 0.05, 0.10, the values of Z_α are 2.576, 1.960, and 1.645, respectively.) The vertical line in figure 4.13.1 shows this critical value.
In our specification, the probability that D̄ exceeds this value must be P′. That is, this value divides the frequency distribution of D̄ into an area P′ on the right and (1 − P′) on the left. Consider the standard normal curve, with mean 0 and S.D. 1. With P′ > 1/2, the point at which the area on the left is (1 − P′) is minus the normal deviate corresponding to a one-tailed significance level (1 − P′). This is the same as minus the normal deviate corresponding to a two-tailed significance level 2(1 − P′), or in our notation to −Z_{2(1−P′)}. For instance, with P′ = 0.9, this is the normal deviate −Z_{0.2}, or −1.282.
Since D̄ has mean δ and S.D. σ_D/√n, the quantity (D̄ − δ)/(σ_D/√n) follows the standard normal curve. Hence, the value of D̄ that is exceeded with probability P′ is given by the equation
(D̄ − δ)/(σ_D/√n) = −Z_{2(1−P′)}

or,

D̄ = δ − Z_{2(1−P′)} σ_D/√n

It follows that our specification is satisfied if

δ − Z_{2(1−P′)} σ_D/√n = Z_α σ_D/√n

A look at figure 4.13.1 may help at this point. Write β = 2(1 − P′) and solve for n,

n = (Z_α + Z_β)² σ_D²/δ²     (4.13.1)

To illustrate, for a one-tailed test at the 5% level with P′ = 0.90, we have Z_α = 1.645, Z_β = 1.282, giving n = 8.6σ_D²/δ². Note that n is the size of each sample, the total number of observations being 2n.
Formula (4.13.1) for n remains the same for independent samples, except that σ_D² is replaced by 2σ².
The two-tailed case involves a slight approximation. In a two-tailed test, D̄ in figure 4.13.1 is also significant if it is less than −Z_α σ_D/√n. But with δ positive, the probability that this happens is negligible in most practical situations.
Table 4.13.1 presents the multipliers (Z_α + Z_β)² that are most frequently used.
When "D and" are estimated from the results of the experiment.
I-tests replace the normal deviate tests, The logical hasis of the argument
remains the same. but the formula for 11 becomes an integral equation in
calculus that must be solved by successive approximation. This equa-
tion was' given by Neyman (21) to whom this method of determining
sample size is due.
For practical purposes, the following approximation agrees well
enough with the values of 11 as found from Neymari's solution:
I. Find", to one decimal place by table 4.13.1.

TABLE 4.13.1
MULTIPLIERS OF σ_D²/δ² IN PAIRED SAMPLES, AND OF 2σ²/δ² IN INDEPENDENT SAMPLES,
IN ORDER TO DETERMINE THE SIZE OF EACH SAMPLE

           Two-tailed Tests              One-tailed Tests
              Level                          Level
 P′      0.01    0.05    0.10          0.01    0.05    0.10

0.80     11.7     7.9     6.2          10.0     6.2     4.5
0.90     14.9    10.5     8.6          13.0     8.6     6.6
0.95     17.8    13.0    10.8          15.8    10.8     8.6
II" Chapter '"~ The Comp.isoII of Two 5<Jmp/.s
2. Calculate f, the number of degrees of freedom supplied by an experiment of this size (rounding n1 upwards for this step).
3. Multiply n1 in step 1 by (f + 3)/(f + 1).
To illustrate, suppose that a 10% difference δ is regarded as important and that P′ = 0.80 in a two-tailed 5% test of significance. The samples are to be independent, and past experience has shown that σ is about 6%. The multiplier for P′ = 0.80 and a 5% two-tailed test in table 4.13.1 is 7.9. Since 2σ²/δ² = 72/100 = 0.72, n1 = (7.9)(0.72) = 5.7. With a sample size of 6 in each group, f = 10. Hence we take n = (13)(5.7)/11 = 6.7, which we round up to 7.
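The three steps fit naturally into a small function. The sketch below is ours, not the authors'; `sample_size` is a hypothetical helper name, and the normal deviates are simply hard-coded from the text.

    # Sketch (not from the text) of formula 4.13.1 plus the three-step
    # t-adjustment; z_alpha and z_beta are the normal deviates of the text.
    import math

    def sample_size(delta, var, z_alpha, z_beta, paired):
        """var = sigma_D^2 for paired samples, sigma^2 for independent groups."""
        base = var if paired else 2.0 * var
        n1 = (z_alpha + z_beta) ** 2 * base / delta ** 2        # step 1
        n_rounded = math.ceil(n1)                               # round upwards
        f = (n_rounded - 1) if paired else 2 * (n_rounded - 1)  # step 2: error df
        return n1 * (f + 3) / (f + 1)                           # step 3

    # Two-tailed 5% test, P' = 0.80, independent groups, sigma = 6%, delta = 10%:
    print(sample_size(10, 36, 1.960, 0.842, paired=False))      # about 6.7 -> use 7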
Note that the experimenter must still guess a value of σ_D or σ. Usually it is easier to guess σ. If pairing is to be used but is expected to be only moderately effective, take σ_D = √2 σ, reducing this value if something more definite is known about the effectiveness of pairing. This uncertainty is the chief source of inaccuracy in the process.
The preceding method is designed to protect the investigator against finding a non-significant result, and consequently dropping a new treatment that is actually effective, because his experiment was too small. The method is therefore most useful in the early stages of a line of work. At later stages, when something has been learned about the sizes of differences produced by new treatments, we may wish to specify the size of the standard error or the half-width of the confidence interval that will be attached to an estimated difference.
For example, previous small experiments have indicated that a new treatment gives an increase of around 20%, and σ is around 7%. The investigator would like to estimate this increase, in his next experiment, with a standard error of ±2%. He sets √2(7)/√n = 2, giving n = 25 in each group. This type of rough calculation is often helpful in later work.
EXAMPLE 4.13.1-In table 4.13.1, verify the multipliers given for a one-tailed test at the 1% level with P′ = 0.90 and for a two-tailed test at the 10% level with P′ = 0.80.

EXAMPLE 4.13.2-In planning a paired experiment, the investigator proposes to use a one-tailed test of significance at the 5% level, and wants the probability of finding a significant difference to be 0.90 if (i) δ = 10%, (ii) δ = 5%. How many pairs does he need? In each case, give the answer if (a) σ_D is known to be 12%, (b) σ_D is guessed as 12%, but a t-test will be used in the experiment. Ans. (ia) 13, (ib) 15, (iia) 50, (iib) 52.

EXAMPLE 4.13.3-In the previous example, how many pairs would you guess to be necessary if δ = 2.5%? The answer brings out the difficulty of detecting small differences in comparative experiments with variable data.

EXAMPLE 4.13.4-If σ_D = 5, how many pairs are needed to make the half-width of the 90% confidence interval for the difference between the two population means equal to 2? Ans. n = 17.

4.14-Analysis of independent samples when σ1 ≠ σ2. The ordinary method of finding confidence limits and making tests of significance for the difference between the means of two independent samples assumes that the two population variances are the same. Common situations in which the assumption is suspect are as follows:
(1) When the samples come from populations of different types, as in comparisons made from survey data. In comparing the average values of some characteristic of boys from public and private schools, we might expect, from our knowledge of the differences in the two kinds of schools, that the variances will not be the same.
(2) When computing confidence limits in cases in which the population means are obviously widely different. The frequently found result that σ tends to change, although slowly, when μ changes would make us hesitant to assume σ1 = σ2.
(3) With samples from populations that are markedly skew. In many such populations the relation between σ and μ is often relatively strong.
, When ", '" ",. the formula for the variance of (X, - X,) in inde-
pendent samples still holds, namely.

The two samples furnish unbiased estimates S1


2
of a/ and 52
2
of 0/
Consequently; the ordinary t is replaced by the quantity
t' = (X, - X,lIJ(-', '/11, + ','/11,)

This quantity does not follow Student's t-distribution when μ1 = μ2. Two different forms of the distribution of t′, arising from different theoretical backgrounds, have been worked out, one due to Behrens (16) and Fisher (17), the other to Welch and Aspin (18), (22). Both require special tables, given in the references. The tables differ relatively little, the Behrens-Fisher table being on the whole more conservative, in the sense that slightly higher values of t′ are required for significance. The following approximation due to Cochran (19), which uses the ordinary t-table, is sufficiently accurate for our purposes. It is usually slightly more conservative than the Behrens-Fisher solution.
Case 1: n1 = n2. With n1 = n2 = n, the variance in the denominator of t′ is (s1² + s2²)/n. But this is just 2s²/n, where s² is the pooled variance. Thus, in this case, t′ = t. The rule is: calculate t in the usual way, but give it (n − 1) df instead of 2(n − 1).
Case 2: n1 ≠ n2. Calculate t′. To find its significance level, look up the significance levels of t in table A 4 for (n1 − 1) and (n2 − 1) df. Call these values t1 and t2. The significance level of t′ is, approximately, (w1t1 + w2t2)/(w1 + w2), where w1 = s1²/n1 and w2 = s2²/n2.
The following artificial example illustrates the calculations. A quick but imprecise method of estimating the concentration of a chemical in a vat has been developed. Eight samples from the vat are analyzed, as well as four samples by the standard method, which is precise but slow. In comparing the means we are examining whether the quick method gives a systematic over- or underestimate. Table 4.14.1 gives the computations.
TABLE 4.14.1
A TEST OF (X̄1 − X̄2) WHEN σ1 ≠ σ2.
CONCENTRATION OF A CHEMICAL BY TWO METHODS

      Standard    Quick

         25         23
         24         18
         25         22
         26         28
                    17
                    25
                    19
                    16

X̄1 = 25             X̄2 = 21
n1 = 4              n2 = 8
s1² = 0.67          s2² = 17.71
s1²/n1 = 0.17       s2²/n2 = 2.21

t′ = 4/√2.38 = 2.60
t1 (3 df) = 3.182,  t2 (7 df) = 2.365
5% level of t′ = [(0.17)(3.182) + (2.21)(2.365)]/2.38 = 2.42

Since 2.60 > 2.42, the difference is significant at the 5% level; the quick method appears to underestimate.
Approximate 95% confidence limits for (μ1 − μ2) are

X̄1 − X̄2 ± t′(0.05) s(X̄1−X̄2)

or in this example, 4 ± (2.42)(1.54) = 4 ± 3.7.
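For readers who want to check table 4.14.1, here is a minimal Python sketch (ours, not the book's); the 5% t-values 3.182 and 2.365 are copied in from the text rather than computed.

    # Sketch (not from the text) of the approximate t' test of table 4.14.1.
    from statistics import mean, variance

    standard = [25, 24, 25, 26]
    quick = [23, 18, 22, 28, 17, 25, 19, 16]

    w1 = variance(standard) / len(standard)   # s1^2/n1 = 0.17
    w2 = variance(quick) / len(quick)         # s2^2/n2 = 2.21
    t_prime = (mean(standard) - mean(quick)) / (w1 + w2) ** 0.5   # 2.60

    t1, t2 = 3.182, 2.365                     # 5% t for 3 and 7 df (from the text)
    critical = (w1 * t1 + w2 * t2) / (w1 + w2)                    # 2.42
    print(round(t_prime, 2), round(critical, 2), t_prime > critical)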


The ordinary Hest with a pooled s' gives t = 1.84, to which we would
erroneously attribute 10 df. The Hest tends to give too few significant
results when the larger sample has the larger variance, as in this example,
and too many when' the larger sample has the smaller variance.
Sometimes, when it seemed reasonable to assume that σ1 = σ2, or when the investigator failed to think about the question in advance, he notices that s1² and s2² are distinctly different. A test of the null hypothesis that σ1 = σ2, given in the next section, is useful. If the null hypothesis is rejected, the origin of the data should be re-examined. This may reveal some cause for expecting the standard deviations to be different. In case of doubt it is better to avoid the assumption that σ1 = σ2.
4.15-A test of the equality of two variances. The null hypothesis is that s1² and s2² are independent random samples from normal populations with the same variance σ². In situations in which there is no prior reason to anticipate inequality of variance, the alternative is a two-sided one: σ1 ≠ σ2. The test criterion is F = s1²/s2², where s1² is the larger mean square. The distribution of F when the null hypothesis is true was worked out by Fisher (20) early in the 1920's. Like χ² and t, it is one of the basic distributions in modern statistical methods. A condensed two-tailed table of the 5% significance levels of F is table 4.15.1.

TABLE 4.15.1
5% LEVEL (TWO-TAILED) OF THE DISTRIBUTION OF F

f2 = df. for              f1 = df. for Larger Mean Square
Smaller Mean
Square        2      4      6      8     10     12     15     20     30      ∞

   2       39.00  39.25  39.33  39.37  39.40  39.41  39.43  39.45  39.46  39.50
   3       16.04  15.10  14.74  14.54  14.42  14.34  14.25  14.17  14.08  13.90
   4       10.65   9.60   9.20   8.98   8.84   8.75   8.66   8.56   8.46   8.26
   5        8.43   7.39   6.98   6.76   6.62   6.52   6.43   6.33   6.23   6.02
   6        7.26   6.23   5.82   5.60   5.46   5.37   5.27   5.17   5.07   4.85
   7        6.54   5.52   5.12   4.90   4.76   4.67   4.57   4.47   4.36   4.14
   8        6.06   5.05   4.65   4.43   4.30   4.20   4.10   4.00   3.89   3.67
   9        5.71   4.72   4.32   4.10   3.96   3.87   3.77   3.67   3.56   3.33
  10        5.46   4.47   4.07   3.85   3.72   3.62   3.52   3.42   3.31   3.08
  12        5.10   4.12   3.73   3.51   3.37   3.28   3.18   3.07   2.96   2.72
  15        4.76   3.80   3.41   3.20   3.06   2.96   2.86   2.76   2.64   2.40
  20        4.46   3.51   3.13   2.91   2.77   2.68   2.57   2.46   2.35   2.09
  30        4.18   3.25   2.87   2.65   2.51   2.41   2.31   2.20   2.07   1.79
   ∞        3.69   2.79   2.41   2.19   2.05   1.94   1.83   1.71   1.57   1.00
Use of the table is illustrated by the bee data in example 4.9.1. Bees fed a 65% concentration of syrup showed a mean decrease in concentration of 1.9%, with s1² = 0.589, while bees fed a 20% concentration gave a mean decrease of 0.5% with s2² = 0.027. Each mean square has 9 df. Hence

F = 0.589/0.027 = 22.1

In the row for 9 df and the column for 9 df (interpolated between 8 and 10) the 5% level of F is 4.03. The null hypothesis is rejected. No clear explanation of the discrepancy in variances was found, except that it may reflect the association of a smaller variance with a smaller mean. The difference between the means is strongly significant whether the variances are assumed the same or not.
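With a computer at hand, the two-tailed probability can be found directly instead of from the condensed table. The sketch below is our assumption, not the book's method; it uses scipy's F distribution and doubles the upper-tail area.

    # Sketch (not from the text): two-tailed variance-ratio test, bee data.
    from scipy.stats import f

    s1_sq, s2_sq, df1, df2 = 0.589, 0.027, 9, 9   # larger mean square on top
    F = s1_sq / s2_sq
    P = 2 * f.sf(F, df1, df2)                      # two-tailed: double the tail area
    print(round(F, 1), P)                          # P is far below 0.05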
Often a one-tailed test is wanted, because we know, in advance of seeing the data, which population will have the higher variance if the null hypothesis is untrue. The numerator of F is s1² if σ1 > σ2 is the alternative, and s2² if σ2 > σ1 is the alternative. Table A 14 presents one-tailed levels of F directly.
EXAMPLE 4.15.1-Young examined the basal metabolism of 26 college women in two groups of n1 = 15 and n2 = 11; X̄1 = 34.45 and X̄2 = 33.57 cal./sq. m./hr.; Σx1² = 69.36, Σx2² = 13.66. Test H0: σ1 = σ2. Ans. F = 3.62, to be compared with F0.05 = 3.55. (Data from Ph.D. thesis, Iowa State University, 1940.)

BASAL METABOLISM OF 26 COLLEGE WOMEN
(Calories per square meter per hour)

7 or More Hours of Sleep          6 or Less Hours of Sleep

 1. 35.3      9. 33.3              1. 32.5      7. 34.6
 2. 35.9     10. 33.6              2. 34.0      8. 33.5
 3. 37.2     11. 37.9              3. 34.4      9. 33.6
 4. 33.0     12. 35.6              4. 31.8     10. 31.5
 5. 31.9     13. 29.0              5. 35.0     11. 33.8
 6. 33.7     14. 33.7              6. 34.6
 7. 36.0     15. 35.7             ΣX2 = 369.3
 8. 35.0     ΣX1 = 516.8

X̄1 = 34.45 cal./sq. m./hr.        X̄2 = 33.57 cal./sq. m./hr.

EXAMPLE 4.15.2-In the metabolism data there is little difference between the group means, and the difference in variances can hardly reflect a correlation between variance and mean. It might arise from non-random sampling, since the subjects are volunteers, or it could be due to chance, since F is scarcely beyond the 5% level. As an exercise, test the difference between the means (i) without assuming σ1 = σ2, (ii) making this assumption. Ans. (i) t′ = 1.31, t′0.05 = 2.17; (ii) t = 1.19, t0.05 = 2.064. There is no difference in the conclusions.

EXAMPLE 4.15.3-In the preceding example, show that 95% confidence limits for μ1 − μ2 are −0.58 and 2.34 if we do not assume σ1 = σ2, and −0.63 and 2.39 if this assumption is made.

EXAMPLE 4.15.4-If you wanted to test the null hypothesis σ1 = σ2 from the data in table 4.14.1, would you use a one-tailed or a two-tailed test?
REFERENCES
1. W. J. YOUDEN and H. P. BEALE. Contr. Boyce Thompson Inst., 6:431 (1934).
2. L. C. GROVE. Iowa Agric. Exp. Sta. Bul., 253 (1939).
3. H. H. MITCHELL, W. BURROUGHS, and J. R. BEADLES. J. Nutrition, 11:257 (1936).
4. E. W. CRAMPTON. J. Nutrition, 7:305 (1934).
5. W. R. BRENEMAN. Personal communication.
6. O. W. PARK. Iowa Agric. Exp. Sta. Bul., 151 (1932).
7. H. L. DEAN and R. H. WALKER. J. Amer. Soc. Agron., 27:433 (1935).
8. S. N. SMITH. J. Amer. Soc. Agron., 26:192 (1934).
9. P. B. PEARSON and H. R. CATCHPOLE. Amer. J. Physiol., 115:90 (1936).
10. P. P. SWANSON and A. H. SMITH. J. Biol. Chem., 97:745 (1932).
11. W. G. COCHRAN and G. M. COX. Experimental Designs. 2nd ed. Wiley, New York (1957).
12. R. A. FISHER. The Design of Experiments. 7th ed. Oliver and Boyd, Edinburgh (1960).
13. D. R. COX. Planning of Experiments. Wiley, New York (1958).
14. G. W. CORNER. The Hormones in Human Reproduction. Princeton University Press (1943).
15. F. S. CHAPIN. Experimental Designs in Sociological Research. Harper, New York (1947).
16. W. V. BEHRENS. Landwirtschaftliche Jahrbücher, 68:807 (1929).
17. R. A. FISHER and F. YATES. Statistical Tables. 5th ed., Tables VI, VI1, and VI2. Oliver and Boyd, Edinburgh (1957).
18. A. A. ASPIN. Biometrika, 36:290 (1949).
19. W. G. COCHRAN. Biometrics, 20:191 (1964).
20. R. A. FISHER. Proc. Int. Math. Conf., Toronto, 805 (1924).
21. J. NEYMAN, K. IWASZKIEWICZ, and ST. KOLODZIEJCZYK. J. R. Statist. Soc., Suppl. 2:114 (1935).
22. W. H. TRICKETT, B. L. WELCH, and G. S. JAMES. Biometrika, 43:203 (1956).
CHAPTER FIVE

Shortcut and non-parametric methods

5.1-Introduction. In the preceding chapter you learned how to compare the means of two samples, paired or independent. The present chapter takes up several topics related to the same problem. For some years there has been continued activity in developing rapid and easy methods for dealing with samples from normal populations. In small samples, we saw that the range, as a substitute for the sample standard deviation, has remarkably high efficiency as compared to s. In section 5.2 a method will be described for comparing the means of two samples, using the range in place of s. Often this test, which is quickly made, leads to definite conclusions, so that there is no necessity to compute Student's t. This range test may also be employed as a rough check when there is doubt whether t has been computed correctly.
To this point the normal distribution has been taken as the source of most of our sampling. Fortunately, the statistical methods described are also effective for moderately non-normal populations. But there is much current interest in finding methods that work well for a wide variety of populations. Such methods, sometimes called distribution-free methods, are needed when sampling from populations that are far from normal. They are useful also, particularly in exploratory research, when the investigator does not know much about the type of distribution being sampled. The best-known procedures of this type are described in sections 5.3 to 5.7.
5.2-The t-test based on range. Lord (3) has developed an alternative to the t-test in which the range replaces s_X̄ in the denominator of t. This test is used in the same way as t for testing a hypothesis or making interval estimates. Pillai (4) has shown that for interval estimates the efficiency of this procedure relative to t stays above 95% in samples up to n = 20. Like t, the range test assumes a normal distribution. It has become popular, particularly in industrial work.
Table A 7 (i) applies to single samples or to a set of differences obtained from paired samples. The entries are the values of (X̄ − μ)/w, where w denotes the range of the sample. This ratio will be called t_w, since it plays the role of t.
For an illustration of the setting of confidence intervals by means of Lord's table, we use the vitamin C data from chapter 2. The sample values were 16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25, with X̄ = 20. We find w = 29 − 13 = 16 mg./100 gm., with n = 17. Table A 7 (i) has the entry 0.144 in the column headed 0.05 and the row for n = 17. The probability that |t_w| ≤ 0.144 is 0.95 in random samples of n = 17 from a normally distributed population. The 95% confidence interval for μ is fixed by the inequalities

X̄ − t_w w ≤ μ ≤ X̄ + t_w w

Substituting the vitamin C data,

20 − (0.144)(16) ≤ μ ≤ 20 + (0.144)(16)
17.7 ≤ μ ≤ 22.3 mg./100 gm.

This is to be compared with the slightly narrower interval 17.95 ≤ μ ≤ 22.05 based on s.
The test of a null hypothesis by means of t_w is illustrated by the paired samples in chapter 4 showing the numbers of lesions on the two halves of tobacco leaves under two preparations of virus. The eight differences between the halves were 13, 3, 4, 6, −1, 1, 5, 1. Here the mean difference D̄ = 4, while w = 14 and n = 8. For the null hypothesis that the two preparations produce on the average equal numbers of lesions,

t_w = D̄/w = 4/14 = 0.286,

which is practically at the 5% level (0.288). The ordinary t-test gave a significance probability of about 4%.
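Both of the t_w illustrations above can be verified with a few lines of code. In the sketch below (ours, not the book's), the critical values 0.144 and 0.288 are copied from table A 7 (i) rather than computed, and `t_w` is a hypothetical helper name.

    # Sketch (not from the text) of Lord's range statistic t_w.
    def t_w(values, mu0=0.0):
        """(sample mean - mu0) / sample range, referred to table A 7 (i)."""
        w = max(values) - min(values)
        return (sum(values) / len(values) - mu0) / w, w

    vit_c = [16, 22, 21, 20, 23, 21, 19, 15, 13, 23, 17, 20, 29, 18, 22, 16, 25]
    tw, w = t_w(vit_c)
    xbar = sum(vit_c) / len(vit_c)
    print(xbar - 0.144 * w, xbar + 0.144 * w)   # 95% limits: 17.7 and 22.3

    diffs = [13, 3, 4, 6, -1, 1, 5, 1]          # virus-lesion differences
    print(round(t_w(diffs)[0], 3))              # 0.286, just short of 0.288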
Table A 7 (ii) applies to two independent samples of equal size. The mean of the two ranges, w̄ = (w1 + w2)/2, replaces the w of the preceding paragraphs, and X̄1 − X̄2 takes the place of D̄.
The test of significance will be applied to the numbers of worms found in two samples of five rats, one sample treated previously by a worm killer.
TABLE 5.2.1
NUMBER OF WORMS PER RAT

             Treated    Untreated

               123         378
               143         275
               192         412
                40         265
               259         286

Means, X̄     151.4       323.2
Ranges, w      219         147

We have X̄2 − X̄1 = 171.8 and w̄ = (219 + 147)/2 = 183. From this, t_w = (X̄2 − X̄1)/w̄ = 171.8/183 = 0.939, which is beyond the 1% point, 0.896, shown in table A 7 (ii) for n = 5.
To find 95% confidence limits for the reduction in number of worms per rat due to the treatment, we use the formula

(X̄2 − X̄1) − t_w w̄ ≤ μ2 − μ1 ≤ (X̄2 − X̄1) + t_w w̄
171.8 − (0.613)(183) ≤ μ2 − μ1 ≤ 171.8 + (0.613)(183)
60 ≤ μ2 − μ1 ≤ 284

The confidence interval is wide, owing both to the small sample sizes and the high variability from rat to rat. Student's t, used in example 4.9.3 for these data, gave closely similar results both for the significance level and the confidence limits.
For two independent samples of unequal sizes, Moore (1) has given tables for the 10%, 5%, 2%, and 1% levels of Lord's test to cover all cases in which the sample sizes n1 and n2 are both 20 or less.
The range method can also be used when the sample size exceeds 20. With two samples each of size 24, for example, each sample may be divided at random into two groups of size 12. The range is found for each group, and the average of the four ranges is taken. Lord (3) gives the necessary tables. This device keeps the efficiency of the range test high for samples greater than 20, though the calculation takes a little longer.
To summarize, the range test is convenient for normal samples if a 5% to 10% loss in information can be tolerated. It is much used when many routine tests of significance or calculations of confidence limits have to be made. It is more sensitive than t to skewness in the population and to the appearance of gross errors.
EXAMPLE 5.2.1-In a previous example the differences in the serum albumen found by two methods A and B in eight blood samples were: 0.6, 0.7, 0.8, 0.9, 0.3, 0.5, −0.5, 1.3 gm. per 100 ml. Apply the range method to test the null hypothesis that there is no consistent difference in the amount of serum albumen found by the two methods. Ans. t_w = 0.32, P < 0.05.
EXAMPLE 5.2.2-In this example, given by Lord (3), the data are the times taken for an aqueous solution of glycerol to fall between two fixed marks. In five independent determinations in a viscometer, these times were 103.5, 104.1, 102.7, 103.2, and 102.6 seconds. For satisfactory calibration of the viscometer, the mean time should be accurate to within ±1/2 sec., apart from a 1-in-20 chance. By finding the half-width of the 95% confidence interval for μ by (i) the t_w method, and (ii) the t method, verify whether this requirement is satisfied. Ans. No. Both methods give ±0.76 for the half-width.
EXAMPLE 5.2.3-In 15 kernels of corn the crushing resistance of the kernels, in pounds, ranged from 25 to 65 with a mean of 43.0. Another sample of 15 kernels, harvested at a different stage, ranged from 29 to 67 with a mean of 48.0. Test whether the difference between the means is significant. Ans. No, t_w = 0.128. Note that since the ranges of the two samples indicate much overlap, one could guess that the test will not show a significant difference.
5.3-Median, percentiles, and order statistics. The median of a population has the property that half the values in the population exceed it and half fall short of it. To estimate the median from a sample, arrange the observations in increasing order. When the sample values are arranged in this way, they are often called the 1st, 2nd, 3rd, ... order statistics. If the sample size n is odd, the sample median is the middle term in this array. For example, the median of the observations 5, 1, 8, 3, 4 is 4. In general (n odd), the median is the order statistic whose number is (n + 1)/2. With n even, there is no middle term, and the median is defined as the average of the order statistics whose numbers are n/2 and (n + 2)/2. The median of the observations 1, 3, 4, 5, 7, 8 is 4.5.
Like the mean, the median is a measure of the middle of a distribution. If the distribution is symmetrical about its mean, the mean and the median coincide. With highly skewed distributions like that of income per family or annual sales of firms, the median is often reported, because it seems to represent people's concept of an average better than the mean. This point can be illustrated with small samples. As we saw, the median of the observations 1, 3, 4, 5, 8 is 4, while the mean is 4.2. If the sample values become 1, 3, 4, 5, 24, where the 24 simulates the introduction of a wealthy family or a large firm, the median is still 4, but the mean is 7.4. Four of the five sample values now fall short of the mean, while only one exceeds it. Similarly, in the distribution of incomes per family in a country, it is not unusual to find that 65% of families have incomes below the mean, with only 35% above it. In this sense, the mean does not seem a good indicator of the middle of the distribution. Further, the sample median in our small sample is still 4 even if we do not know the value of the highest observation, but merely that it is very large. With this sample, the mean cannot be calculated at all.
The calculation of the median from a large sample is illustrated from the data in table 5.3.1. This shows, for 179 records of cows, the number of days between calving and the resumption of the oestrus cycle (16). Many of the records are repeated observations from successive calvings of the same cow. This raises doubts about the conclusions drawn, but the data are intended merely for illustration.

TABLE 5.3.1
DISTRIBUTION OF NUMBER OF DAYS FROM CALVING TO FIRST SUBSEQUENT OESTRUS
FOR A HOLSTEIN-FRIESIAN HERD IN WISCONSIN

Class limits (days): 0.5-20.5, 20.5-40.5, 40.5-60.5, 60.5-80.5, 80.5-100.5,
100.5-120.5, 120.5-140.5, 140.5-160.5, 160.5-180.5, 180.5-200.5, 200.5-220.5

[Most of the frequency and cumulative-frequency entries are illegible in this copy. The entries used in the text are: cumulative frequency 41 below 40.5 days; frequency 50 in the 40.5-60.5 class (cumulative 91); frequency 32 in the 60.5-80.5 class; frequency 11 in the 120.5-140.5 class, with cumulative frequency 158 below 120.5 days; total n = 179.]
The frequency rises to a peak in the class from 40.5 days to 60.5 days. The day corresponding to the greatest frequency was called the mode by Karl Pearson. There is a secondary mode in the class from 100.5 to 120.5 days. This bimodal feature, as well as the skewness, emphasizes the non-normality of the distribution.
Since n = 179, the sample median is the order statistic that is 90th from the bottom. To find this, cumulate the frequencies as shown in the table until a cumulated frequency higher than 90 is reached, in this case 91. It is clear that the median is very close to the top of the 40.5-60.5 days class. The median is found by interpolation. Assuming that the 50 observations in this class are evenly distributed between 40.5 and 60.5 days, the median is 49/50 of the way along the interval from 40.5 days to 60.5 days. The general formula is
The general formula is
gl
M = X, . + --.
f (5.3.1)

where

XL = value of X at lower limit of the class containing the median


= 40.5 days
9 = order statistic number of the median minus cumulative fre-
quency up to the upper limit of previous class = 9O - 41 = 49
J = class interval = 20 days
f = frequency in class containing the median = 50
This gives
. (49(20)
M = MedIan = 40.5 + ---so
= 60 days
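Formula 5.3.1 is just linear interpolation within the class. The short sketch below is ours; `grouped_order_stat` is a hypothetical name, and only the class values used in the text are assumed.

    # Sketch (not from the text) of formula 5.3.1.
    def grouped_order_stat(number, x_lower, cum_below, freq, interval):
        """Interpolated value of the order statistic with the given number."""
        g = number - cum_below
        return x_lower + g * interval / freq

    # Median of the 179 cow records: 90th order statistic, class 40.5-60.5 days.
    print(grouped_order_stat(90, 40.5, 41, 50, 20))   # 60.1 days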

The mean of the distribution turns out to be 69.9 days, considerably higher than the median because of the long positive tail.
In large samples of size n from a normal distribution (6), the sample median becomes normally distributed about the population median with standard error 1.253σ/√n. For this distribution, in which the sample mean and median are estimates of the same quantity, the median is less accurate than the mean. As we have stated, however, the chief application of the median lies in non-normal distributions.
There is a simple method of calculating confidence limits for the population median that is valid for any continuous distribution. Two of the order statistics serve as the upper and lower confidence limits. These are the order statistics whose numbers are, approximately (7),

(n + 1)/2 ± z√n/2     (5.3.2)
125
where z is the normal deviate corresponding to the desired confidence
probability. for the sample of cows, using 95% confidence probability,
z'" 2 and these numbers are 90 ± .J179 = 77 and 103. The 95% confi-
dence limits are the numbers of days corresponding to the 77th and the
103rd order statistics. The actual numbers of days are found by adapting
formula 5.3.1 for the median.
(36)(20)
for 77: No. of days = 40.5 + 50 = 55 days

(12)(20)
For 103: No. of days = 60.5 + 32 = 68 days

The population median is between 55 and 68 days unless this is one of


those unusual samples that occur about once in twenty trials. Th.e reason-
ing behind this method of finding confidence limits is essentially that by
which confidence limits were found for the binomial in chapter 1. For-
mula 5.3.2 for finding the two-order statistics is a large-sample approxima-
tion, hut is adequate for practical purposes down to " = 25.
In reporting on frequency distributions from large samples, investigators often quote percentiles of the distributions. The 90th percentile of a distribution of students' I.Q. scores is the I.Q. value such that 90% of the students fall short of it and only 10% exceed it.
In estimating percentiles, a useful result (7) is that in any continuous frequency distribution the Pth percentile is estimated by the order statistic whose number is (n + 1)P/100. For the 179 cows, the 90th percentile is estimated by the order statistic whose number is (180)(90)/100 = 162. By again using formula 5.3.1, the number of days corresponding to the 162nd order statistic is found as

120.5 + (4)(20)/11 = 128 days
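The order-statistic recipes of this section fit in a few lines; the sketch below is ours and simply reproduces the cow-record numbers, taking z = 2 as in the text.

    # Sketch (not from the text): order-statistic numbers for the median's
    # confidence limits (formula 5.3.2) and for a percentile.
    import math

    n, z = 179, 2.0
    center = (n + 1) / 2
    lo = center - z * math.sqrt(n) / 2      # about 77
    hi = center + z * math.sqrt(n) / 2      # about 103
    p90 = (n + 1) * 90 / 100                # order statistic number 162
    print(round(lo), round(hi), p90)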
EXAMPLE 5.3.1-From a sample whose values are 8, 9, 2, 7, 3, 12, 15, estimate (i) the median, (ii) the lower quartile of the population (the lower quartile is the 25th percentile, having one-quarter of the population below it and three-quarters above), (iii) the 80th percentile. Ans. (i) 8, (ii) 3, (iii) 13.2. For the 80th percentile, the number of the order statistic is 6.4. Since the 6th and 7th order statistics have values 12 and 15, respectively, linear interpolation gives 13.2 for the 6.4th order statistic. Note that from this small sample we cannot estimate the 90th percentile, beyond saying that our estimate exceeds 15.

5.4-The sign test. Often there is no scale for measuring a character, yet one believes that he can distinguish grades of merit. The animal husbandman, for example, judges body conformation, ranking the individuals from high to low, then assigning ranks 1, 2, ... n. In the same way, the foods expert arrays preparations according to flavor or palatability. If rankings of a set of individuals or treatments are made by a random sample of judges, inferences can be made about the ranking in the population from which the sample of judges was drawn; this despite the fact that the parameters of the distributions cannot be written down.
First consider the rankings of two products by each of m judges. As an example, m = 8 judges ranked patties of ground beef which had been stored for 8 months at two temperatures in home freezers (17). Flavor was the basis of the ranking. Eight of the patties, one for each judge, were kept at 0°F.; the second sample of 8 were in a freezer whose temperature fluctuated between 0° and 15°F. The rankings are shown in table 5.4.1.

TABLE 5.4.1
RANKINGS OF THE FLAVOR OF PAIRS OF PATTIES OF GROUND BEEF
(Eight judges. Rank 1 is high; rank 2, low)

          Sample 1    Sample 2
Judge       0°F.      Fluctuated

  A          1            2
  B          1            2
  C          2            1
  D          1            2
  E          1            2
  F          1            2
  G          1            2
  H          1            2
There are two null hypotheses that might be considered for these data. One is that the fluctuation in temperature produces no detectable difference in flavor. (If this hypothesis is true, however, one would expect some of the judges to report that their two patties taste alike and to be unwilling to rank them.) A second null hypothesis is that there is a difference in flavor, and that in the population from which the judges were drawn, half the members prefer the patties kept at 0°F. and half prefer the other patties. Both hypotheses have the same consequence as regards the experimental data, namely, that for any judge in the sample, the probability is 1/2 that the 0°F. patty will be ranked 1. The reasons for this statement are different in the two cases. Under the first null hypothesis, the probability is 1/2 because the rankings are arbitrary; under the second, because any judge drawn into the sample has a probability 1/2 of being a judge who prefers the 0°F. patty.
In the sample, 7 out of 8 judges preferred the 0°F. patty. On either null hypothesis, we expect 4 out of 8. The problem of testing this hypothesis is exactly the same as that for which the χ² test was introduced in sections 1.10, 1.11 of chapter 1. From the general formula in section 1.12,

χ² = (7 − 4)²/4 + (1 − 4)²/4 = 4.5

When testing the null hypothesis that the probability is 1/2, a slightly simpler version of this formula is

χ² = (a − b)²/n = (7 − 1)²/8 = 4.5

where a and b are the observed numbers in the two classes (0°F. and Fluctuated).
Since the sample is small, we introduce a correction for continuity, described in section 8.6, and compute χ² as

χ² = (|a − b| − 1)²/n = (6 − 1)²/8 = 3.12,  P = 0.078

The expression |a − b| − 1 means that we reduce the absolute value of (a − b) by 1 before squaring. The test indicates non-significance, though the decision is close.
the decision is close.
In this example we used the χ² test, in place of the t-test for paired samples, because the individual observations, instead of being distributed normally, take only the values 1 or 2, so that the differences within a pair are either +1 or −1. The same test is often used with continuous or discrete data, either because the investigator wishes to avoid the assumption of normality or as a quick substitute for the t-test. The procedure is known as the sign test (8), because the differences between the members of a pair are replaced by their signs (+ or −), the size of the difference being ignored. In the formula for χ², a and b are the numbers of + and − signs, respectively. Any zero difference is omitted from the test, so that n = a + b.
When the sign test is applied to a variate X that has a continuous or discrete distribution, the null hypothesis is that X has the same distribution under the two treatments. But the null hypothesis does not need to specify the shape of this distribution. In the t-test, on the other hand, the null hypothesis assumes normality and specifies that the parameter μ (the mean) is equal for the two treatments. For this reason the t-test is sometimes called a parametric test, while the sign test is called non-parametric. Similarly, the median and other order statistics are non-parametric estimates, since they estimate percentiles of any continuous distribution without our requiring to define the shape of the distribution specifically by means of parameters.
In sampling from normal distributions the efficiency of the sign test relative to the t-test is about 65%. This statement implies that if the null hypothesis is false, so that the means of the two populations differ by an amount δ, a sign test based on 18 pairs and a t-test based on 12 pairs have about the same probability of detecting this by finding a significant difference. The sign test saves time at the expense of a loss of sensitivity in the test.
For numbers of pairs up to 20, table A 8 (p. 554), intended for quick reference, shows the smaller number of like signs required for significance at the 1%, 5%, and 10% levels. For instance, with 18 pairs, we must have 4 or less of one sign and 14 or more of the other sign in order to attain 5% significance. This table was computed not from the χ² approximation but from the exact binomial distribution. Since this distribution is discontinuous, we cannot find sample results that lie precisely at the 5% level. The significance probabilities, which are often substantially lower than the nominal significance levels, are shown in parentheses in table A 8. The finding of 4 negative and 14 positive signs out of 18 represents a significance probability of 0.031 instead of the nominal 0.05. For one-tailed tests these probabilities should be halved.
EXAMPLE 5.4.1-On being presented with a choice between two sweets, differing in color but otherwise identical, 15 out of 20 children chose color B. Test whether this is evidence of a general preference for B (i) by χ², (ii) by reference to table A 8. Do the results agree?
EXAMPLE 5.4.2-Two ice creams were made with different flavors but otherwise similar. A panel of 6 expert dairy industry men all ranked flavor A as preferred. Is this statistical evidence that the consuming public will prefer A?
EXAMPLE 5.4.3-To illustrate the difference between the sign test and the t-test in extreme situations, consider the two samples, each of 9 pairs, in which the actual differences are as follows. Sample I: −1, 1, 2, 3, 4, 4, 6, 7, 10. Sample II: 1, 1, 2, 3, 4, 4, 6, 7, −10. In both samples the sign test indicates significance at the 5% level, with P = 0.039 from table A 8. In sample I, in which the negative sign occurs for the smallest difference, we find t = 3.618, with 8 df, the significance probability being 0.007. In sample II, where the largest difference is the one with the negative sign, t = 1.125, with P = 0.294. Verify that Lord's test shows t_w = 0.364 for sample I and 0.118 for sample II, and gives verdicts in good agreement with the t-test. When the aberrant signs represent extreme observations the sign test and the t-test do not agree well. This does not necessarily mean that the sign test is at fault: if the extreme observation were caused by an undetected gross error, the verdict of the t-test might be misleading.

5.5-Non-parametric methods: ranking of differences between measurements. The signed rank test, due to Wilcoxon (2), is another substitute for the t-test in paired samples. First, the absolute values of the differences (ignoring signs) are ranked, the smallest difference being assigned rank 1. Then the signs are restored to the rankings. The method is illustrated from an experiment by Collins et al. (9). One member of a pair of corn seedlings was treated by a small electric current, the other being untreated. After a period of growth, the differences in elongation (treated minus untreated) are shown for each of ten pairs.
In table 5.5.1 the ranks with negative signs total 15 and those with positive signs total 40. The test criterion is the smaller of these totals, in this case, 15. The ranks with the less frequent sign will usually, though not always, give the smaller rank total. This number, sign ignored, is referred to table A 9. For 10 pairs a rank sum ≤ 8 is required for rejection at the 5% level. Since 15 > 8, the data support the null hypothesis that elongation was unaffected by the electric current treatment.
TABLE 5.5.1
EXAMPLE OF WILCOXON'S SIGNED RANK TEST
(Differences in elongation of treated and untreated seedlings)

Pair    Difference (mm.)    Signed Rank

 1            6.0                5
 2            1.3                1
 3           10.2                7
 4           23.9               10
 5            3.1                3
 6            6.8                6
 7           −1.5               −2
 8          −14.7               −9
 9           −3.3               −4
10           11.1                8

The null hypothesis in this test is that the frequency distribution of the original measurements is the same for the treated and untreated members of a pair, but as in the sign test the shape of this frequency distribution need not be specified. A consequence of this null hypothesis is that each rank is equally likely to have a + or a − sign. The frequency distribution of the smaller rank sum was worked out by the rules of probability as described by Wilcoxon (2). Since this distribution is discontinuous, the significance probabilities for the entries in table A 9 are not exactly 5% and 1%, but are close enough for practical purposes.
If two or more differences are equal, it is often sufficiently accurate to assign to each of the ties the average of the ranks that would be assigned to the group. Thus, if two differences are tied in the fifth and sixth positions, assign rank 5½ to each of them.
If the number of pairs n exceeds 16, calculate the approximate normal deviate

Z = (|μ − T| − 1/2)/σ

where T is the smaller rank sum, and

μ = n(n + 1)/4,   σ = √((2n + 1)μ/6)

The number −1/2 is a correction for continuity. As usual, Z > 1.96 signifies rejection at the 5% level.
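A short routine covering midranks for ties and the normal approximation is sketched below (ours; `signed_rank_T` is a hypothetical name). Applied to table 5.5.1 it returns the smaller rank sum 15.

    # Sketch (not from the text) of Wilcoxon's signed rank statistic.
    import math

    def signed_rank_T(diffs):
        d = sorted((x for x in diffs if x != 0), key=abs)  # zero diffs omitted
        ranks = [0.0] * len(d)
        i = 0
        while i < len(d):
            j = i
            while j < len(d) and abs(d[j]) == abs(d[i]):
                j += 1                                     # d[i..j-1] are tied
            for k in range(i, j):
                ranks[k] = (i + 1 + j) / 2                 # average (mid) rank
            i = j
        pos = sum(r for x, r in zip(d, ranks) if x > 0)
        neg = sum(r for x, r in zip(d, ranks) if x < 0)
        return min(pos, neg), len(d)

    T, n = signed_rank_T([6.0, 1.3, 10.2, 23.9, 3.1, 6.8, -1.5, -14.7, -3.3, 11.1])
    print(T)                                               # 15, as in table 5.5.1

    mu = n * (n + 1) / 4                                   # normal approximation
    sigma = math.sqrt((2 * n + 1) * mu / 6)                # (intended for n > 16)
    print((abs(mu - T) - 0.5) / sigma)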
EXAMPLE 5.5.1-From two J-shaped populations distributed like chi-square with df = 1 (figure 1.13.1), two samples of n = 10 were drawn and paired at random:

Sample 1    1.98  3.30  5.91  1.05  1.01  1.44  3.42  2.17  1.37  1.13
Sample 2    0.33  0.11  0.04  0.24  1.56  0.42  0.00  0.22  0.82  2.54

Difference  1.65  3.19  5.87  0.81 −0.55  1.02  3.42  1.95  0.55 −1.41

Rank           6     8    10     3  −1.5     4     9     7   1.5    −5

The difference between the population means was 1. Apply the signed rank test. Ans. The smallest two absolute differences are tied, so each is assigned the rank (1 + 2)/2 = 1.5.
The sum of the negative ranks is 6.5, between the critical sums, 3 and 8, in table A 9. H0 is rejected with P = 0.04, approximately.
EXAMPLE 5.5.2-If you had not known that the differences in the foregoing example were from a non-normal population, you would doubtless have applied the t-test. Would you have drawn any different conclusions? Ans. t = 2.48, P = 0.04.
EXAMPLE 5.5.3-Apply the signed rank test to samples I and II of example 5.4.3. Verify that the results agree with those given by the t-test and not with those given by the sign test. Is this what you would expect?
EXAMPLE 5.5.4-For 16 pairs, table A 9 states that the 5% level of the smaller rank sum is 29, the exact probability being 0.053. Check the normal approximation in this case by showing that μ = 68, σ = 19.34, so that for T = 29 the value of Z is 1.99, corresponding to a significance probability of 0.047.

5.6-Non-parametric methods: ranking for unpaired measurements. Turning now to the two-sample problems of chapter 4, we consider ranking as a non-parametric method for random samples of measurements which do not conform to the usual models. This test was also developed by Wilcoxon (2), though it is sometimes called the Mann-Whitney test (11). A table due to White (12) applies to unequal group sizes as well as equal. All observations in both groups are put into a single array, care being taken to tag the numbers of each group so that they can be distinguished. Ranks are then assigned to the combined array. Finally, the smaller sum of ranks, T, is referred to table A 10 to determine significance. Note that small values of T cause rejection.
An example is drawn from the Corn Borer project in Boone County, Iowa. It is well established that, in an attacked field, more eggs are deposited on tall plants than on short ones. For illustration we took records of numbers of eggs found in 20 plants in a rather uniform field. The plants were in 2 randomly selected sites, 10 plants each. Table 5.6.1 contains the egg counts.

TABLE 5.6.1
NUMBER OF CORN BORER EGGS ON CORN PLANTS, BOONE COUNTY, IOWA, 1950

Height of Plant              Number of Eggs

Less than 23"      0    14    18     0    31     0     0     0    11     0
More than 23"     37    42    12    32   105    84    15    47    51    65

In years such as 1950 the frequency distribution of number of eggs tends to be J-shaped rather than normal. At the low end, many plants have no eggs, but there is also a group of heavily infested plants. Normal theory cannot be relied upon to yield correct inferences from small samples.
For convenience in assigning ranks, the counts were rearranged in increasing order (table 5.6.2); the counts for the tall plants are marked with an asterisk.
TABLE 5.6.2
EGG COUNTS ARRANGED IN INCREASING ORDER, WITH RANKS
(Counts marked * are on plants 23" or more)

Count    0,   0,   0,   0,   0,   0,   11,  12*,  14,  15*,  18,  31
Rank    3½,  3½,  3½,  3½,  3½,  3½,    7,    8,   9,   10,  11,  12

The eight highest counts are omitted, since they are all on tall plants and it is clear that the small plants give the smaller rank sum.
By the rule suggested for tied ranks, the six ties are given the rank 3½, this being the average of the numbers 1 to 6. In this instance the averaging is not necessary, since all the tied ranks belong to one group; the sum of the six ranks, 21, is all that we need. But if the tied counts were in both groups, averaging would be required.
The next step is to add the rank numbers in the group (plants less
than 23 in.) that has the smaller sum:

T = 21 + 7 + 9 + 11 + 12 = 60

This sum is referred to table A 10 with n1 = n2 = 10. Since T is less than
T0.01 = 71, the null hypothesis is rejected with P ≤ 0.01. The anticipated
conclusion is that plant height affects the number of eggs deposited.
When the samples are of unequal sizes n1, n2, an extra step is required.
First, find the total T1 of the ranks for the sample that has the smaller
size, say n1. Compute T2 = n1(n1 + n2 + 1) - T1. Then T, which is referred
to table A 10, is the smaller of T1 and T2. To illustrate, White
quotes Wright's data (10) on the survival times, under anoxic conditions,
of the peroneal nerves of 4 cats and 14 rabbits. For the cats, the times were
25, 33, 43, and 45 minutes; for the rabbits, 15, 16, 16, 17, 20, 22, 22, 23,
28, 28, 30, 30, 35, and 35 minutes. The ranks for the cats are 9, 14, 17,
and 18, giving T1 = 58. Hence, T2 = 4(19) - 58 = 18, and is smaller
than T1, so that T = 18. For n1 = 4, n2 = 14, the 5% level of T is 19. The
mean survival time of the nerves is significantly higher for the cats than
for the rabbits.
For values of n1 and n2 outside the limits of the table, calculate

Z = (|μ - T| - ½)/σ,

where

μ = n1(n1 + n2 + 1)/2,     σ = √[n1n2(n1 + n2 + 1)/12]

The approximate normal deviate Z is referred to the tables of the normal
distribution to give the significance probability P.
Table A 10 was calculated from the assumption that if the null
hypothesis is true, the n1 ranks in the smaller sample are a random selection
from the (n1 + n2) ranks in the combined samples.
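The whole procedure can be checked numerically. The Python sketch below is our illustration (it assumes, as the text directs, that ties would receive average ranks); it reproduces the cat-rabbit figures and then applies the normal approximation just given:

    from math import sqrt

    cats = [25, 33, 43, 45]
    rabbits = [15, 16, 16, 17, 20, 22, 22, 23, 28, 28, 30, 30, 35, 35]
    combined = sorted(cats + rabbits)

    def avg_rank(v):
        # average of the 1-based positions at which v occurs in the array
        pos = [i + 1 for i, u in enumerate(combined) if u == v]
        return sum(pos) / len(pos)

    T1 = sum(avg_rank(v) for v in cats)      # 58 = 9 + 14 + 17 + 18
    n1, n2 = len(cats), len(rabbits)
    T2 = n1 * (n1 + n2 + 1) - T1             # 4(19) - 58 = 18
    T = min(T1, T2)                          # 18 < 19, significant at 5%

    # the normal approximation, for n1, n2 beyond the table
    mu = n1 * (n1 + n2 + 1) / 2
    sigma = sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    Z = (abs(mu - T1) - 0.5) / sigma
    print(T1, T, round(Z, 2))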
5.7-Comparison of rank and normal tests. When the t-test is used
on non-normal data, two things happen. The significance probabilities
are changed; the probability that t exceeds t0.05 when the null hypothesis
is true is no longer 0.05, but may be, say, 0.041 or 0.097. Secondly, the
sensitivity or power of the test in finding a significant result when the null
hypothesis is false is altered. Much of the work on non-parametric
methods is motivated by a desire to find tests whose significance probabilities
do not change and whose sensitivity relative to competing tests
remains high when the data are non-normal.
With the rank tests, the significance levels remain the same for any
continuous distribution, except that they are affected to some extent by
ties, and by zeros in the signed rank test. In large normal samples, the
rank tests have an efficiency of about 95% relative to the t-test (13), and
in small normal samples, the signed rank test has been shown (14) to have
an efficiency slightly higher than this. With non-normal data from a
continuous distribution, the efficiency of the rank tests relative to t never
falls below 86% in large samples and may be much greater than 100% for
distributions that have long tails (13). Since they are relatively quickly
made, the rank tests are highly useful for the investigator who is doubtful
whether his data can be regarded as normal.
The beginner may wish to compute both the rank tests and the t-test
for some of his data to see how they compare. Needless to say, the practice
of quoting the test that agrees with one's predilections vitiates the
whole technique.
As has been stated previously, most investigations, after the preliminary
stages, are designed to estimate the sizes of differences rather than
simply to test null hypotheses. The rank methods can furnish estimates
and confidence limits for the difference between two treatments (see
examples 5.8.1 and 5.8.2). The calculations require no assumption of
normality, but are a little tedious. Some work has also been done in extending
rank methods to the more complex types of data that we shall meet
in later chapters, though the available techniques still fall short of the
flexibility of the standard methods based on normality.
5.8-Scales with limited values. In some lines of work the scales of
measurement are restricted to a small number of values, perhaps to 0, 1,
2 or 1, 2, 3, 4, 5. Investigators are sometimes puzzled as to how to test the
differences between two treatments in this case, because the data do not
look normal, while rank methods usually involve a substantial number
of zeros and ties. We suggest that the ordinary t-test be used, with the inclusion
of a correction for continuity. To illustrate, consider a paired
test in which the original data are on a 0, 1, 2 scale. The differences between
the members of a pair can then assume only the values 2, 1, 0, -1,
and -2.

With 12 pairs, suppose that the differences D between two treatments
A and B are 2, 2, 2, 1, 1, 1, 0, 0, 0, 0, -1, -1. Then ΣD = 7. There is a
test, called Fisher's randomization test (15), that requires no assumption
about the form of the basic distribution of these differences. The argument
used is that if there is no difference between A and B, each of the
12 differences is equally likely to be + or -. Thus, under the null
hypothesis there are 2¹² = 4,096 possible sets of sample results. Since,
however, +0 and -0 are the same, only 2⁸ = 256 need be examined.
We then count how many samples have ΣD as great as or greater than 7,
the observed ΣD. It is not hard to verify that 38 samples are of this kind
if both positive and negative totals are counted so as to provide a two-tailed
test. The significance probability is 38/256 = 0.148. The null
hypothesis is not rejected by the randomization test.
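The count of 38 samples is easy to verify by brute force. In the Python sketch below (ours), the four zero differences are dropped and every sign pattern on the eight nonzero differences is enumerated:

    from itertools import product

    nonzero = [2, 2, 2, 1, 1, 1, 1, 1]   # absolute values of the 8 nonzero D's
    obs = 7                              # the observed sum of D
    count = sum(1 for signs in product([1, -1], repeat=len(nonzero))
                if abs(sum(s * v for s, v in zip(signs, nonzero))) >= obs)
    print(count, count / 2 ** len(nonzero))   # 38, P = 38/256 = 0.148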
With this test the investigator must work out his own significance
probability. From his writings it seems clear that Fisher did not intend
the test for routine use, but merely to illustrate that a test can be made
if A and B were assigned to the members of each pair by randomization.

For scales with limited numbers of values, numerous comparisons of
the results of this test and the t-test show that they usually agree well
enough for practical purposes. In the randomization test, however, the
possible values of ΣD jump by 2's. Our observed ΣD is 7. We would
have ΣD = 9 if only one 1 had a - sign, and ΣD = 5 if three 1's had a -
sign. To apply the correction for continuity, we compute t_c as

t_c = (|ΣD| - 1)/(n·s_D̄) = 6/[(12)(0.313)] = 1.597,

where s_D̄ = 0.313 is computed in the usual way. With 11 d.f., P is 0.138,
in good agreement with the randomization test. The denominator of t_c
is the standard error of ΣD. This may be computed either as n·s_D̄ or as
√n·s_D, where s_D is the standard deviation of the differences.
In applying the correction for continuity, the rule is to find the next
highest value of ΣD that the randomization set provides. The numerator
of t_c is halfway between this value and the observed ΣD. The values of
ΣD do not always jump by 2's.
With two independent samples of size n, the randomization test
assumes that on the null hypothesis the 2n observations have been
divided at random into two samples of n. There are (2n)!/(n!)² cases.
To apply the correction, find the next highest value of ΣD1 - ΣD2. If
one sample has the values 2, 3, 3, 3 and the other has 0, 0, 0, 2, we have
ΣD1 = 11, ΣD2 = 2, giving ΣD1 - ΣD2 = 9. The next highest value is 7,
given by the case 2, 2, 3, 3 and 0, 0, 0, 3. Hence, the numerator of t_c is 8.
The general formula for t_c is

t_c = (|ΣD1 - ΣD2| - c)/√[n(s1² + s2²)],

with 2(n - 1) d.f., where s1² and s2² are the sample variances and c is the
size of the correction.
With small samples that show little overlap, as in this example, the
randomization test is easily calculated and is recommended, because in
such cases t_c tends to give too many significant results. With sample
values of 2, 3, 3, 3 and 0, 0, 0, 2, the observed result is the most extreme
of the 8!/(4!)² = 70 cases. The randomization provides 4 cases like the observed
one in a two-tailed test. P is therefore 4/70 = 0.057. The reader
may verify that t_c = 3.58, with 6 d.f. and P near 0.01.
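Both figures can be checked by enumerating the 70 splits directly; the Python sketch below (ours) counts the splits at least as extreme as the observed one:

    from itertools import combinations

    pooled = [2, 3, 3, 3, 0, 0, 0, 2]
    total = sum(pooled)
    obs = abs(sum([2, 3, 3, 3]) - sum([0, 0, 0, 2]))   # observed difference, 9
    count = 0
    for idx in combinations(range(8), 4):
        s1 = sum(pooled[i] for i in idx)
        if abs(s1 - (total - s1)) >= obs:
            count += 1
    print(count, count / 70)        # 4, P = 4/70 = 0.057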
EXAMPLE 5.8.1-In Wright's data, p. 131, show that if the survival time for each
cat is reduced by 2 minutes, the value of T in the rank sum test becomes 18½, while if
the cat times are reduced by 3 minutes, T = 21. Show further that if 23 minutes are subtracted
from each cat, we find T = 20½, while for 24 minutes, T = 19. Since T0.05 = 19,
any hypothesis which states that the average survival time of cats exceeds that of rabbits
by a figure between 3 and 23 minutes is accepted in a 5% test. The limits 3 and 23 minutes
are 95% confidence limits as found from the rank sum test.
EXAMPLE 5.8.2-In a two-sample comparison, the estimate of the difference between
the two populations appropriate to the use of ranks is the median of the differences Xi - Yj,
where Xi and Yj denote members of the first and second samples. In Wright's data, with
n1 = 4, n2 = 14, there are 56 differences. Show that the median is 12.5. (You should be
able to shortcut the work.)
EXAMPLE 5.8.3-In a paired two-sample test the ten values of the differences D were
3, 3, 2, 1, 1, 1, 1, 0, 0, -1. Show that the randomization test gives P = 3/64 = 0.047, while
the value of t_c, corrected for continuity, is 2.45, corresponding to a P value of about 0.036.
REFERENCES
1. P. G. MOORE. Biometrika, 44:482 (1957).
2. F. WILCOXON. Biometrics Bul., 1:80 (1945).
3. E. LORD. Biometrika, 34:56 (1947).
4. K. C. S. PILLAI. Ann. Math. Statist., 22:469 (1951).
5. C. M. HARRISON. Plant Physiol., 9:94 (1934).
6. M. G. KENDALL and A. STUART. The Advanced Theory of Statistics, Vol. 1, 2nd ed.
   Charles Griffin, London (1958).
7. A. M. MOOD and F. A. GRAYBILL. Introduction to the Theory of Statistics, 2nd ed.,
   p. 408. McGraw-Hill, New York (1963).
8. W. J. DIXON and A. M. MOOD. J. Amer. Statist. Ass., 41:557 (1946).
9. G. N. COLLINS, et al. J. Agric. Res., 38:585 (1929).
10. E. B. WRIGHT. Amer. J. Physiol., 147:18 (1946).
11. H. B. MANN and D. R. WHITNEY. Ann. Math. Statist., 18:50 (1947).
12. C. WHITE. Biometrics, 8:33 (1952).
13. J. L. HODGES and E. L. LEHMANN. Ann. Math. Statist., 27:324 (1956).
14. J. KLOTZ. Ann. Math. Statist., 34:624 (1963).
15. R. A. FISHER. The Design of Experiments, 7th ed., p. 44. Oliver and Boyd, Edinburgh
    (1960).
16. A. B. CHAPMAN and L. E. CASIDA. J. Agric. Res., 54:417 (1937).
17. F. EHRENKRANTZ and H. ROBERTS. J. Home Econ., 44:441 (1952).
CHAPTER SIX

Regression

6.1-Introduction. In preceding chapters the problems considered
have involved only a single measurement on each individual. In this
chapter, attention is centered on the dependence of one variable Y on
another variable X. In mathematics Y is called a function of X, but in
statistics the term regression is generally used to describe the relationship.
The growth curve of height is spoken of as the regression of height on age;
in toxicology the lethal effects of a drug are described by the regression of
per cent kill on the amount of the drug. The origin of the term regression
will be explained in section 6.16. To distinguish the two variables in
regression studies, Y is sometimes called the dependent and X the independent
variable. These names are fairly appropriate in the toxicology
example, in which we can think of the per cent kill Y as being caused by
the amount of drug X, the amount itself being variable at the will of the
investigator. They are less suitable, though still used, for example, when
Y is the weight of a man and X is his maximum girth.
Regression has many uses. Perhaps the objective is only to learn if Y
does depend on X. Or, prediction of Y from X may be the goal. Some
wish to determine the shape of the regression curve. Others are concerned
with the error in Y in an experiment after adjustments have been
made for the effect of a related variable X. An investigator has a theory
about cause and effect, and employs regression to test this theory. To
satisfy these various needs an extensive account of regression methods is
necessary.

In the next two sections the calculations required in fitting a regression
are introduced by a numerical example. The theoretical basis of these
calculations and the useful applications of regression are taken up in subsequent
sections.
6.2-The regression of blood pressure on age. A project, "The Nutritional
Status of Population Groups," was set up by the Agricultural
Experiment Stations of nine midwestern states. From the facts learned we
have extracted data on systolic blood pressure among 58 women over 30
years of age, a random sample from a region near Ames, Iowa (1). For
present purposes, the ages are grouped into 10-year classes and the mean
blood pressure calculated for each class. The results are in the first two
columns of table 6.2.1.
TABLE 6.2.1
MEAN SYSTOLIC BLOOD PRESSURE OF 58 WOMEN IN 10-YEAR AGE CLASSES

Midpoint of   Mean Blood   Deviations From Means     Squares          Products
Age Class     Pressure
    X             Y           x        y            x²       y²         xy

    35           114         -20      -27           400      729        540
    45           124         -10      -17           100      289        170
    55           143           0        2             0        4          0
    65           158          10       17           100      289        170
    75           166          20       25           400      625        500

Sum 275          705           0        0          1,000    1,936      1,380

Mean 55          141

                         Σxy     1,380
Sample regression coefficient:  b = Σxy/Σx² = 1,380/1,000 = 1.38 units of blood pressure per year

As in most regression problems, the first thing to do is to draw a graph,
figure 6.2.1. The independent variable X is plotted along the horizontal
axis. Each measure of the dependent Y is indicated by a black circle
above the corresponding X. Clearly, the trend of blood pressure with age
is upward and roughly linear.
The straight line drawn in the figure is the sample regression of Y on X.
Its position is fixed by two results:
(i) It passes through the point O′(X̄, Ȳ), the point determined by the
mean of each sample. For the blood pressures this is the point (55, 141).
(ii) Its slope is at the rate of b units of Y per unit of X, where b is
the sample regression coefficient. Writing x = X - X̄ and y = Y - Ȳ,

b = Σxy/Σx²

The numerator of b is a new quantity, the sum of products
of the deviations x and y. In table 6.2.1 the individual values of x² have
been obtained in the fifth column and those of xy in the seventh column.
In section 6.3 a quicker method of calculating b will be given. For the
blood pressures, b = +1.38, meaning that blood pressure increases on
the average by 1.38 units per year of age.

The sample regression equation of Y on X is now written as

Ŷ = Ȳ + bx,
or,
ŷ = bx,

where Ŷ is the estimated value and ŷ the estimated deviation of Y corresponding
to any x-deviation. If x = 20 years, ŷ = (1.38)(20) = 27.6
units of blood pressure.
FIG. 6.2.1-Sample regression of blood pressure on age. The broken lines indicate omission
of the lower parts of the scales, in order to clarify the relations in the parts occupied
by the data.

This equation enables us to complete figure 6.2.1 by drawing the
sample regression line. Lay off O′M = 20 years to the right of O′, then
erect a perpendicular, MP = 27.6 units of blood pressure. The line O′P
then has the slope 1.38 units of blood pressure per year.

In terms of the original units, the sample regression equation is

Ŷ - Ȳ = b(X - X̄)

For the blood pressures, this becomes

Ŷ - 141 = 1.38(X - 55)
or
Ŷ = 141 + 1.38(X - 55) = 65.1 + 1.38X

If X = 75 is entered in this equation, Ŷ becomes 65.1 + (1.38)(75) = 168.6
units of blood pressure. The corresponding point, (75, 168.6), is shown
as P in the figure.
We can now compare the sample points with the corresponding Ŷ to
get measures of the goodness of fit of the line to the data. Each X is substituted
in the regression equation and Ŷ calculated. The five results are
recorded in table 6.2.2. The deviations from regression, Y - Ŷ = d_y·x,
measure the failure of the line to fit the data. In this sample, 45-year-old
women had below average blood pressure and 65-year-olds had an excess.
TABLE 6.2.2
CALCULATION OF Ŷ AND DEVIATIONS FROM REGRESSION, d_y·x = Y - Ŷ
(Blood pressure data)

Midpoint of   Mean Blood   Estimated Blood   Deviation From      Square of
Age Class     Pressure     Pressure          Regression          Deviation
    X             Y            Ŷ             Y - Ŷ = d_y·x        d_y·x²

    35           114         113.4               0.6                0.36
    45           124         127.2              -3.2               10.24
    55           143         141.0               2.0                4.00
    65           158         154.8               3.2               10.24
    75           166         168.6              -2.6                6.76

Sum                                    Σd_y·x = 0.0      Σd_y·x² = 31.60

The sum of squares of deviations, Σd_y·x² = 31.60, is the basis for an
estimate of error in fitting the line. The corresponding degrees of freedom
are n - 2 = 3. We have then,

s_y·x² = Σd_y·x²/(n - 2) = 10.53,

where s_y·x² is the mean square deviation from regression. The resulting
sample standard deviation from regression,

s_y·x = √(s_y·x²) = 3.24 units of blood pressure,

corresponds to s in single-variable problems. In particular, it furnishes a
sample standard deviation of the regression coefficient,

s_b = s_y·x/√(Σx²)

This is 3.24/√1,000 = 0.102 units of blood pressure, with (n - 2) = 3 d.f.
A test of significance of b is given by

t = b/s_b,   d.f. = n - 2

Applying this to the blood pressures,

t = 1.38/0.102 = 13.5**,   d.f. = 3

Note: It is often convenient to denote significance by asterisks. A single
one indicates probabilities between 0.05 and 0.01; two indicate probabilities
equal to or less than 0.01.
Often there is little interest in the individual d_y·x of table 6.2.2. If so,
Σd_y·x² may be calculated directly by the formula,

Σd_y·x² = Σy² - [(Σxy)²/Σx²]

Substituting the blood pressure data from table 6.2.1,

Σd_y·x² = 1,936 - [(1,380)²/1,000] = 31.60,

as before.
EXAMPLE 6.2.1-Following are measurements on heights of soybean plants in a
field, a different random selection each week (2):

Age in weeks             1    2    3    4    5    6    7

Height in centimeters    5   13   16   23   33   38   40

Verify these results: X̄ = 4 weeks, Ȳ = 24 cms., Σx² = 28, Σy² = 1,080, Σxy = 172. Compute
the sample regression, Ŷ = 6.143X - 0.572 centimeters.

EXAMPLE 6.2.2-Plot on a graph the sample points for the soybean data, then construct
the sample regression line. Do the points lie about equally above and below the line?

EXAMPLE 6.2.3-Calculate s_b = 0.409 cms./wk. Set the 95% confidence interval for
the population regression. Ans. 5.09 - 7.20 cms./wk. Note that s_b has 5 d.f.

EXAMPLE 6.2.4-The soybean data constitute a growth curve. Do you suppose the
population growth curve is really straight? How would you design an experiment to get a
growth curve of the blood pressure in Iowa women?

EXAMPLE 6.2.5-Eighteen samples of soil were prepared with varying amounts of
inorganic phosphorus, X. Corn plants, grown in each soil, were harvested at the end of 38
days and analyzed for phosphorus content. From this was estimated the plant-available
phosphorus in the soil. Nine of the observations, adapted for ease of computation, are
shown in this table:

Inorganic phosphorus in soil (ppm), X                1    4    5    9   13   11   23   23   28

Estimated plant-available phosphorus (ppm), Y       64   71   54   81   93   76   77   95  109

Calculate b = 1.417, s_b = 0.395, t = 3.59**.


6.3-Shortcut methods of computation in regression. Since regression
computations are tedious, a calculating machine is almost essential.
In fitting a regression, the following six basic quantities must be obtained:

n, X̄, Ȳ, Σx², Σy², Σxy

You already know shortcut methods of computing Σx² and Σy² without
finding the individual deviations x and y. A similar method exists for
finding Σxy, based on the algebraic identity

Σxy = Σ(X - X̄)(Y - Ȳ) = ΣXY - (ΣX)(ΣY)/n


Note that the correction term may be larger than ΣXY, making Σxy negative.
This indicates a downward sloping regression line.

In table 6.3.1 the regression of blood pressure on age has been
recomputed using these shortcuts.

TABLE 6.3.1
MACHINE COMPUTATION OF A LINEAR REGRESSION

Age (years), X               35     45     55     65     75
Blood pressure (units), Y   114    124    143    158    166

ΣX = 275                  ΣY = 705
X̄ = 55                    Ȳ = 141
ΣX² = 16,125              ΣY² = 101,341             ΣXY = 40,155
(ΣX)²/n = 15,125          (ΣY)²/n = 99,405          (ΣX)(ΣY)/n = 38,775

Σx² = 1,000               Σy² = 1,936               Σxy = 1,380

b = Σxy/Σx² = 1,380/1,000 = 1.38 units per year of age

Ŷ = Ȳ + b(X - X̄)
  = 141 + 1.38(X - 55) = 65.1 + 1.38X
Σd_y·x² = Σy² - (Σxy)²/Σx² = 1,936 - (1,380)²/1,000 = 31.60
s_y·x² = Σd_y·x²/(n - 2) = 31.60/3 = 10.53
s_y·x = √10.53 = 3.245 units
s_b = s_y·x/√(Σx²) = 3.245/√1,000 = 0.102
t = b/s_b = 1.38/0.102 = 13.5**,   d.f. = n - 2 = 3

The figures shown under the sample data are all that need be written
down. In most calculating machines, ΣX and ΣX² can be accumulated
in a single run, ΣY and ΣY² in a second run, and ΣXY in a third, without
writing down any intermediate figures. With small samples in which X
and Y have no more than three significant figures, some machines will
accumulate ΣX, ΣY, ΣX², 2ΣXY, and ΣY² in one run.
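On a modern computer the same routine takes a few lines. The Python sketch below (our illustration) accumulates the six basic quantities for the blood pressure data and reproduces the figures of table 6.3.1:

    from math import sqrt

    X = [35, 45, 55, 65, 75]
    Y = [114, 124, 143, 158, 166]
    n = len(X)

    SX, SY = sum(X), sum(Y)
    SXX = sum(x * x for x in X)
    SYY = sum(y * y for y in Y)
    SXY = sum(x * y for x, y in zip(X, Y))

    Sxx = SXX - SX * SX / n          # 1,000
    Syy = SYY - SY * SY / n          # 1,936
    Sxy = SXY - SX * SY / n          # 1,380

    b = Sxy / Sxx                    # 1.38
    a = SY / n - b * SX / n          # 65.1, intercept of Y-hat = a + bX
    dev2 = Syy - Sxy ** 2 / Sxx      # 31.60
    s_yx = sqrt(dev2 / (n - 2))      # 3.245
    s_b = s_yx / sqrt(Sxx)           # 0.1026
    t = b / s_b                      # 13.4; the text's 13.5 rounds s_b to 0.102
    print(round(b, 2), round(a, 1), round(t, 1))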
EXAMPLE 6.3.1-The data show the initial weights and gains in weight (grams) of 15
female rats on a high protein diet, from the 24th to 84th day of age. The point of interest
in these data is whether the gain in weight depends to some extent on initial weight. If so,
feeding experiments on female rats can be made more precise by taking account of the
initial weights of the rats, either by pairing on initial weight or by adjusting for differences
in initial weight in the analysis. Calculate b by the shortcut method and test its significance.
Ans. b = 1.0641, t = b/s_b = 2.02, with 13 d.f., not quite significant at the 5% level.

Rat Number           1    2    3    4    5    6    7    8    9   10   11   12   13   14   15

Initial weight, X   50   64   76   64   74   60   69   68   56   48   57   59   46   45   65

Gain, Y            128  159  158  119  133  112   96  126  132  118  107  106   82  103  104
EXAMPLE 6.3.2-Speed records attained in the Indianapolis Memorial Day automobile
races 1911-1941 are as follows, in miles per hour:

Year    X     Y         Year    X     Y         Year    X     Y

1911    0    74.6       1922   11    94.5       1932   21   104.1
1912    1    78.7       1923   12    91.0       1933   22   104.2
1913    2    75.9       1924   13    98.2       1934   23   104.9
1914    3    82.5       1925   14   101.1       1935   24   106.2
1915    4    89.8       1926   15    95.9       1936   25   109.1
1916    5    83.3       1927   16    97.5       1937   26   113.6
1917    6    ...*       1928   17    99.5       1938   27   111.2
1918    7    ...*       1929   18    97.6       1939   28   115.0
1919    8    88.1       1930   19   100.4       1940   29   114.3
1920    9    88.6       1931   20    96.6       1941   30   115.1
1921   10    89.6

* No races.

The years have been coded by subtracting 1911 from each. Calculate Σx² = 2,325.02,
Σy² = 4,039.81, Σxy = 2,971.23, Ŷ = 1.278X + 77.57 miles per hour.

6.4-The mathematical model in linear regression. In standard linear
regression, three assumptions are made about the relation between Y and
X:
1. For each selected X there is a normal distribution of Y from which
the sample value of Y is drawn at random. If desired, more than
one Y may be drawn from each distribution.
2. The population of values of Y corresponding to a selected X has a
mean μ that lies on the straight line μ = α + β(X - X̄) = α + βx,
where α and β are parameters (to be explained presently).
3. In each population the standard deviation of Y about its mean,
α + βx, has the same value, often denoted by σ_y·x.
The mathematical model is specified concisely by the equation

Y = α + βx + ε,

where ε is a random variable drawn from N(0, σ_y·x).

In this model, Y is the sum of a random part, ε, and a part fixed by x.
The fixed part, according to assumption number 2 above, determines the
means of the populations sampled, one mean for each x. These means
lie on the straight line represented by μ = α + βx, the population regression
line. The parameter α is the mean of the population that corresponds
to x = 0; thus, α specifies the height of the line when X = X̄. β is the slope
of the regression line, the change in Y per unit increase in x. As for the
variable part of Y, ε is drawn at random from N(0, σ_y·x); it is independent
of x and normally distributed, as the symbol N signifies.
FIG. 6.4.1-Representation of the linear regression model. The normal distribution
of Y about the regression line α + βx is shown for four selected values of x.

Figure 6.4.1 gives a schematic representation of these populations.
For each of four selected values of X the normal distribution of Y about
its mean μ = α + βx is sketched. These normal distributions would all
coincide if their means were superimposed.

For non-mathematicians, the model is best explained by an arithmetical
construction. Assign to X the values 0, 2, 3, 7, 8, 10, as in table
6.4.1. This is done quite arbitrarily; the manner in which X is fixed has
no bearing on the illustration.

Next, calculate X̄ and the deviations, x = X - X̄, in column 2.

Now take β = 0.5; this implies that the means of the populations are
to increase one-half unit with each unit change in x. From this, column 3
is calculated.

Choose α = 4, meaning that at x = 0 the population regression is 4
units above the X-axis.

The fixed X together with α and β determine the succession of means
in column 4. These are indicated by open circles on the population regression
line (the dotted line) of figure 6.4.2. So far all quantities are fixed,
without sampling variation.

Coming now to the variable part of Y, the ε are drawn at random
from a table of random normal deviates with mean zero and σ = 1. The
values which we obtained were 1.1, -1.3, -1.1, 1.0, 0, and -1.0, as
shown in column 5 of table 6.4.1. Column 6 contains the sample values of
Y, each item being the sum of the fixed part in column 4 and the corresponding
random part in column 5.
TABLE 6.4.1
CONSTRUCTION OF A SAMPLE FROM Y = α + βx + ε, WITH α = 4, β = 0.5,
AND ε DRAWN FROM N(0, 1)

 X       x     βx = 0.5x    α + βx = 4 + 0.5x      ε      Y = α + βx + ε
(1)     (2)      (3)              (4)             (5)          (6)

 0      -5      -2.5              1.5             1.1          2.6
 2      -3      -1.5              2.5            -1.3          1.2
 3      -2      -1.0              3.0            -1.1          1.9
 7       2       1.0              5.0             1.0          6.0
 8       3       1.5              5.5             0.0          5.5
10       5       2.5              6.5            -1.0          5.5

Calculations of estimates for sample regression, Y on X:

ΣX = 30            ΣY = 22.7
X̄ = 5              Ȳ = 3.78
ΣX² = 226          ΣXY = 149.1            ΣY² = 108.31
(ΣX)²/n = 150      (ΣX)(ΣY)/n = 113.5     (ΣY)²/n = 85.88

Σx² = 76           Σxy = 35.6             Σy² = 22.43

b = Σxy/Σx² = 35.6/76 = 0.468
Ŷ = 3.78 + 0.468(X - 5) = 1.44 + 0.468X
Σd_y·x² = Σy² - (Σxy)²/Σx² = 22.43 - (35.6)²/76 = 5.75
s_y·x² = Σd_y·x²/(n - 2) = 5.75/4 = 1.44,   s_y·x = √1.44 = 1.20

The sample points are plotted as black circles in the figure. The
calculations of Ȳ and b are given under table 6.4.1.
FIG. 6.4.2-Population regression, μ = 4 + 0.5x. Sample regression, Ŷ = 3.78 + 0.468x.
The population value α = 4 is estimated by Ȳ = 3.78. The sample regression line
passes through the point (X̄, Ȳ) = (5, 3.78). The slope β = 0.5 is estimated
by b = 0.468. The solid line in figure 6.4.2 is the sample regression line.
It is nearly parallel to the population line but lies below it because of the
underestimation of α. The discrepancies between the two lines are due
wholly to the random sampling of the ε.
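The construction of table 6.4.1 is easily repeated with fresh random deviates. In the Python sketch below (ours), each run draws a new ε for every x and yields a new pair of estimates of α and β:

    import random

    alpha, beta = 4.0, 0.5
    X = [0, 2, 3, 7, 8, 10]
    xbar = sum(X) / len(X)                 # 5
    x = [xi - xbar for xi in X]            # deviations

    eps = [random.gauss(0, 1) for _ in X]  # the random part of Y
    Y = [alpha + beta * xd + e for xd, e in zip(x, eps)]

    ybar = sum(Y) / len(Y)                 # estimates alpha
    b = sum(xd * y for xd, y in zip(x, Y)) / sum(xd * xd for xd in x)
    print(round(ybar, 2), round(b, 3))     # near 4 and 0.5, varying run to run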
EXAMPLE 6.4.1-In table 6.4.1, b = 0.468. Calculate the six deviations from regression,
d_y·x, and identify each with the distance of the corresponding point from the sample
regression line. The sum of the deviations should be zero and the sum of their squares
about 5.75.

EXAMPLE 6.4.2-Construct a sample with α = 6 and β = -1. The negative β means
that the regression will slope downwards to the right. Take X = 1, 2, ..., 9, X̄ being 5. By
using table 3.1.1, draw ε randomly from N(0, 5). Make a table showing the calculation of
the sample of Y. Graph the population regression and the sample points. Save your work
for further use.

6.5-Ŷ as an estimator of μ = α + βx. For any x, the computed value
Ŷ estimates the corresponding μ = α + βx. For example, we have already
seen that at x = 0 (for which X = 5), Ŷ₅ = Ȳ estimates μ₅ = α. As another
example, at x = 2, for which X = 7, Ŷ₇ = 1.44 + (0.468)(7) = 4.72, estimates
μ = 4 + (0.5)(2) = 5.

More generally,

Ŷ - μ = (Ȳ - α) + (b - β)x     (6.5.1)

Thus, the difference between Ŷ and the corresponding μ has two sources,
both due to the random ε. One is the difference between the elevations
of the sample and population regression lines (Ȳ - α); the other, the difference
between the two slopes (b - β).

Estimates of μ are often made at an X lying between two of the fixed
X whose Y were sampled. For example, at X = 4,

Ŷ₄ = 1.44 + (0.468)(4) = 3.31,

locating a point on the sample regression line perpendicularly above
X = 4. Here we are estimating μ in a population not sampled. There is
no sample evidence for such an estimate; it is made on the cognizance of
the investigator who has reason to believe that the intermediate population
has a μ lying on the sampled regression, α + βx.

Using the same argument, one may estimate μ at an X extrapolated
beyond the range of the fixed X. Thus, at X = 12,

Ŷ₁₂ = 1.44 + (0.468)(12) = 7.06

Extrapolation involves two extra hazards. Since x tends to be large
for extrapolated values, equation 6.5.1 shows that the term (b - β)x may
make the difference (Ŷ - μ) large. Secondly (and this is usually the more
serious hazard), the population regression of means may actually be
curved to an extent that is small within the limits of the sample, but becomes
pronounced when we move beyond these limits, so that results
given by a straight-line regression are badly wrong.
The value of Ŷ also enables us to judge whether an individual observed
Y is above or below its average value for the X in question. Look,
for example, at the first point on the left of the graph (figure 6.4.2).
Y₀ = 2.6, to be compared with Ŷ₀ = 1.44. The positive deviation,
d_y·x = Y₀ - Ŷ₀ = 1.16, shows that Y₀ exceeds its estimated value by 1.16
units. Algebraically,

d_y·x = Y - Ŷ = α + βx + ε - (Ȳ + bx)
      = (α - Ȳ) + (β - b)x + ε

Thus, Y - Ŷ is, as would be expected, an estimate of the corresponding
normal deviate ε, but is affected also by the errors in Ȳ and b. In the constructed
example, ε₀ = 1.1, so that for this point Y₀ - Ŷ₀ = 1.16 is close.
In large samples, the errors in Ȳ and b become small, and the residual
Y - Ŷ is a good estimate of the corresponding ε.

This examination of deviations from a fitted regression is often useful.
A doctor's statement, "For a woman of your age, your blood pressure is
normal," would imply that Y - Ŷ was zero, or near to it. A value of Y
that was quite usual in a woman aged 65 might cause a doctor to prescribe
treatment if it occurred in a woman aged 35, because for this woman
Y - Ŷ would be exceptionally high.
EXAMPLE 6.5.1-For your sample in example 6.4.2, calculate Ȳ and b, then plot the
sample regression line on your graph. Calculate the deviations d_y·x and compare them
with the corresponding ε. It is a partial check on your accuracy if Σd_y·x = 0.

EXAMPLE 6.5.2-Using the blood pressure data of section 6.2, estimate μ at age 30
years. Ans. 106.5 units.

EXAMPLE 6.5.3-Calculate Y_A = Y - bx, called adjusted Y, for each age group in
table 6.2.2. Verify your results by the sum, ΣY_A = ΣY. Suggest several possible reasons
for the differences among adjusted Y.

6.6-The estimator of σ_y·x². As noted earlier, the quantity

s_y·x² = Σd_y·x²/(n - 2)

is an unbiased estimator of σ_y·x², the variance of the ε's. One way of remembering
the divisor (n - 2) is to note that in fitting the line we have
two disposable constants, α and β, whose values we choose so as to make the
d_y·x as small as possible. If there are only two points (Y₁, X₁) and (Y₂, X₂),
the fitted line goes through both points exactly. The d_y·x and their sum of
squares are then zero, no matter how large the true σ_y·x is. In other words,
there are no degrees of freedom remaining for estimating σ_y·x².

In the constructed example (table 6.4.1), s_y·x² was found to be 1.44,
with 4 d.f., as an estimate of σ_y·x² = 1. This gives 1.20 as the estimate of
σ_y·x = 1.
The estimated variance in the original sample of values of Y is
s_y² = 22.43/5 = 4.49. By utilization of the knowledge of X, this variance
is reduced to s_y·x² = 1.44. It is sometimes said that a fraction
(4.49 - 1.44)/4.49, or about 68%, of the variation in Y is associated with
the linear regression on X, the remaining 32% being independent of X.
This statement is useful when the objective is to understand why Y varies
and it is known that X is one of the causes of the variation in Y.
The nature of s_y·x² is also made clearer by some algebra. For the
ith member of the sample,

ε_i = Y_i - α - βx_i ;     d_y·x = Y_i - Ȳ - bx_i = y_i - bx_i

Write

ε_i = Y_i - α - βx_i = (Y_i - Ȳ - bx_i) + (Ȳ - α) + (b - β)x_i
    = (y_i - bx_i) + (Ȳ - α) + (b - β)x_i

Square both sides and sum over the n values in the sample. On the right
side there are three squared terms and three product terms. The squared
terms give

Σ(y_i - bx_i)² + Σ(Ȳ - α)² + Σ(b - β)²x_i²

The factors (Ȳ - α)² and (b - β)² are constant for all members of the
sample and can be taken outside the Σ sign. This gives, for the squared
terms,

Σ(y_i - bx_i)² + n(Ȳ - α)² + (b - β)²Σx_i²

Remarkably, the three cross-product terms all vanish when summed
over the sample. For example,

2Σ(y_i - bx_i)(Ȳ - α) = 2(Ȳ - α)Σ(y_i - bx_i) = 0,

since Σy_i = 0 and Σx_i = 0. Further,

2Σ(Ȳ - α)(b - β)x_i = 2(Ȳ - α)(b - β)Σx_i = 0,
2Σ(y_i - bx_i)(b - β)x_i = 2(b - β)Σx_i(y_i - bx_i)
                         = 2(b - β)(Σx_iy_i - bΣx_i²),

which vanishes since b = Σx_iy_i/Σx_i². Thus, finally,

Σε_i² = Σ(Y_i - α - βx_i)² = Σ(Y_i - Ȳ - bx_i)² + n(Ȳ - α)²
        + (b - β)²Σx_i²     (6.6.1)

Rearranging,

Σd_y·x² = Σ(Y_i - Ȳ - bx_i)² = Σε_i² - n(Ȳ - α)² - (b - β)²Σx_i²

On the right side of this equation, each ε_i has mean zero and variance
σ_y·x². Thus the term Σε_i² is an estimate of nσ_y·x². The two subtracted
terms on the right can be shown to be estimates of σ_y·x². It follows that
Σd_y·x² is an unbiased estimate of (n - 2)σ_y·x², and on division by (n - 2)
provides an unbiased estimate of σ_y·x². This result, namely that s_y·x² is
unbiased, does not require the ε_i to be normally distributed. Normality is
required, however, to prove the standard tests of significance in regression.
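Identity (6.6.1) can be confirmed numerically on the constructed sample of table 6.4.1; the Python sketch below (ours) shows the three pieces on the right adding back to Σε²:

    x = [-5, -3, -2, 2, 3, 5]
    eps = [1.1, -1.3, -1.1, 1.0, 0.0, -1.0]
    alpha, beta = 4.0, 0.5
    n = len(x)

    Y = [alpha + beta * xi + e for xi, e in zip(x, eps)]
    ybar = sum(Y) / n
    b = sum(xi * yi for xi, yi in zip(x, Y)) / sum(xi * xi for xi in x)

    lhs = sum(e * e for e in eps)
    d2 = sum((yi - ybar - b * xi) ** 2 for xi, yi in zip(x, Y))
    rhs = d2 + n * (ybar - alpha) ** 2 + (b - beta) ** 2 * sum(xi * xi for xi in x)
    print(round(lhs, 4), round(rhs, 4))   # equal: 6.11 = 5.75 + 0.28 + 0.08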
6.7-The method of least squares. The choice of Ȳ and b to estimate
the parameters α and β is an application of a principle widely used in
problems of statistical estimation and known as the method of least
squares. To explain this method, let α̂ and β̂ denote any two estimators
of α and β that we might consider. For the pair of observations (Y, X)
the quantity

Y - α̂ - β̂x

measures the amount by which the fitted regression is in error in estimating
Y. In the method of least squares, α̂ and β̂ are chosen so as to minimize
the sum of the squares of these errors, taken over the sample. That is,
we minimize

Σ(Y - α̂ - β̂x)²     (6.7.1)

About 150 years ago the scientist Gauss showed that estimators obtained
in this way are (i) unbiased, and (ii) have the smallest standard
errors of any unbiased estimators that are linear expressions in the Y's.
Gauss' proof does not require the Y's to be normally distributed, but
merely that the ε's are independent with means zero and variances σ_y·x².

The result that (6.7.1) is minimized by taking α̂ = Ȳ and β̂ = b is
easily verified by quoting a previous result (6.6.1, p. 146). Since the proof
of the algebraic equality in (6.6.1) may be shown to hold for any pair of
values α, β, the equation remains valid if we replace α by α̂ and β by β̂.
Hence, quoting (6.6.1),

Σ(Y - α̂ - β̂x)² = Σ(Y - Ȳ - bx)² + n(Ȳ - α̂)² + (b - β̂)²Σx²

The first term on the right is the sum of squares of the errors or residuals
that we obtain if we take α̂ = Ȳ and β̂ = b. The two remaining terms on
the right are both positive unless α̂ = Ȳ and β̂ = b. This proves that the
choice of Ȳ and b minimizes (6.7.1).
6.8-The value of b in some simple cases. The expression for b,
Σxy/Σx², is unfamiliar at first sight. It is not obviously related to the
quantity β of which b is an estimate, nor is it clear that this is the estimate
that common sense would suggest to someone who had never heard of
least squares. A general expression relating b and β and an examination
of a few simple cases may make b more familiar.

Denote the members of the sample by (Y_i, X_i), where the subscript i
goes from 1 to n. The numerator of b is Σx_iy_i = Σx_i(Y_i - Ȳ) = Σx_iY_i
- ȲΣx_i. Since the term ȲΣx_i vanishes, because Σx_i = 0, the numerator
of b may be written Σx_iY_i. Now substitute Y_i = α + βx_i + ε_i. This
gives

b = Σx_i(α + βx_i + ε_i)/Σx_i² = βΣx_i²/Σx_i² + Σx_iε_i/Σx_i² = β + Σx_iε_i/Σx_i²,

the term in α vanishing because Σx_i = 0. Thus b differs from β by a linear
expression in the ε_i. If the ε_i were all zero, b would coincide with β.
Further, since the ε_i have zero means in the population, it follows that b
is an unbiased estimate of β.

Turning to the simplest case, suppose that the sample consists of the
values (Y1, 1) and (Y2, 2). The obvious estimate of the change in Y per
unit increase in X is Y2 - Y1. What does b give? Since X̄ = 3/2, the
deviations are x1 = -1/2, x2 = +1/2, giving Σx² = 1/2. Thus

b = [-(1/2)Y1 + (1/2)Y2]/(1/2) = Y2 - Y1,

in agreement.
With three values (Y1, 1), (Y2, 2), (Y3, 3) we might argue that Y2 - Y1
and Y3 - Y2 are both estimates of the change in Y per unit change in
X. Since there seems no reason to do otherwise, we might average them,
getting (Y3 - Y1)/2 as our estimate. To compare this with the least
squares estimate, we have x1 = -1, x2 = 0, x3 = +1. This gives Σxy
= Y3 - Y1 and Σx² = 2, so that b = (Y3 - Y1)/2, again in agreement
with the common-sense approach. Notice that Y2 is not used in estimating
the slope. Y2 is useful in providing a check on whether the population
regression line is straight. If it is straight, Y2 should be equal to the
average of Y1 and Y3, apart from sampling errors. The difference
Y2 - (Y1 + Y3)/2 is therefore a measure of the curvature (if any) of the
population regression.
Continuing in this way for the sample (Y1, 1), (Y2, 2), (Y3, 3), (Y4, 4),
we have three simple estimates of β, namely (Y2 - Y1), (Y3 - Y2), and
(Y4 - Y3). If we average them as before, we get (Y4 - Y1)/3. This is disconcerting,
since this estimate does not use either Y2 or Y3. What does
least squares give? The values of x are -3/2, -1/2, +1/2, and +3/2,
and the estimate may be verified to be

b = (3Y4 + Y3 - Y2 - 3Y1)/10

The least squares result can be explained as follows. The quantity
(Y4 - Y1)/3 is an estimate of β, with variance 2σ_y·x²/9. The sample supplies
another independent estimate (Y3 - Y2), with variance 2σ_y·x². In
combining these two estimates, the principle of least squares weights
them inversely as their variances, assigning greater weight to the more
accurate estimate. This weighted estimate is

[9(Y4 - Y1)/3 + (Y3 - Y2)]/(9 + 1) = (3Y4 + Y3 - Y2 - 3Y1)/10 = b

As these examples show, it is easy to construct unbiased estimates of β
by simple, direct methods. The least squares approach automatically
produces the estimate with the smallest standard error.
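The agreement between the weighted combination and the least squares slope is easy to verify numerically. In the Python sketch below (ours), the Y values are arbitrary trial numbers, not data from the text:

    Y1, Y2, Y3, Y4 = 2.0, 3.5, 4.1, 6.3      # arbitrary trial values
    x = [-1.5, -0.5, 0.5, 1.5]
    Y = [Y1, Y2, Y3, Y4]

    b = sum(xi * yi for xi, yi in zip(x, Y)) / sum(xi * xi for xi in x)
    w = (3 * Y4 + Y3 - Y2 - 3 * Y1) / 10
    print(round(b, 6), round(w, 6))          # identical, whatever the Y's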
Remember that b estimates the average change in Y per unit increase
in X. Reporting a value of b requires that both units be stated, such as
"systolic blood pressure per year of age."
6.9-The situation when X varies from sample to sample. Often the
investigator does not select the values of X. Instead, he draws a sample
from some population, then measures two characters Y and X for each
member of the sample. In our illustration, the sample is a sample of
apple trees in which the relation between the percentage of wormy fruits
Y on a tree and the size X of its fruit crop is being investigated. In such
applications the investigator realizes that if he drew a second sample,
the values of X in that sample would differ from those in the first
sample. In the results presented in preceding sections, we regarded the
values of X as essentially fixed. The question is sometimes asked: can
these results be used when it is known that the X-values will change from
sample to sample?

Fortunately, the answer is yes, provided that for any value of X the
corresponding Y satisfies the three assumptions stated at the beginning
of section 6.4. For each X, the sample value of Y must be drawn from a
normal population that has mean μ = α + βx and constant variance σ_y·x².
Under these conditions the calculations for fitting the line, the t-test of b,
and the methods given later to construct confidence limits for β and for
the position of the true line all apply without change.

Consider, for instance, the accuracy with which β is estimated by b.
The standard error of b is σ_y·x/√(Σx²). If a second sample of n apple
trees were to be drawn, we know that Σx², and hence the standard error
of b, would change. That is, when X varies from sample to sample, some
samples of size n provide more accurate estimates of β than others. But
since the value of Σx² is known for the sample actually drawn, it makes
sense to attach to b the standard error σ_y·x/√(Σx²), or its estimate
s_y·x/√(Σx²). By doing so we take account of the fact that our b may be
somewhat more accurate or somewhat less accurate than is usual in a
sample of size n. In statistical theory this approach is sometimes described
as using the conditional distribution of b for the values of X that
we obtained in our sample, rather than the general distribution of b in
repeated samples of size n.
There is one important distinction between the two cases. Suppose
that in a study of families, the heights of pairs of adult brothers (X) and
sisters (Y) are measured. An investigator might be interested either in
the regression of sister's height on brother's height,

Ŷ = Ȳ + b_y·x(X - X̄),

or in the regression of brother's height on sister's height,

X̂ = X̄ + b_x·y(Y - Ȳ)

These two regression lines are different. For a sample of 11 pairs of
brothers and sisters, they are shown in figure 7.1.1 (p. 173). The line AB
in this figure is the regression of Y on X, while the line CD is the regression
of X on Y. Since b_y·x = Σxy/Σx² and b_x·y = Σxy/Σy², it follows that
b_y·x is not in general equal to 1/b_x·y, as it would have to be to make the
slopes AB and CD identical.
If the sample of pairs (X, Y) is a random one, the investigator may
use whichever regression is relevant for his purpose. In predicting
brother's heights from sister's heights, for instance, he uses the regression
of X on Y. If, however, he has deliberately selected his sample of values
of one of the variates, say X, then only the regression of Y on X has meaning
and stability. There are many reasons for selecting the values of X.
The levels of X may represent different amounts of a drug to be applied
to groups of animals, or persons of ages 25, 30, 35, 40, 45, selected for
convenience in calculating and graphing the regression of Y on age, or a
deliberate choice of extremes, so as to make Σx² large and decrease the
standard error of b, σ_y·x/√(Σx²). Provided that the X are selected without
seeing the corresponding Y values, the linear regression line of Y on X
is not distorted. Selection of the Y values, on the other hand, can greatly
change this regression. Clearly, if we choose Y values that are all equal,
the sample regression b of Y on X will be zero whatever the slope of the
population regression.
To turn to the numerical example, it contains another feature of
interest, a regression that is negative instead of positive.
TABLE 6.9.1
REGRESSION OF PERCENTAGE OF WORMY FRUIT ON SIZE OF APPLE CROP

Tree      Size of Crop on Tree     Percentage of     Estimate of     Deviation From
Number    (hundreds of fruits)     Wormy Fruits      Y               Regression
                  X                      Y             Ŷ             Y - Ŷ = d_y·x

  1               8                     59            56.14             2.86
  2               6                     58            58.17            -0.17
  3              11                     56            53.10             2.90
  4              22                     53            41.96            11.04
  5              14                     50            50.06            -0.06
  6              17                     45            47.03            -2.03
  7              18                     43            46.01            -3.01
  8              24                     42            39.94             2.06
  9              19                     39            45.00            -6.00
 10              23                     38            40.95            -2.95
 11              26                     30            37.91            -7.91
 12              40                     27            23.73             3.27

ΣX = 228                  ΣY = 540
X̄ = 19                    Ȳ = 45
ΣX² = 5,256               ΣY² = 25,522              ΣXY = 9,324
(ΣX)²/n = 4,332           (ΣY)²/n = 24,300          (ΣX)(ΣY)/n = 10,260

Σx² = 924                 Σy² = 1,222               Σxy = -936

b = Σxy/Σx² = -936/924 = -1.013 per cent per 100 fruits
Ŷ = Ȳ + b(X - X̄) = 45 - 1.013(X - 19) = 64.247 - 1.013X
Σd_y·x² = 1,222 - (-936)²/924 = 273.88
s_y·x² = Σd_y·x²/(n - 2) = 273.88/10 = 27.388
It is generally thought that the percentage of fruits attacked by codling
moth larvae is greater on apple trees bearing a small crop. Apparently the
density of the flying moth tends towards uniformity, so that the chance of
attack for any particular fruit is augmented if there are few fruits in the
tree. The data in table 6.9.1 are adapted from the results of an experiment
(3) containing evidence about this phenomenon. The 12 trees were all
given a calyx spray of lead arsenate followed by 5 cover sprays made up of
3 pounds of manganese arsenate and 1 quart of fish oil per 100 gallons.
There is a decided tendency, emphasized in figure 6.9.1, for the percentage
of wormy fruits to decrease as the number of apples in the tree increases.
In this particular group of trees, the relation of the two variates is even
closer than usual.

FIG. 6.9.1-Sample regression of percentage of wormy fruits on size of crop in apple
trees. The cross indicates the origin for deviations, O′(X̄, Ȳ).

The new feature in the calculations is the majority of negative products,
xy, caused by the tendency of small values of Y to be associated
with large values of X. The sample regression coefficient shows that the
estimated percentage of wormy apples decreases, as indicated by the minus
sign, by 1.013 with each increase of 100 fruits in the crop. The sample regression
line, and of course the percentage, falls away from the point O′(X̄, Ȳ)
by 1.013 for each unit of crop above 19 hundreds.

The regression line brings into prominence the deviations from this
moving average, deviations which measure the failure of crop size to account
for variation in the intensity of infestation. Trees number 4, 9, and
11 had notably discrepant percentages of injured fruits, while numbers 2
and 5 performed as expected. According to the model these are random
deviations from the average (regression) values, but close observation of
the trees during the flight of the moths might reveal some characteristics of
this phenomenon. Tree 4 might have been on the side from which the
flight originated, or perhaps its shape or situation caused poor applications
of the spray. Trees 9 and 11 might have had some peculiarities of conformation
of foliage that protected them. Careful study of trees 2 and 5
might throw light on the kind of tree or location that receives normal infestation.
This kind of case study usually does not affect the handling of
the sample statistics, but it may add to the investigator's knowledge of his
experimental material and may afford clues to the improvement of future
experiments.

Among attitudes toward experimental data, two extremes exist, both
of which should be avoided: some attend only to minute details of sample
variation, neglecting the summarization of the data and the consequent
inferences about the population; others are impatient of the data themselves,
rushing headlong toward averages and other generalizations.
Either course fails to yield full information from the experiment. The
competent investigator takes time to examine each datum together with
the individual measured. He attempts to distinguish normal variation
from aberrant observations. He then appraises his summary statistics
and his population inferences and draws his conclusions against this background
of sample facts.
EXAMPLE 6.9.1-Another group of 12 trees, investigated by Hansberry and Richardson,
was sprayed with lead arsenate throughout the season. In addition, the fourth and fifth
cover sprays contained 1% mineral oil emulsion and nicotine sulfate at the rate of 1 pint per
100 gallons. The results are shown below. These facts may be verified: ΣX = 240, ΣY
= 384, Σx² = 808, Σy² = 1,428, Σxy = -582, regression coefficient = -0.7203, Ŷ = 46.41
- 0.7203X. Y - Ŷ for the first tree = 16.40%.

Size of Crop, X (hundreds)   15, 15, 12, 26, 18, 12, 8, 38, 26, 19, 29, 22

Percentage Wormy, Y          52, 46, 38, 37, 37, 37, 34, 25, 22, 22, 20, 14
EXAMPLE 6.9.2-In table 6.9.1, calculate Σd_y·x² = 273.88 by means of the formula
given in section 6.2.
EXAMPLE 6.9.3-The following weights of body and comb of 15-day-old White
Leghorn male chicks are adapted from Snedecor and Breneman (4):

Chick Number                   1    2    3    4    5    6    7    8    9   10

Body weight (grams), X        83   72   69   90   90   95   95   91   75   70

Comb weight (milligrams), Y   56   42   18   84   56  107   90   68   31   48

Calculate the sample regression equation, Ŷ = 60 + 2.302(X - 83).

EXAMPLE 6.9.4-Construct the graph of the chick data, plotting body weight along
the horizontal axis. Insert the regression line.
6.10-Interval estimates of β and tests of null hypotheses. Being provided
with point estimates of the parameters of the regression population,
we turn to their interval estimates and to tests of hypotheses about them.

First in order of utility, there is the sample regression coefficient b,
an estimate of β. As seen in section 6.2, in random sampling, b is distributed
with a variance estimated by

s_b² = s_y·x²/Σx²

Thus, in the apple sampling of table 6.9.1,

s_b² = 27.388/924 = 0.0296;   s_b = 0.172%

Moreover, since the quantity (b - β)/s_b follows the t-distribution with
n - 2 degrees of freedom, it may be said with 95% confidence that

b - t0.05·s_b ≤ β ≤ b + t0.05·s_b

For the apples, d.f. = 10, t0.05 = 2.228, t0.05·s_b = (2.228)(0.172) = 0.383,

b - t0.05·s_b = -1.013 - 0.383 = -1.396 per cent per 100 fruits,
b + t0.05·s_b = -1.013 + 0.383 = -0.630 per cent per 100 fruits,

and, finally,

-1.396 ≤ β ≤ -0.630

If it is said that the population regression coefficient is within these limits,
the statement is right unless the sample is one of the divergent kind that
occurs about once in 20 trials.
Instead of the interval estimate of β, interest may lie in testing some
null hypothesis. While it is now rather obvious that H0: β = 0 will be
rejected, we proceed with the illustration; if there were any other pertinent
value of β to be tested, we could use that instead. Since (b - β)/s_b follows
the t-distribution, we put

t = (b - β)/s_b = (-1.013 - 0)/0.172 = -5.89,   d.f. = n - 2 = 10

The sign is ignored because the table contains both halves of the distribution.
H0 is rejected. One concludes that in the population sampled there
is a regression of percentage wormy apples on crop size, the value likely
being between -0.630 and -1.396 per cent per 100 fruits.
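The interval and test can be reproduced in a few lines; the Python sketch below (ours) assumes scipy is available for the t percentage point (the table value 2.228 used above):

    from math import sqrt
    from scipy.stats import t as tdist

    b, s2_yx, Sxx, n = -1.013, 27.388, 924, 12
    s_b = sqrt(s2_yx / Sxx)                  # 0.172
    t05 = tdist.ppf(0.975, n - 2)            # 2.228 for 10 d.f.
    lo, hi = b - t05 * s_b, b + t05 * s_b    # -1.396 and -0.630, apart from rounding
    t_stat = b / s_b                         # -5.89, apart from rounding
    print(round(s_b, 3), round(lo, 3), round(hi, 3), round(t_stat, 2))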

6.11-Prediction of the population regression line. Next, we may wish
to make inferences about μ = α + βx, that is, about the height of the population
regression line at the point X. The sample estimate of μ is Ŷ = Ȳ
+ bx. The error in the prediction is

Ŷ - μ = (Ȳ - α) + (b - β)x
But since Y = α + βx + ε, we have Ȳ = α + ε̄, giving

Ŷ - μ = ε̄ + (b - β)x     (6.11.1)

The term ε̄ has variance σ_y·x²/n. Further, b is distributed about β with
variance σ_y·x²/Σx². Finally, the independence of the ε's guarantees that
these two sources of error are uncorrelated, so that the variance of their
sum is the sum of the two variances. This gives

σ_Ŷ² = σ_y·x²(1/n + x²/Σx²)

The estimated standard error of Ŷ is

s_Ŷ = s_y·x·√[(1/n) + (x²/Σx²)]     (6.11.2)

with (n - 2) d.f. For the apples, s_y·x = √27.388, n = 12, and Σx² = 924:

s_Ŷ = √{27.388[(1/12) + (x²/924)]} = √(2.282 + 0.02964x²)
For trees with a high crop like that of Tree 12, x = 21 and s_Ŷ = 3.92%,
notably greater than s_Ŷ = 1.51% at x = 0. The reason why s_Ŷ increases
as X recedes from X̄ is evident from the term (b - β)x in equation (6.11.1).
The effect of any error in b is steadily magnified as x becomes greater.

Corresponding to any Ŷ, the point estimate of μ, there is an interval
estimate

Ŷ - t0.05·s_Ŷ ≤ μ ≤ Ŷ + t0.05·s_Ŷ

One might wish to estimate the mean percentage of wormy apples, μ, at
the point X = 30 hundreds of fruits. If so,

x = X - X̄ = 30 - 19 = 11 hundreds of fruits
Ŷ = Ȳ + bx = 45 - (1.013)(11) = 33.86%
t0.05·s_Ŷ = (2.228)√[2.282 + (0.02964)(11²)] = 5.40%
33.86 - 5.40 ≤ μ ≤ 33.86 + 5.40

Finally,

28.46% ≤ μ ≤ 39.26%

At X = 30 hundreds of fruits, the population mean μ is estimated as
33.86% wormy fruits with 0.95 confidence limits from 28.46% to 39.26%.
This confidence interval is represented by AB in figure 6.11.1.

If calculations like this are done for various values of X and if the
confidence limits are plotted above and below the sample regression line,
one has a confidence belt or zone with curved borders DB and CA in
figure 6.11.1. The curves are the branches of a hyperbola. We have
confidence that μ, for any X, lies in the belt. The figure emphasizes the
increasing hazard of making predictions at X far removed from X̄.
FIG. 6.11.1-Confidence belts for μ, ABCD; and for Y, EFGH: the apple data.
6.12-Prediction of an individual Y. A further use of regression is to
predict the individual value of Y for a new member of the population for
which X has been measured. The predicted value is again Ŷ = Ȳ + bx,
but since Y = α + βx + ε, the error of the prediction now becomes

Ŷ - Y = (Ȳ - α) + (b - β)x - ε

The random element ε for the new member is an additional source of uncertainty.
So, the mean square error of the predicted value contains
another term, being

s_Ŷ² = s_y·x² + s_y·x²/n + s_y·x²·x²/Σx²

Since the term arising from the variance of ε usually dominates, the standard
error is usually written as

s_Ŷ = s_y·x·√[1 + (1/n) + (x²/Σx²)]     (6.12.1)
It is important not to confuse the two types of prediction. If the regression
of weight on height were worked out for a sample of 20-year-old
males, the purpose might be to predict the average weight of 20-year-old
males of a specific height. This is prediction of μ, given X. Alternatively,
we might want to predict the weight of a new male whose height is known.
This is prediction of an individual Y, given X.

The two prediction problems have the interesting feature that the prediction,
Ŷ, is exactly the same in the two problems, but the standard error
of the prediction differs (compare equations [6.11.2] and [6.12.1]). To
avoid confusion, use the symbols μ̂ and s_μ̂ when a population average is
being predicted, and Ŷ and s_Ŷ when an individual Y is being predicted.
For example, if you wish to predict the percentage of wormy apples on a
tree yielding 30 hundreds of fruits,

t0.05·s_Ŷ = 2.228√{27.388[1 + (1/12) + (11)²/924]} = 12.85%

From Ihis and Y = 33.86%, the confidence interval is given by
33.86 - 12.85 $ Y $ 33.86 + 12.85
or.
21.01'\, $ Y $ 46.71%.
as shown by EF, figure 6.11.1. We conclude that for trees hearing 3.000
fruits, population values of percentage wormy fruits fall between 21.01%
and 46.71 '10 unless a l-in-20 chance has occurred in the sampling.
Conlinlllllg this procedure. a confidence belt HF and GE for Y may
be plolled as in the figure. It is to be observed that all the sample points
lie in the belt. In general about 5% of them are expected to fall outside.
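The contrast between the two kinds of prediction is clearest when the two standard errors are computed side by side at the same point. The Python sketch below (ours) does this for the apple data at x = 11 (X = 30):

    from math import sqrt

    s2_yx, n, Sxx, t05 = 27.388, 12, 924, 2.228
    x = 11
    s_mean = sqrt(s2_yx * (1 / n + x * x / Sxx))        # for the population mean
    s_indiv = sqrt(s2_yx * (1 + 1 / n + x * x / Sxx))   # for a new individual tree
    print(round(t05 * s_mean, 2), round(t05 * s_indiv, 2))   # 5.40 and 12.85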
Unfortunately, the meaning of this confidence band is apt to be misunderstood.
Suppose that we construct 95% confidence intervals for the
Y values of a large number of new individual specimens that all have the
same value of X. The 95% confidence probability is correct if for each new
specimen we draw a new sample of values of (Y, X), compute a new sample
regression line and value of s_y·x, and construct a new confidence interval
from these data. If we make a large number of confidence interval statements
from the same sample regression line, the proportion of these statements
that is correct is not 95% for a specific line, but may be more or less.
If the sample from which the regression line was computed happens to
give an unusually low value of s_y·x, so that the confidence band is narrower
than usual, less than 95% of the confidence interval statements is
likely to be correct.

This point can be illustrated from the line constructed in table 6.4.1
(p. 143) as an example of the regression model. The sample line is
1.44 + 0.468X, and has the value 2.376 at X = 2. Further, s_Ŷ at X = 2 is
found to be 1.325, and t0.05, for 4 d.f., is 2.776. Hence, the 95% confidence
limits for an individual Y at X = 2 are 2.376 ± (2.776)(1.325),
giving -1.302 and 6.054.
But we know from the population model that any new Y at X = 2
is normally distributed with μ = 5 and σ = 1. The probability that this
Y lies between 0.948 and 8.484 is easily calculated from the normal table.
It is practically 100%, instead of 95%. In fact, with this sample line, the
95% confidence probability statements are conservative in this way at all
six values of X.
The worker who makes many predictions from the same sample line
naturally wants some kind of probability statement that applies to his
line. The available techniques are described by Acton (11).
EXAMPLE 6.12.1-In the regression of comb weight of chicks on body weight, ex-
ample 6.9.3, n = 10, X̄ = 83 gms., Ȳ = 60 mg., Σx² = 1,000, Σy² = 6,854, and Σxy = 2,302.
Set 95% confidence limits on α, assuming the same set of body weights. Ans. 49.8 - 70.2 mg.
EXAMPLE 6.12.2-In the chick data, b = 2.302. Test the hypothesis that β = 0.
Ans. t = 5.22, P < 0.01.
EXAMPLE 6.12.3-Since evidently there is a population regression of comb weight
on body weight, set 95% limits to the regression coefficient. Ans. 1.28 - 3.32 mg. per gm.
EXAMPLE 6.12.4-Predict the population average comb weight of 100-gm. chicks.
Ans. 99.1 mg. with 95% limits, 79.0 - 119.2 mg.
EXAMPLE 6.12.5-Set 95% confidence limits to the forecast of the comb weight of a
randomly chosen 100-gm. chick. Ans. 61.3 - 136.9 mg.
EXAMPLE 6.12.6-In the Indianapolis motor races (example 6.3.2), estimate the speed
for the year 1946, for which the coded X is 35, and give 95% limits, remembering that in-
dividual speeds are being estimated. Ans. 122.3 miles per hour with 95% limits
118.9 - 125.7. The actual speed in 1946 was 114.8 miles per hour, lying outside the limits.
The regression formula overestimated the speeds consistently in the ten years following 1945.
EXAMPLE 6.12.7-Construct 80% confidence bands for the individual race results in
the period 1911-1941. Since there were 29 races, you should find about 6 results lying out-
side the band.
EXAMPLE 6.12.8-In time series such as these races, the assumption that the ε are
independent of each other may not hold. Winning of successive races by the same man,
type of car, or racing technique all raise doubts on this point. If the ε are not independent,
Ȳ and b remain unbiased estimates of α and β, but they are no longer the most precise
estimates, and the formulas for standard errors and confidence limits become incorrect.

6.13-Testing a deviation that looks suspiciously large. When Y is
plotted against X, one or two points sometimes look as if they lie far from
the regression line. When the line has been computed, we can examine
this question further by drawing the line and looking at the deviations for
these points, or by calculating the values of d_y·x for them.
In this process one needs some guidance with respect to the question:
When is a deviation large enough to excite suspicion? A test of signifi-
cance is carried out as follows:
1. Select the point with the largest d_y·x (in absolute value). As an
illustration, we use the regression of wormy fruit on size of apple crop,
table 6.9.1 and figure 6.9.1, p. 151. We have already commented that for
tree 4, with X = 22, Y = 53, the deviation d_y·x = 11.04 looks large.
2. Recompute the regression with this point omitted. This requires
little work, since from the values ΣX, ΣY, ΣX², ΣY², and ΣXY we simply
subtract the contribution for tree 4. We find for the remaining n - 1 = 11
points:

X̄ = 18.73,  Σx² = 914
Ŷ = 44.27 - 1.053x,  s²_y·x = 15.50, with 9 d.f.

3. For the suspect, x = 22 - 18.73 = 3.27, Ŷ = 44.27 - (1.053)(3.27)
= 40.83, Y = 53.
4. Since the suspect was not used in computing this line, we can re-
gard it as a new member of the population, and test whether its deviation
from the line is within sampling error. We have Y - Ŷ = 53 - 40.83
= 12.17. Since formula 6.12.1 is applicable to the reduced sample of size
(n - 1), the variance due to sampling errors is

s²_{Y-Ŷ} = s²_y·x {1 + 1/(n - 1) + x²/Σx²}
        = (15.50){1 + 1/11 + (3.27)²/914} = (15.50)(1.1026) = 17.09

The value of t is

t = (Y - Ŷ)/s_{Y-Ŷ} = 12.17/√17.09 = 2.943,

with 9 d.f. The 2% level of t is 2.821 and the 1% level is 3.250. By in-
terpolation, P is about 0.019.
As it stands, however, this t-test does not apply, because the test
assumes that the new member is randomly drawn. Instead, we selected
it because it gave the largest deviation of the 12 points. If P is the prob-
ability that t for a random deviation exceeds some value t₀, then for small
values of P the probability that t_max (computed for the largest of n devia-
tions) exceeds t₀ is roughly nP. Consequently, the significance probability
for our t-test is approximately (12)(0.019) = 0.23, and the null hypothesis
is not rejected.
When the null hypothesis is rejected, this calls for an inquiry to see
whether there were any circumstances peculiar to this point, or any error
of measurement or recording, that caused the large deviation. In some
cases an error is unearthed and corrected. In others, some extraneous
causal factor that made the point aberrant is discovered, although the
fault cannot be corrected. In this event, the point should be omitted in the
line that is to be reported and used, provided that the causal factor is
known to affect only this point. When no explanation is found the situa-
tion is perplexing. It is usually best to examine the conclusions obtained
with the suspect (i) included, (ii) excluded. If these conclusions differ
materially, as they sometimes do, it is well to note that either may be
correct.
6.14-Prediction of X from Y. Linear calibration. In some applica-
tions the regression line is used to predict X from Y, but is constructed by
measuring Y at selected values of X. In this event, as pointed out in the
discussion in section 6.9 (p. 150), the prediction must be made from the
regression of Y on X. For example, X may be the concentration of some
element (e.g., boron or iron) in a liquid or in plant fiber and Y a quick
chemical or photometric measurement that is linearly related to X. The
investigator makes up a series of specimens with known amounts of X
and measures Y for each specimen. From these data the calibration
curve, the linear regression of Y on X, is computed. Having measured Y
for a new specimen, the estimate of x = X - X̄ is

x̂ = (Y - Ȳ)/b

Confidence limits for x and X are obtained from the method in sec-
tion 6.12 by which we obtained confidence limits for Y given x. As an
illustration we cite the example of sections 6.11-6.12 in which Y = per-
centage of wormy fruits, X = size of crop (though with these data we
would in practice use the regression of X on Y, since both regressions are
meaningful).
We shall find 95% confidence limits for the size of crop in a new tree
with 40 per cent of wormy fruit. Turn to figure 6.11.1 (p. 155). Draw a
horizontal line at Y = 40. The two confidence limits are the values of X
at the points where this line meets the confidence curves GE and HF.
Our eye readings were X = 12 and X = 38. The point estimate X̂ of X is,
of course, the value of X, 24, at which the horizontal line meets the fitted
regression line.
For a numerical solution, the fitted line is Ŷ = Ȳ + bx, where Ȳ = 45,
b = -1.013. Hence the value of x when Y = 40 is estimated as

x̂ = (Y - Ȳ)/b = (40 - 45)/(-1.013) = 4.936;  X̂ = 23.9 hundreds
To find the 95% confidence limits for x we start with the confidence
limits of Y given x:

Y = Ȳ + bx ± t s_y·x √(1 + 1/n + x²/Σ)   (6.14.1)

where Σ denotes Σx² and t is the 5% level for (n - 2) d.f. Expression
(6.14.1) is solved as a quadratic equation in x for given Y. After some
manipulation the two roots can be expressed in the following form, which
appears the easiest for numerical work:

x = [x̂ ± (t s_y·x/b)√{((n + 1)/n)(1 - c²) + x̂²/Σ}] / (1 - c²)   (6.14.2)

where

c = t s_b/b,  so that  c² = (t s_y·x/b)²/Σ

In this example n = 12, t = 2.228 (10 d.f.), s_y·x = 5.233, Σ = 924,
b = -1.013, x̂ = 4.936. These give

t s_y·x/b = (2.228)(5.233)/(-1.013) = -11.509,  c² = (11.509)²/924 = 0.1434

From (6.14.2) the limits for x are

x = [4.936 ± (11.509)√{(1.0833)(0.8566) + 0.0264}] / 0.8566

This gives -7.4 and +18.9 for x, or 11.6 and 37.9 for X, in close agreement
with the graphical estimate.
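The same limits can be checked by machine. The Python sketch below (ours; all quantities are those quoted above) simply evaluates equation (6.14.2):

```python
from math import sqrt

# Calibration (inverse prediction) limits, equation (6.14.2)
n, t, s_yx, S, b = 12, 2.228, 5.233, 924.0, -1.013
ybar, xbar, y_new = 45.0, 19.0, 40.0

x_hat = (y_new - ybar) / b            # 4.936
k = t * s_yx / b                      # -11.509
c2 = k**2 / S                         # 0.1434
root = sqrt((n + 1)/n * (1 - c2) + x_hat**2 / S)
lo = (x_hat - abs(k) * root) / (1 - c2)
hi = (x_hat + abs(k) * root) / (1 - c2)
print(round(xbar + x_hat, 1))                    # 23.9 hundreds of fruits
print(round(xbar + lo, 1), round(xbar + hi, 1))  # 11.6 37.9
```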
The quantity c = t s_b/b is related to the test of significance of b. If
b is significant at the 5% level, b/s_b > t, so that c < 1 and hence c² < 1. If
b is not significant, the denominator in equation (6.14.2) becomes negative,
and finite confidence limits cannot be found by this approach. If c is small
(b highly significant), c² is negligible and the limits become

x̂ ± (t s_y·x/b)√(1 + 1/n + x̂²/Σx²)

These are of the form x̂ ± t s_x̂, where s_x̂ denotes the factor that multiplies
t. In large samples, s_x̂ can be shown to be the estimated standard error of
x̂, as this result suggests.
In practice, Y is sometimes the average of m independent measure-
ments on the new specimen. The number 1 under the square root sign
in (6.14.1) then becomes 1/m.
6.15-Partitioning the sum of squares of the dependent variate. Re-
gression computations may be looked upon as a process of partitioning
ΣY² into 3 parts which are both useful and meaningful. You have become
accustomed to dividing ΣY² into (ΣY)²/n and the remainder, Σy²; then
subdividing Σy² into (Σxy)²/Σx² and Σd²_y·x. This means that you have
divided ΣY² into three portions:

ΣY² = (ΣY)²/n + (Σxy)²/Σx² + Σd²_y·x

Each of these portions can be associated exactly with the sum of squares of
a segment of the ordinates, Y. To illustrate this, a simple set of data has
been set up in table 6.15.1 and graphed in figure 6.15.1.
In the figure the ordinate at X = 12 is partitioned into 3 segments:

Y = Ȳ + ŷ + d_y·x,

where ŷ = Ŷ - Ȳ = bx is the deviation of the point Ŷ on the fitted line
from Ȳ. Each of the other ordinates may be divided similarly, though
TABLE 6.15.1
DATA SET UP TO ILLUSTRATE THE PARTITION OF ΣY²

X   2   4   6   8   10   12   14     ΣX = 56
Y   4   2   5   9    3   11    8     ΣY = 42

n = 7, X̄ = 8, Ȳ = 6, Σx² = 112, Σy² = 68, Σxy = 56

negative segments make the geometry less obvious. The lengths are all set
out in table 6.15.2 and the several segments are emphasized in figure 6.15.1.
Observe that in each line of the table (including the two at the bottom)
the sum of the last three numbers is equal to the number in column Y.
Corresponding to the relation

Y = Ȳ + ŷ + d_y·x,

we have the following identity in the sums of squares:

ΣY² = ΣȲ² + Σŷ² + Σd²_y·x,

each of the three product terms being zero. The sums of squares of the
ordinates, ΣY² = 320, and of the deviations from regression, Σd²_y·x = 40,

FIG. 6.15.1-Graph of data in table 6.15.1, with the fitted regression Ŷ = 6 + 0.5(X - 8).
The ordinate at X = 12 is shown divided into 2 parts, Ȳ = 6 and y = 5. Then y is subdivided
into ŷ = 2 and d_y·x = 3. Thus Y = Ȳ + ŷ + d_y·x = 6 + 2 + 3 = 11.
TABLE 6.15.2
LENGTHS OF ORDINATES IN TABLE 6.15.1 TOGETHER WITH
SEGMENTS INTO WHICH THEY ARE PARTITIONED

                                                    Deviation From
Pair Number       Ordinate     Mean    Deviation    Regression
                     Y           Ȳ        ŷ            d_y·x
1                    4           6       -3             1
2                    2           6       -2            -2
3                    5           6       -1             0
4                    9           6        0             3
5                    3           6        1            -4
6                   11           6        2             3
7                    8           6        3            -1

Sum                 42          42        0             0
Sum of squares     320         252       28            40

are already familiar. It remains to identify (ΣY)²/n with ΣȲ² and
(Σxy)²/Σx² with Σŷ². First,

(ΣY)²/n = (nȲ)²/n = nȲ² = ΣȲ²

That is, the correction for the mean is simply the sum of squares of the
mean taken n times. Second,

(Σxy)²/Σx² = {(Σxy)²/(Σx²)²}Σx² = b²Σx² = Σb²x² = Σŷ²

So the sum of squares attributable to the regression turns out to be the
sum of squares of the deviations of the points Ŷ on the fitted line from their
mean.
The vanishing of the cross-product terms is easily verified by the
method used in section 6.6.
Corresponding to the partition of ΣY² there is a partition of the

TABLE 6.15.3
ANALYSIS OF VARIANCE OF Y IN TABLE 6.15.1

Source of Variation          Symbol   Degrees of Freedom   Sum of Squares      Mean Square
The mean                     Ȳ        1                    (ΣY)²/n = 252
Regression                   b        1                    (Σxy)²/Σx² = 28
Deviations from regression   d_y·x    n - 2 = 5            Σd²_y·x = 40        s²_y·x = 8
Total                        Y        n = 7                ΣY² = 320

Σy² = 28 + 40 = 68, d.f. = n - 1 = 6

total degrees of freedom into three parts. Both partitions are shown in
table 6.15.3. The n = 7 observations contribute 7 degrees of freedom, of
which 1 is associated with the mean and 1 with the regression coefficient
b, leaving 5 for the deviations from regression. In most applications
the first line in this table is omitted as being of no interest, the break-
down taking the form presented in table 6.15.4.
TABLE 6.15.4
ANALYSIS OF VARIANCE OF Y IN TABLE 6.15.1

Source of Variation          Degrees of Freedom   Sum of Squares   Mean Square
Regression                   1                    28
Deviations from regression   5                    40               8
Deviations from mean         6                    68               11.3

Table 6.15.4 is an analysis of variance table. In addition to providing
a neat summary of calculations about variability, it proves of great utility
when we come to study curved regressions and comparisons among more
than two means. The present section is merely an introduction to the
technique, one of the major contributions of R. A. Fisher (5).
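The partition is easily verified by machine. The sketch below (ours, not from the text) recomputes the three components for the data of table 6.15.1:

```python
# Partition of the sum of squares, data of table 6.15.1
X = [2, 4, 6, 8, 10, 12, 14]
Y = [4, 2, 5, 9, 3, 11, 8]
n = len(X)
xbar, ybar = sum(X) / n, sum(Y) / n          # 8 and 6

Sxy = sum((x - xbar) * (y - ybar) for x, y in zip(X, Y))  # 56
Sxx = sum((x - xbar) ** 2 for x in X)                     # 112
b = Sxy / Sxx                                             # 0.5

ss_mean = n * ybar**2                         # (sum Y)^2 / n = 252
ss_reg = Sxy**2 / Sxx                         # 28
ss_dev = sum((y - ybar - b * (x - xbar))**2 for x, y in zip(X, Y))  # 40
print(ss_mean, ss_reg, ss_dev)                # 252.0 28.0 40.0
print(ss_mean + ss_reg + ss_dev == sum(y**2 for y in Y))  # True: total 320
```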
EXAMPLE 6.15.1-Dawes (6) determined the "density" of the melanin content of the
skin of 24 male frogs together with their weights. Since "some of the 24 males ... were
selected for extreme duskiness or pallor so as to provide a measure of the extent of variabil-
ity," that is, since selection was exercised on density, this variate must be taken as X.

Density, X   0.13  0.15  0.28  0.58  0.68  0.31  0.35  0.58
Weight, Y      13    18    18    18    18    19    21    22

Density, X   0.03  0.69  0.38  0.54  1.00  0.73  0.77  0.82
Weight, Y      22    24    25    25    25    27    27    27

Density, X   1.29  0.70  0.38  0.54  1.08  0.86  0.40  1.67
Weight, Y      28    29    30    30    35    37    39    42

Calculate X̄ = 0.6225 units, Ȳ = 25.79 grams, Σx² = 3.3276, Σy² = 1,211.96, Σxy = 40.022.
EXAMPLE 6.15.2-In example 6.15.1 test the hypothesis β = 0. Ans. t = 3.81, P
< 0.01.
EXAMPLE 6.15.3-Analyze the variance of the frog weights, as follows:

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square
Mean                  1                    15,965.04
Regression            1                    481.36
Deviations            22                   730.60           33.21
Total                 24                   17,177.00

EXAMPLE 6.15.4-How nearly free from error is the measurement of melanin
density, X? After preparation of a solution from the skin of the frogs, the intensity of the
color was evaluated in a colorimeter and the readings then transferred graphically into
neutral densities. The figures reported are means of from 3 to 6 determinations. The error
of this kind of measurement is usually appreciable. This makes the estimate of regression
biased downwards. Had not the investigator wished to learn about extremes of density,
the regression of density on weight might have been not only unbiased but more informative.

6.16-Galton's use of the term "regression." In his studies of in-
heritance Galton developed the idea of regression. Of the "law of uni-
versal regression" (7) he said, "Each peculiarity in a man is shared by his
kinsman, but on the average in a less degree." His friend, Karl Pearson
(8), collected more than a thousand records of heights of members of
family groups. Figure 6.16.1 shows his regression of son's height on
father's.

"
,.
"
V
0

~ ,. r- I .)V'
I
V
%
o
;;;

.•••• / I
%

I
Vi
o i
• I i
I

0 Y ---- - - -

.. V "/
6. •• •• ••
I
FATHER S
•• ,•
HE1GHT (inch.,)
, .
,
FIG. 6. 16. I-Rcgre$sion orson's stature on father's (8). Y= O.516X + 33.73.
1,078 families.

Though tall fathers do tend to have tall sons, yet the average
height of sons of a group of tall fathers is less than their fathers' height.
There is a regression, or going back, of sons' heights toward the average
height of all men, as evidenced by the regression coefficient, 0.516, sub-
stantially less than 1.
6.17-Regression when X is subject to error. Thus far we have as-
sumed that the X-variable in regression is measured without error. Since
no measuring instrument is perfect, this assumption is often unrealistic.
A more realistic model is one that assumes Y = α + β(X - X̄) + ε as be-
fore, but regards X as an unknown true value. Our measurement of X is
X' = X + e, where e is the error of measurement. For any specimen we
know (Y, X') but not X.
If the measurement is unbiased, e, like ε, is a random variable follow-
ing a distribution with mean 0. The errors e may arise from several
sources. For instance, if X is the average price of a commodity or the
average family income in a region of a country, this is usually estimated
from a sample of shops or of families, so that X' is subject to a sampling
error. With some concepts like "educational level" or "economic status"
there may be no fully satisfactory method of measurement, so that e may
represent in part measurement of the wrong concept.
If e, ε, and the true X are all normally and independently distributed,
it is known that Y and X' follow a bivariate normal distribution (section
7.4). The regression of Y on X' is linear, with regression coefficient

β' = β/(1 + λ),

where λ = σe²/σx². (If X is not normal, this result holds in large samples
and approximately in small samples if λ is small.) Thus, with errors in
X, the sample regression coefficient, b', of Y on X' no longer provides an
unbiased estimate of β, but of β/(1 + λ).
If the principal objective is to estimate β, often called the structural
regression coefficient, the extent of this distortion downwards is de-
termined by the ratio λ = σe²/σx². Sometimes it is possible to obtain an
estimate se² of σe². Since σX'² = σx² + σe², an estimate of λ is
λ̂ = se²/(sX'² - se²). From λ̂ we can judge whether the downward bias is
negligible or not. If it is not negligible, the revised estimate b'(1 + λ̂)
should remove most of the bias.
In laboratory experimentation, λ is often small even with a measuring
instrument that is not highly accurate. For example, suppose that σx
= 20, μx = 100, so that nearly all the values of the true X's lie between 50
and 150. Consider σe = 3. This implies that about half of the true X's are
measured with an error greater than 2 and about one-third of them with
an error greater than 3, a rather imprecise standard of performance.
Nevertheless, λ is only 9/400 = 0.022.
If the objective is to predict the population regression line or the
value of an individual Y from the sample of values (Y, X'), the methods
of sections 6.11 and 6.12 may still be used, with X' in place of X, provided
that X, e, and ε are approximately normal. The presence of errors in X
decreases the accuracy of the predictions, because the residual variance
is increased, though to a minor extent if λ is small. The relation between
σ²_Y·X' and σ²_Y·X may be put in two equivalent forms:

σY² - σ²_Y·X' = (σY² - σ²_Y·X)/(1 + λ)   (6.17.1)
or,
σ²_Y·X' = σ²_Y·X + λ(σY² - σ²_Y·X)/(1 + λ)   (6.17.2)
Berkson (10) has pointed out an exception to the above analysis.
Many laboratory experiments are conducted by setting X' at a series of
fixed values. For instance, a voltage may be set at a series of prede-
termined levels X₁', X₂', ... on a voltmeter. Owing to errors in the volt-
meter or other defects in the apparatus, the true voltages X₁, X₂, ...
differ from the set voltages.
In this situation we still have Y = α + βX + ε, X' = X + e. In both
our original case (X normal) and in Berkson's case (X' fixed) it follows
that

Y = α + βX' + (ε - βe)   (6.17.3)

The difference is this. In our case, e and X' are correlated because of the
relation X' = X + e. Consequently, the residual (ε - βe) is correlated
with X' and does not have a mean zero for fixed X'. This vitiates Assump-
tion 2 of the basic model (section 6.4). With X' fixed, however, e is
correlated with X but not with X', and the model (6.17.3) satisfies the
assumptions for a linear regression. The important practical conclusion
is that b', the regression of Y on X', remains an unbiased estimate of β.
6.18-Fitting a straight line through the origin. For some data the
nature of the variables Y and X makes it clear that when X = 0, Y must be
0. If a straight-line regression appears to be a satisfactory fit, we have the
relation

Y = βX + ε,

where, in the simplest situations, the residual ε follows N(0, σ). The least
squares estimate of β is b = ΣXY/ΣX². The residual mean square is

s²_y·x = {ΣY² - (ΣXY)²/ΣX²}/(n - 1),

with (n - 1) d.f. Confidence limits for β are

b ± t s_b,

where s_b = s_y·x/√ΣX² and t is read from the t-table with (n - 1) d.f. and
the appropriate probability.
This model should not be adopted without careful inspection of the
data, since complications can arise. If the sample values of X are all some
distance from zero, plotting may show that a straight line through the
origin is a poor fit, although a straight line that is not forced to go through
the origin seems adequate. The explanation may be that the population
relation between Y and X is curved, the curvature being marked near zero
but slight in the range within which X has been measured. A straight line
of the form (a + bX) will then be a good approximation within the sample
range, though untrustworthy for extrapolation. If the mathematical form
of the curved relation is known, it may be fitted by methods outlined in
chapter 15.
It is sometimes useful to test the null hypothesis that the line, as-
sumed straight, goes through the origin. The first step is to fit the usual
two-parameter line (α + βX), i.e., α + β(X - X̄), by the methods given
earlier in this chapter. The condition that the population line goes
through the origin is α - βX̄ = 0. The sample estimate of this quantity
is Ȳ - bX̄, with estimated variance

s²_y·x (1/n + X̄²/Σx²)

Hence, the value of t for the test of significance is

t = (Ȳ - bX̄) / {s_y·x √(1/n + X̄²/Σx²)}   (6.18.1)

with (n - 2) d.f. This test is a particular case of the technique presented
in section 6.11 for finding confidence limits for the population mean value
of Y corresponding to a given value of X.
The following example comes from a study (9) of the forces necessary
to draw plows at the speeds commonly attained by tractors. Those results
of the regression calculations that are needed are shown under table 6.18.1.

TABLE 6.18.1
DRAFT AND SPEED OF PLOWS DRAWN BY TRACTORS

Draft (lbs.), Y    425  420  480  495  540  530  590  610  690  680
Speed (m.p.h.), X  0.9  1.3  2.0  2.7  3.4  3.4  4.1  5.2  5.5  6.0

X̄ = 3.45 m.p.h.,  Ȳ = 546 lbs.,  n = 10
Σx² = 27.985,  Σy² = 82,490,  Σxy = 1,492.0
b = 53.31 lbs. per m.p.h.
s²_y·x = 368.1 with 8 d.f.

One might suggest that the line should go through the origin, since
when the plow is not moving there is no draft. However, inspection of
table 6.18.1, or a plot of the points, makes it clear that when the line is
extrapolated to X = 0, the predicted Y is well above 0, as would be
expected, since inertia must be overcome to get the plow moving. From
(6.18.1) we have

t = [546 - (53.31)(3.45)] / √{(368.1)(1/10 + (3.45)²/27.985)} = 362/13.9 = 26.0

with 8 d.f., confirming that the line does not go through the origin.
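In Python the test is one line of arithmetic. The sketch below (ours; the sums are those given under table 6.18.1) evaluates (6.18.1):

```python
from math import sqrt

# Test that the plow-draft line goes through the origin, eq. (6.18.1)
n, xbar, ybar = 10, 3.45, 546.0
Sxx, s2_yx, b = 27.985, 368.1, 53.31

intercept = ybar - b * xbar                    # estimated height at X = 0
se = sqrt(s2_yx * (1/n + xbar**2 / Sxx))
print(round(intercept, 1), round(intercept / se, 1))   # 362.1 26.0
```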
When the line is straight and passes through (0, 0), the variance of
the residual ε is sometimes not constant, but increases as X moves away
from zero. On plotting, the points lie close to the line when X is small but
diverge from it as X increases. The extension of the method of least
squares to this case gives the estimate b = Σw_X XY/Σw_X X², where w_X is the
reciprocal of the variance of ε at the value of X in question.
If numerous observations of Y have been made at each selected X,
the variance of ε can be estimated directly for each X and the form of the
function w_X determined empirically. If there are not enough data to use
this method, simple functions that seem reasonable are employed. A
common one when all X's are positive is to assume that the variance of ε
is proportional to X, so that w_X = k/X, where k is a constant. This gives
the simple estimate b = ΣY/ΣX = Ȳ/X̄. The weighted mean square of
the residuals from the fitted line is

s²_y·x = {Σ(Y²/X) - (ΣY)²/ΣX}/(n - 1),

and the estimated standard error of b is s_y·x/√ΣX.

TABLE 6.18.2
NUMBER OF ACRES IN CORN ON 25 FARMS IN SOUTH DAKOTA (1942),
SELECTED BY FARM SIZE

Size of Farm   Acres in              Standard      Ratio    Ratio
(acres) X      Corn Y     Range      Deviation s_Y  s_Y/X    Y/X

80             25                                           0.312
               10                                            .125
               20                                            .250
               32                                            .400
               20         22         8.05          0.101     .250

160            60                                           0.375
               35                                            .219
               20                                            .125
               45                                            .281
               40         40         14.58         0.091     .250

240            65                                           0.271
               80                                            .333
               65                                            .271
               85                                            .354
               30         55         21.51         0.090     .125

320            70                                           0.219
               110                                           .344
               30                                            .094
               55                                            .172
               60         80         29.15         0.091     .188

400            75                                           0.188
               35                                            .088
               140                                           .350
               90                                            .225
               110        105        39.21         0.098     .275

Mean           56.28                                        0.243

n = 25,  b = Σ(Y/X)/n = 0.243 corn acres per farm acre
Sometimes the standard deviation of ε is proportional to X, so that
w_X = k/X². This leads to the least squares estimate

b = Σ(XY/X²)/Σ(X²/X²) = Σ(Y/X)/n,

in other words, the mean of the individual ratios Y/X. This model is
illustrated by the data in table 6.18.2, taken from a farm survey in eastern
South Dakota in 1942, in which the size of the farm X and the number of
acres in corn Y were measured. Five of the commoner farm sizes, 80,
160, 240, 320, and 400 acres, were drawn. For each size, five farm records
were drawn at random.
The ranges of the several groups of Y indicate that σ is increasing
with X. The same thing is shown in figure 6.18.1. To get more detailed
information, s_Y was calculated for each group, then the ratio of s_Y to X.
These ratios are so nearly constant as to justify the assumption that in the
population σ/X is a constant. Also it seems reasonable to suppose that
(0, 0) is a point on the regression line.
The value of b, 0.243 corn acres per farm acre, is computed in table
6.18.2 as the mean of the ratios Y/X. The sample regression line is

Ŷ = 0.243X.

FIG. 6.18.1-Regression of corn acres on farm acres.


To find the estimated variance of b, first compute the sum of squares
of deviations of the 25 ratios R = Y/X from their mean, and divide by
n - 1 = 24. This gives s_R² = 0.008069. Then

s_b² = s_R²/n = 0.008069/25 = 0.0003228
s_b = 0.0180,  d.f. = n - 1 = 24.

The 95% interval estimate of β is set in the usual way,

b - t₀.₀₅ s_b ≤ β ≤ b + t₀.₀₅ s_b,

the result being 0.206 ≤ β ≤ 0.280.
In straight lines through the origin the point (X̄, Ȳ) does not in gen-
eral lie on the fitted line. In the figure, (240, 56.28) falls below the line.
An exception occurs when σ² is proportional to X, giving b = Ȳ/X̄, as we
have seen.
6.19-The estimation of ratios. With data in which it is believed that
Y is proportional to X, apart from sampling or experimental error, the
investigator is likely to regard his objective as that of estimating the com-
mon ratio Y/X rather than as a problem in regression. If his conjecture
is correct, that is, if Y = βX + ε, the three quantities ΣXY/ΣX², ΣY/ΣX,
and Σ(Y/X)/n are all unbiased estimates of the population ratio β. The
choice among the three is a question of precision. The most precise
estimate is the first, second, or third above according as the variance of ε
is constant, proportional to X, or proportional to X². If the variance of ε
is expected to increase moderately as X increases, though the exact rate
is not known, the estimate ΣY/ΣX usually does well, in addition to being
the simplest of the three.
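The three estimates can be compared directly on the corn data of table 6.18.2. The sketch below (ours, not from the text) computes all three:

```python
# Three unbiased estimates of beta in Y = beta*X + e (section 6.19),
# applied to the corn-acreage data of table 6.18.2
data = {80: [25, 10, 20, 32, 20], 160: [60, 35, 20, 45, 40],
        240: [65, 80, 65, 85, 30], 320: [70, 110, 30, 55, 60],
        400: [75, 35, 140, 90, 110]}
pairs = [(x, y) for x, ys in data.items() for y in ys]

b1 = sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)
b2 = sum(y for _, y in pairs) / sum(x for x, _ in pairs)
b3 = sum(y / x for x, y in pairs) / len(pairs)
# b1 is best if var(e) is constant, b2 if var(e) is proportional to X,
# b3 if var(e) is proportional to X^2 (the case argued for these data)
print(round(b1, 3), round(b2, 3), round(b3, 3))  # about 0.229 0.234 0.243
```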
Before one of these estimates is adopted, always check that Y is
proportional to X by plotting the data and, if necessary, testing the null
hypothesis that the line goes through the origin. Hasty adoption of some
form of ratio estimate may lose the information that Y/X is not constant
as X varies.
6.20-Summary. The six sample values, n, X̄, Ȳ, Σx², Σy², Σxy,
furnish all regression information about the population line μ = α + βx:
1. The regression coefficient of Y on X: b = Σxy/Σx². The estimate
of α: a = Ȳ
2. The sample regression equation of Y on X: Ŷ = Ȳ + bx
3. Y adjusted for X: Adjusted Y = Y - bx
4. The sum of squares attributable to regression:
(Σxy)²/Σx² = Σŷ²
5. The sum of squares of deviations from regression:
Σy² - (Σxy)²/Σx² = Σd²_y·x
6. The mean square deviation from regression:
Σd²_y·x/(n - 2) = s²_y·x
7. The sample standard error of Ŷ estimated from X (at X = X̄):
s_Ŷ = s_y·x/√n
8. The sample standard deviation of the regression coefficient:
s_b = s_y·x/√Σx²
9. The sample standard deviation of Ŷ as an estimate of μ = α + βx:
s_Ŷ = s_y·x √(1/n + x²/Σx²)
10. The sample standard deviation of Ŷ as an estimate of a new point
Y:
s_Ŷ = s_y·x √(1 + 1/n + x²/Σx²)
11. The estimated height of the line when X = 0: Ȳ - bX̄. This is
sometimes called the intercept or the elevation of the line.

REFERENCES
1. P. P. SWANSON, et al. J. Gerontology, 10:41 (1955).
2. J. B. WENTZ and R. T. STEWART. J. Amer. Soc. Agron., 16:534 (1924).
3. T. R. HANSBERRY and C. H. RICHARDSON. Iowa State Coll. J. Sci., 10:27 (1935).
4. G. W. SNEDECOR and W. R. BRENEMAN. Iowa State Coll. J. Sci., 19:33 (1945).
5. R. A. FISHER. Statistical Methods for Research Workers. Oliver and Boyd, Edinburgh (1925).
6. B. DAWES. J. Exp. Biology, 18:26 (1946).
7. F. GALTON. Natural Inheritance. Macmillan, London (1889).
8. K. PEARSON and A. LEE. Biometrika, 2:357 (1903).
9. E. V. COLLINS. Trans. Amer. Soc. Agric. Engineers, 14:164 (1920).
10. J. BERKSON. J. Amer. Statist. Ass., 45:164 (1950).
11. F. S. ACTON. Analysis of Straight-Line Data. Wiley, New York (1959).
CHAPTER SEVEN

Correlation

7.1-Introduction. The correlation coefficient is another measure of
the mutual relationship between two variables. Table 7.1.1 and figure
7.1.1 show the heights of 11 brothers and sisters, drawn from a large
family study by Pearson and Lee (1). Since there is no reason to think of
one height as the dependent variable and the other as the independent
variable, the heights are designated X₁ and X₂ instead of Y and X. To
find the sample correlation coefficient, denoted by r, compute Σx₁², Σx₂²,
and Σx₁x₂ as in the previous chapter. Then,

r = Σx₁x₂/√{(Σx₁²)(Σx₂²)} = 0.558,

as shown under table 7.1.1. Roughly speaking, r is a quantitative expres-
sion of the commonly observed similarity among children of the same
parents: the tendency of the taller sisters to have the taller brothers. In
the figure, the value r = 0.558 reflects the propensity of the dots to lie in
a band extending from lower left to upper right instead of being scattered
randomly over the whole field. The band is often shaped like an ellipse,
with the major axis sloping upward toward the right when r is positive.
EXAMPLE 7.1.1-Calculate r = 1 for the following pairs:
X₁: 1, 2, 3, 4, 5
X₂: 3, 5, 7, 9, 11

TABLE 7.1.1
STATURE (INCHES) OF BROTHER AND SISTER
(Illustration taken from Pearson and Lee's sample of 1,401 families)

Family Number   1   2   3   4   5   6   7   8   9   10  11
Brother, X₁    71  68  66  67  70  71  70  73  72  65  66
Sister, X₂     69  64  65  63  65  62  65  64  66  59  62

n = 11,  X̄₁ = 69,  X̄₂ = 64,  Σx₁² = 74,  Σx₂² = 66,  Σx₁x₂ = 39
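The computation of r is easily checked by machine. This Python sketch (ours, not part of the text) uses the statures of table 7.1.1:

```python
from math import sqrt

# Correlation of brother's and sister's stature, table 7.1.1
X1 = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]  # brother
X2 = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]  # sister
n = len(X1)
m1, m2 = sum(X1) / n, sum(X2) / n                  # 69 and 64

S11 = sum((a - m1) ** 2 for a in X1)               # 74
S22 = sum((b - m2) ** 2 for b in X2)               # 66
S12 = sum((a - m1) * (b - m2) for a, b in zip(X1, X2))  # 39
print(round(S12 / sqrt(S11 * S22), 3))             # 0.558
```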


i. • ;'_ D

6 I.
V/
./' ~

j; I
II
2

,/
V V
I
• I( X,-
66 68 10 72 74
BROTHER'S STATURE (inches J

FlO. 7 .1.l~Scatter (or dot) diagram of stature of II brother-sister pairs. r = 0.558.

Represent the data in a graph similar to figure 7.1.1.

EXAMPLE 7.1.2-Verify r = 0.91 in the pairs:
X₁: 2, 5, 6, 8, 10, 12, 14, 15, 18, 20
X₂: 1, 2, 2, 3, 2, 4, 3, 4, 4, 5
Plot the elliptical band of points.
EXAMPLE 7.1.3-In the following, show that r = 0.20:
X₁: 3, 5, 8, 11, 12, 12, 17
X₂: 11, 5, 6, 8, 7, 18, 9
Observe the scatter of the points in a diagram.
EXAMPLE 7.1.4-In the apple data of table 6.9.1, Σx² = 924, Σy² = 1,222, Σxy
= -936. Calculate r = -0.88.

7.2-The sample correlation coefficient r. The correlation coefficient
is a measure of the degree of closeness of the linear relationship between
two variables.
Two properties of r should be noted:
(i) r is a pure number without units or dimensions, because the scales
of its numerator and denominator are both the products of the scales in
which X₁ and X₂ are measured. One useful consequence is that r can be
computed from coded values of X₁ and X₂. No decoding is required.
(ii) r always lies between -1 and +1 (proved in the next section, 7.3).
Positive values of r indicate a tendency of X₁ and X₂ to increase together.
When r is negative, large values of X₁ are associated with small values
of X₂.
To help you acquire some experience of the nature of r, a number of
simple tables with the corresponding graphs are displayed in figure 7.2.1.
In each of these tables n = 9, X̄₁ = 12, X̄₂ = 6, Σx₁² = 576, Σx₂² = 144.
Only Σx₁x₂ changes, and with it the value of r. Since √{(Σx₁²)(Σx₂²)}
= √{(576)(144)} = 288, the correlation is easily evaluated in the several
tables by calculating Σx₁x₂ and dividing by 288 (or multiplying by
1/288 = 0.0034722 if a machine is used).
In A, the nine points lie on a straight line, the condition for r = 1.

FIG. 7.2.1-Scatter diagrams with correlations ranging from +1 to -0.889:
(A) r = 1, (B) r = 0.986, (C) r = 0.597, (D) r = 0, (E) r = -0.368, (F) r = -0.889.
Each panel pairs the X₁ values 0, 4, 6, 8, 12, 14, 16, 22, 26 with a rearrangement of
the X₂ values 0, 2, 3, 4, 6, 7, 8, 11, 13; in A the pairing is X₂ = 0.5X₁.


The line is a "degenerate" ellipse: it has length but no width. The two
variables keep in perfect step, any change in one being accompanied by
a proportionate change in the other. B depicts some deviation from an
exact relationship, the ellipse being long and thin, with r slightly reduced
below 1. In C, the ellipse widens, then reaches circularity in D, where
r = 0. This denotes no relation between the two variables. E and F
show negative correlations tending toward -1. To summarize, the thin-
ness of the ellipse of points exhibits the magnitude of r, while the inclina-
tion of the axis upward or downward shows its sign. Note that the slope
of the axis is determined by the scales of measurement adopted for the two
axes of the graph and is therefore not a reliable indicator of the magnitude
of r. It is the concentration of the points near the axis of the ellipse that
signifies high correlation.
The larger correlations, either positive or negative, are fairly obvious
from the graphs. It is not so easy to make a visual evaluation if the
absolute value of r is less than 0.5; even the direction of inclination of the
ellipse may elude you if r is between -0.3 and +0.3. In these small
samples a single dot can make a lot of difference. In D, for example, if
the point (26, 0) were changed to (26, 9), r would be increased from 0 to
0.505. This emphasizes the fact that sample correlations from a bivariate
population in which the correlation is ρ are quite variable if n is small. In
assessing the value of r in a table, select some extreme values of one
variable and note whether they are associated with extreme values of the
other. If no such tendency can be detected, r is likely small.
Perfect correlation (r = 1) rarely occurs in biological data, though
values as high as 0.99 are not unheard of. Each field of investigation has
its own range of coefficients. Inherited characteristics such as height ordi-
narily have correlations between 0.35 and 0.55. Among high school
grades r averages around 0.35 (3). Pearson and Lee got "organic correla-
tions," that is, correlations between two such measurements as stature and
span in the same person, ranging from 0.60 to 0.83. Brandt (2) calculated
the sample correlation, 0.986, between live weight and warm dressed
weight of 533 swine. Evvard et al. (6) estimated r = -0.68 between
average daily gain of swine and feed required per pound gained.

7.3-Relation between the sample coefficients of correlation and re-
gression. If X₂ is designated as the dependent variable, its regression co-
efficient on X₁, say b₂₁, is Σx₁x₂/Σx₁². But if X₁ is taken as dependent,
its regression coefficient on X₂ is b₁₂ = Σx₁x₂/Σx₂². The two regression
lines are shown in each diagram of figure 7.2.1. The two lines are the
same only if r = ±1, as illustrated in A, although they are close together
if r is near ±1. In the diagrams the regression of X₁ on X₂ is always the
line that makes the lesser angle with the vertical axis.
The fact that there are two different regressions is puzzling at first
sight, since in mathematics the equation by which we calculate X₂ when
given X₁ is the same as the equation by which X₁ is calculated when X₂
is given. In correlation and regression problems, however, we are dealing
with relationships that are not followed exactly. For any fixed X₁ there
is a whole population of values of X₂. The regression of X₂ on X₁ is the
line that relates the average of these values of X₂ to X₁. Similarly, for
each X₂ there is a population of values of X₁, and the regression of X₁ on
X₂ shows the locus of the averages of these populations as X₂ changes.
The two lines answer two different questions, and coincide only if the
populations shrink to their means, so that X₁ and X₂ have no individual
deviation from the linear relation.
A useful property of r is obtained from the shortcut method of
computing s²_y·x in a regression problem. Reverting to Y and X, it will
be recalled from the end of section 6.2 that

Σd²_y·x = (n - 2)s²_y·x = Σy² - (Σxy)²/Σx²

Substituting (Σxy)² = r²Σx²Σy², we have

Σd²_y·x = (n - 2)s²_y·x = (1 - r²)Σy²   (7.3.1)

Since Σd²_y·x cannot be negative, this equation shows that r must lie be-
tween -1 and +1. Moreover, if r is ±1, Σd²_y·x is zero and the sample
points lie exactly on a line.
The result (7.3.1) provides another way of appraising the closeness of
the relation between two variables. The original sample variance of Y,
when no regression is fitted, is s_y² = Σy²/(n - 1), while the variance of the
deviations of Y from the linear regression is (1 - r²)Σy²/(n - 2) as shown
above. Hence, the proportion of the variance of Y that is not associated
with its linear regression on X is estimated by

s²_y·x/s_y² = (n - 1)(1 - r²)/(n - 2) ≈ (1 - r²)

if n is at all large. Thus r² may be described as the proportion of the
variance of Y that can be attributed to its linear regression on X, while
(1 - r²) is the proportion free from X. The quantities r² and (1 - r²) are
shown in table 7.3.1 for a range of values of r.
TABLE 7.3.1
ESTIMATED PROPORTIONS OF THE VARIANCE OF Y ASSOCIATED AND
NOT ASSOCIATED WITH X IN A LINEAR REGRESSION

r        Associated, r²    Not Associated, (1 - r²)
±0.1         0.01                 0.99
±0.2         0.04                 0.96
±0.3         0.09                 0.91
±0.4         0.16                 0.84
±0.5         0.25                 0.75
±0.6         0.36                 0.64
±0.7         0.49                 0.51
±0.8         0.64                 0.36
±0.9         0.81                 0.19
±0.95        0.90                 0.10
When r is 0.5 or less, only a minor portion of the variation in Y can
be attributed to its linear regression on X. At r = 0.7, about half the
variance of Y is associated with X, and at r = 0.9, about 80%. In a sample
of size 200, an r of 0.2 would be significant at the 1% level, but would
indicate that 96% of the variation of Y was not explainable through its
relation with X. A verdict of statistical significance shows merely that
there is a linear relation with non-zero slope. Remember also that con-
vincing evidence of an association, even though close, does not prove
that X is the cause of the variation in Y. Evidence of causality must
come from other sources.
Another relation between the sample regression and correlation coeffi-
cients is the following. With Y as the dependent variable,

b = Σxy/Σx² = [Σxy/√{(Σx²)(Σy²)}]·√(Σy²)/√(Σx²) = r s_y/s_x

Or, equivalently, r = b s_x/s_y. Thus b is easily obtained from r, and vice
versa, if the sample standard deviations are known.
In some applications, a common practice is to use the sample stan-
dard deviations as the scale units for measuring the variates x = X - X̄
and y = Y - Ȳ. That is, the original variates X and Y are replaced by
x' = x/s_x and y' = y/s_y, said to be in standard units. The sample regres-
sion line

Ŷ - Ȳ = b(X - X̄)

then becomes

ŷ's_y = bx's_x,  or  ŷ' = (b s_x/s_y)x' = rx',

where ŷ' is the predicted value of Y in standard units. In standard measure,
r is the regression coefficient, and the distinction between correlation and
regression coefficients disappears.
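These interrelations are easily verified numerically. The sketch below (ours; the sums are those of the chick data quoted in example 6.12.1) checks b = r s_y/s_x and r = b s_x/s_y:

```python
from math import sqrt

# Verify b = r * s_y / s_x with the chick data sums (example 6.12.1)
n, Sxx, Syy, Sxy = 10, 1000.0, 6854.0, 2302.0

b = Sxy / Sxx                       # 2.302
r = Sxy / sqrt(Sxx * Syy)           # 0.879
s_x = sqrt(Sxx / (n - 1))
s_y = sqrt(Syy / (n - 1))
print(round(b, 3), round(r * s_y / s_x, 3))   # both 2.302
print(round(r, 3), round(b * s_x / s_y, 3))   # both 0.879
```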
7.4-The bivariate normal distribution. The population correlation
coefficient ρ and its sample estimate r are intimately connected with a
bivariate population known as the bivariate normal distribution. This
distribution is illustrated by table 7.4.1, which shows the joint frequency
distribution of height (X₁) and length of forearm (X₂) for 348 men. The
data are from the article by Galton (18) in 1888 in which the term "co-rela-
tion" was first proposed.
To be observed in the table are five features:
(i) Each row and each column in the body of the table is a frequency
distribution. Also, the column at the right, headed f, is the total fre-
quency distribution of X₂, length of forearm, and the third-to-the-last
row below is that of X₁, height.
(ii) The frequencies are concentrated in an elliptical area with the
major axis inclined upward to the right. There are no very short men
with long forearms nor any very tall men with short forearms.
(iii) The frequencies pile up along the major axis, reaching a peak
near the center of the distribution. They thin out around the edges,
vanishing entirely beyond the borders of the ellipse.
(iv) The center of the table is at X̄₁ = 67.5 inches, X̄₂ = 18.1 inches.
This point happens to fall in the cell containing the greatest frequency,
28 men.
(v) The bivariate frequency histogram can be presented graphically
by erecting a column over each cell in the table, the heights of the columns
being proportional to the cell frequencies. The tallest column would be
in the center, surrounded by shorter columns. The heights would de-
crease toward the perimeter of the ellipse, with no columns beyond the
edges. A ridge of tall columns would extend along the major axis.
The shape of the bivariate normal population becomes clear if you
imagine an indefinite increase in the total frequency with a corresponding
decrease in the areas of the table cells. A smooth surface would over-
spread the table, rising to its greatest height at the center (μ₁, μ₂), fading
away to tangency with the XY plane at great distances.
Some properties of this new model are as follows:
(i) Each section perpendicular to the X₁ axis is a normal distribution,
and likewise each section perpendicular to the X₂ axis. This means that
each column and each row in table 7.4.1 is a sample from a normal fre-
quency distribution.
(ii) The frequency distributions perpendicular to the X₁ axis all have
the same standard deviation, σ₂.₁, and they have means all lying on a
straight regression line, μ₂.₁ = α₂ + β₂₁X₁. The sample means and
standard deviations are recorded in the last two lines of the table. While
there is considerable variation in s₂.₁, each is an estimate of the common
parameter, σ₂.₁.
(iii) The frequency distributions perpendicular to the X₂ axis have a
common standard deviation, σ₁.₂ (note the estimates in the right-hand
column of the table), and their means lie on a second regression line,
μ₁.₂ = α₁ + β₁₂X₂.
(iv) Each border frequency distribution is normal. That on the right
is N(μ₂, σ₂), while the one below the body of the table is N(μ₁, σ₁).
(v) The equation of the bivariate frequency distribution has the
coefficient 1/{2πσ₁σ₂√(1 - ρ²)}, followed by e with this exponent:

-[(X₁ - μ₁)²/σ₁² - 2ρ(X₁ - μ₁)(X₂ - μ₂)/σ₁σ₂ + (X₂ - μ₂)²/σ₂²] / {2(1 - ρ²)}

This distribution has five parameters. Four of them are familiar:
μ₁, μ₂, σ₁, σ₂. The fifth is the correlation coefficient, ρ, of which r is an
estimator. The parameter ρ measures the closeness of the population
relation between X₁ and X₂; it determines the narrowness of the ellipse
containing the major portion of the observations.

EXAMPLE 7.4.1-Make a graph of X̄₂.₁ in the next-to-the-last line of table 7.4.1. The
values of X₁ are the class marks at the top of the columns. The first class mark may be taken
as 59.5 inches.
EXAMPLE 7.4.2-Graph the X̄₁.₂ on the same sheet with that of X̄₂.₁. The class
marks for X₂ are laid off on the vertical axis. The first class mark may be taken as 21.25
inches. If you are surprised that the two regression lines are different, remember that X̄₂.₁
is the mean of a column while X̄₁.₂ is the mean of a row.
EXAMPLE 7.4.3-Graph s₂.₁ against X₁. You will see that there is no trend, indicating
that all the s₂.₁ may be random samples from a common σ₂.₁.
EXAMPLE 7.4.4-The data in example 6.9.3 may be taken as a random sample from a
bivariate normal population. You had X̄ = 83 gms., Ȳ = 60 mg., Σx² = 1,000, Σy² = 6,854,
Σxy = 2,302. Calculate the regression of body weight, X, on comb weight, Y. Ans.
X̂ = 83 + 0.336(Y - 60) gms. Draw the graph of this line along with that of example
6.9.4. Notice that the angle whose tangent is 0.336 is measured from the Y axis.
EXAMPLE 7.4.5-In the chick experiment, estimate σ_y·x. Ans. s_y·x = 13.9 mg. Also
estimate σ_x·y. Ans. s_x·y = 5.3 gms. In s_x·y the deviations from regression are measured
horizontally.
EXAMPLE 7.4.6-From the chick data, estimate ρ. Ans. r = 0.88.
EXAMPLE 7.4.7-If y = a + bu and x = c + dv, where a, b, c, and d are constants,
prove that r_xy = r_uv.
EXAMPLE 7.4.8-Thirty students scored as follows in two mathematics achievement
tests:

I   73 41 83 71 39 60 51 41 85 88 44 71 52 74 50
II  29 24 34 27 24 26 35 18 33 39 27 35 25 29 13

I   43 85 53 85 44 66 60 33 43 76 51 57 35 40 76
II  13 40 23 40 22 25 21 26 19 29 25 19 17 17 35

Calculate r = 0.774.

From the formula for r we can derive a much used expression for ρ.
Write

r = Σ(X₁ - X̄₁)(X₂ - X̄₂) / √{Σ(X₁ - X̄₁)² Σ(X₂ - X̄₂)²}

Dividing numerator and denominator by (n - 1), we have

r = {Σ(X₁ - X̄₁)(X₂ - X̄₂)/(n - 1)} / s₁s₂   (7.4.1)

As n becomes large, X̄₁ and X̄₂ tend to coincide with μ₁ and μ₂, respec-
tively, s₁ and s₂ tend to equal σ₁ and σ₂, and division by (n - 1) becomes
equivalent to division by n. Hence, when applied to the whole population,
equation 7.4.1 becomes

ρ = {Average value of (X₁ - μ₁)(X₂ - μ₂)}/σ₁σ₂   (7.4.2)

The numerator of (7.4.2) is called the population covariance of X₁ and
X₂. This gives

ρ = Cov. (X₁X₂)/σ₁σ₂   (7.4.3)
7.5-Sampling variation of the correlation coefficient. Common ele-
ments. A convenient way to draw samples from a normal bivariate popu-
lation is by use of an old device called common elements (17). You may
go back to the random sampling scheme of section 3.3 (p. 69), or to
samples already drawn from table 3.2.1. In a new table, such as 7.5.1,
record some convenient number, say three, of the random pig gains. These
gains, or elements, are written twice in the table. Then continue the draw-
ing, adding for example one more randomly drawn gain to the left-hand
column, and two more to the right. The sums constitute the paired values
of X₁ and X₂. Three such pairs are computed in the table. It is clear that
there is a relation between the two sums in each pair. If the three common
elements all happen to be large, then both X₁ and X₂ are likely large ir-
respective of the extra elements contained in each. Naturally, owing to
the non-common elements, the relation is not perfect. If you continue
TABLE 7.5.1
CALCULATION OF THREE PAIRS OF VALUES OF THE VARIABLES X₁ AND X₂ HAVING
COMMON ELEMENTS
(The elements are pig gains from table 3.2.1. In each pair, three common elements
enter both of the sums X₁ and X₂; one further randomly drawn element completes
X₁, and two further elements complete X₂.)
182 Chapter 7: earr.latioJt
this process, drawing a hundted or more pairs, and then compute the cor-
relation, you will get a value of r not greatly different from the population
value,
p = 3/.J(4)(5) = 0.67
The numerator of this fraction is the number of common elements, while
the denominator is the geometric mean of the total numbers of elements
in the two sums, X, and X 2. Thus, if n12 represents the number of com-
mon elements, with nil and n22 designating the total numbers of elements
making up the. two sums, then the correlation between these two sums is,
theoretically,
p = nll/~nlln22
Of course, there will be sampling variation in the values calculated from
drawings. You may be lucky enough to get a good verification with only
10 or 20 pair.s of sums. With 50 pairs we have usually got a coefficient
within a few hundredths of the expected parameter, but once we got 0.28
when the population was
n12 /';n,n2 = 61../(9)(16) = 0.5
If you put the same number of elements into X, andX2 , thenn, = n2'
Denoting this common number of total elements by n,
p = n12ln,
the ratio of the number of common elements to the total number in each
sum. In this special case, the correlation coefficient is simply the fraction
of the elements which are common. Roughly, this is the interpretation of
the sister-brother correlation in stature (table 7.1.1), usually not far from
0.5: an average of some 50% of the genes determining height is common to
sister and brother.
Another illustration of this special case arises from the determination
of some physical or chemical constant by two alternative methods. Con-
sider the estimation of the potassium content of the expressed sap of corn
stems as measured by two methods, the colorimetric and the gravimetric.
Two samples are taken from the same source, one being treated by each
of the two techniques. The common element in the two results is the actual
potassium content. Extraneous elements are differences that may exist
between the potassium contents of the two samples that were drawn, and
the errors of measurement of the two procedures.
The concept of common elements has been presented because it may
help you to a better understanding of correlation. But it is not intended
as a method of interpreting the majority of the correlations that you will
come across in your work, since it applies only in the type of special cir-
cumstances that we have illustrated.
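The theoretical value ρ = n₁₂/√(n₁₁n₂₂) can also be checked by simulation. The sketch below is ours, not the authors' (the seed and sample size are arbitrary, and statistics.correlation requires Python 3.10 or later); it draws many pairs built from 3 common and 1 or 2 extra normal elements:

```python
import random
from statistics import correlation  # Python 3.10+

# Simulate sums with n12 = 3 common elements, n11 = 4, n22 = 5 in all
random.seed(1)
xs, ys = [], []
for _ in range(20000):
    common = sum(random.gauss(0, 1) for _ in range(3))
    xs.append(common + random.gauss(0, 1))                       # one extra
    ys.append(common + random.gauss(0, 1) + random.gauss(0, 1))  # two extras
print(round(correlation(xs, ys), 2))   # close to 3/sqrt(20) = 0.67
```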
When you have carried through some calculations of r with common
elements, you are well aware of the sampling variation of this statistic.
"'
P -0.6

-1.0 -0,8 -O.tP -0.4 -O."Z. 0 O:t 0.4 1<>


VALUES OF .,....

FIG. 7.S.I-Distribution of sample correlation coefficients in samples of 8 pairs drawn


from two normally distributed bivariate populations having the indicated values of p.

However, it would be too tedious to compute enough coefficients to gain
a picture of the distribution curve. This has been done mathematically
from theoretical considerations. In figure 7.5.1 are the curves for samples
of 8 drawn from populations with correlations zero and 0.8. Even the
former is not quite normal. The reason for the pronounced skewness of
the latter is not hard to see. Since the parameter is 0.8, sample values
can exceed this by no more than 0.2, but may be less than the parameter
value by as much as 1.8. Whenever there is a limit to the variation of a
statistic at one end of the scale, with practically none at the other, the dis-
tribution curve is likely to be asymmetrical. Of course, with increasing
sample size this skewness tends to disappear. Samples of 400 pairs, drawn
from a population with a correlation even as great as 0.8, have little
tendency to range more than 0.05 on either side of the parameter. Conse-
quently, the upper limit, unity, would not constitute a restriction, and the
distribution would be almost normal.
EXAMPLE 7.5.1-In a tea plantation (5), the production of 16 plots during one 14-week
period was correlated with the production of the same plots in the following period of equal
length. The correlation coefficient was 0.91. Can you interpret this in terms of common
elements?
EXAMPLE 7.5.2-To prove the result that with common elements, ρ = n₁₂/√(n₁₁n₂₂),
start from the result (7.4.3), which gives ρ = Cov. (X₁X₂)/σ₁σ₂. If X₁ is the sum of n₁₁ inde-
pendent drawings from a population with standard deviation σ, then σ₁ = σ√n₁₁. Similarly,
σ₂ = σ√n₂₂. To find Cov. (X₁X₂) write X₁ = c + u₁, X₂ = c + u₂, where c, the common
part, is the sum of the same set of n₁₂ drawings. Assuming that the drawings are from a
population with zero mean, X₁ and X₂ will have zero means. Thus, Cov. (X₁X₂) = average
value of (X₁X₂) = average value of (c + u₁)(c + u₂). But this is simply the average of c²,
or in other words the variance of c, since the terms cu₂, cu₁, and u₁u₂ all have averages zero
because c, u₁, and u₂ result from independent drawings. Finally, the variance of c is σ²n₁₂,
giving ρ = σ²n₁₂/(σ√n₁₁ · σ√n₂₂) = n₁₂/√(n₁₁n₂₂).
EXAMPLE 7.5.3-Suppose that u₁, u₂, u₃ are independent draws from the same
population, and that X₁ = 3u₁ + u₂, X₂ = 3u₁ + u₃. What is the correlation ρ between X₁
and X₂? Ans. 0.9. More generally, if X₁ = fu₁ + u₂, X₂ = fu₁ + u₃, then ρ = f²/(f² + 1).
This result provides another method of producing pairs of correlated variates.

7.6-Testing the null hypothesis ρ = 0. From the distribution of r
when ρ = 0, table A 11 gives the 5% and 1% significance levels of r. Note
that the table is entered by the degrees of freedom, in this case n - 2.
(This device was adopted because it enables the same table to be used in
more complex problems.) As an illustration, consider the value r = 0.597
which was obtained from a sample of size 9 in diagram C of figure 7.2.1.
For 7 d.f., the 5% value of r in table A 11 is 0.666. The observed r is not
statistically significant, and the null hypothesis is not rejected. This ex-
ample throws light on the difficulty of graphical evaluation of correlations,
especially when the number of degrees of freedom is small: they may be
no more than accidents of sampling. Since the distribution of r is sym-
metrical when ρ = 0, the sign of r is ignored when making the test.
Among the following correlations, observe how conclusions are
affected by both sample size and the size of r:
Number of    Degrees of           Conclusion About
Pairs        Freedom       r      Hypothesis, ρ = 0

20           18            0.60   Reject at 1% level
100          98            0.21   Reject at 5% level
10           8             0.60   Not rejected
15           13           -0.50   Not rejected
500          498          -0.15   Reject at 1% level

You now know two methods for testing whether there is a linear rela-
tion between the variables Y and X. The first is to test the regression
coefficient b_y·x by calculating t = b_y·x/s_b and reading the t-table with
(n - 2) d.f. The second is the test of r. Fisher (8) showed that the two
tests are identical. In fact, the table for r can be computed from the
t-table by means of the relation

t = b_y·x/s_b = r√(n - 2)/√(1 - r²),  d.f. = n - 2   (7.6.1)

(See example 7.6.1.) To illustrate, we found that the 5% level of r for
7 d.f. was 0.666. Let us compute

t = (0.666)√7/√{1 - (0.666)²} = 2.365

Reference to the t-table (p. 549) shows that this is the 5% level of t for 7 d.f.
In practice, use whichever test you prefer.
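Relation (7.6.1) is easy to check numerically; the one-line Python sketch below (ours) recovers the t-value corresponding to the 5% point of r for 7 d.f.:

```python
from math import sqrt

# Relation (7.6.1): t = r * sqrt(n - 2) / sqrt(1 - r^2)
r, df = 0.666, 7                 # 5% level of r for 7 d.f.
t = r * sqrt(df) / sqrt(1 - r * r)
print(round(t, 3))  # 2.362, essentially the 5% level of t for 7 d.f. (2.365)
```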
This relation raises a subtle point. The t-test of b requires only that
Y be normally distributed: the values of X may be normal or they may be
selected by the investigator. On the other hand, we have stressed that r
and ρ are intimately connected with random samples from the bivariate
normal distribution. Fisher proved, however, that in the particular case
ρ = 0, the distribution of r is the same whether X is normal or not, pro-
vided that Y is normal.
EXAMPLE 7.6.1-To prove relation (7.6.1), which connects the t-test of b with the
test of r, you need three relations: (i) b_y·x = r s_y/s_x, (ii) s_b = s_y·x/√Σx², (iii) s²_y·x = (1 - r²)Σy²
/(n - 2), as shown in equation (7.3.1), p. 176. Start with t = b/s_b and make these substitu-
tions to establish the result.

7.7~oofidetlCe limits and tests of hypotheses about p. The methods


given in this section, which apply when p is not zero, require the assump-
tion that the (X, Y) or (X" Xl) pairs are a random sample from a bivariate
normal distribution.
Table A II or the t-table can be used only for testing the null hy-
pothesis p = O. They are unsuited for testing other null hypotheses, such
as p = 0.5 for example, or p, = Pl' or for makin!', confidence statements
aboutp. Whenp #' othe shape of the distribution ofrchanges, becoming
skew, as was seen in figure 7.5.1.
A solution of these problems was provided by Fisher (9) who de-
vised a transformation from r to a quantity z, distributed almost normally
with standard error
I
Uz = ,
.,f(n - 3)
"practically independent of the value of the correlation in the population
from which the sample is drawn." The relation of z to r is given by
z = Hlo!!.(1 + r) -log,(1 - r)]
Table A 12 (r to z) and A 13 (z to r) enable us to change from one to the
other with sufficient accuracy. Following are some examples of the use
of z.
1. It is required to set confidence limits to the value of p in the population from which a sample r has been drawn. As an example, consider r = -0.889, based on 9 pairs of observations, figure 7.2.1F. From table A 12, z = 1.417 corresponds to r = 0.889. Since n = 9, σ_z = 1/√6 = 0.408. Since z is distributed almost normally, independent of sample size, z₀.₀₁ = 2.576. For P = 0.99, we have as confidence limits for z,

    1.417 - (2.576)(0.408) ≤ z ≤ 1.417 + (2.576)(0.408),
    0.366 ≤ z ≤ 2.468

Using table A 13 to find the corresponding r, and restoring the sign, the 0.99 confidence limits for p are given by

    -0.986 ≤ p ≤ -0.350
Emphasis falls on two facts: (i) in small samples the estimate, r, is not very reliable; and (ii) the limits are not equally spaced on either side of r, a consequence of its skewed distribution.
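A minimal computational sketch of this confidence-limit calculation (our illustration, not the book's; it replaces tables A 12 and A 13 by the exact relations z = ½ log_e{(1 + r)/(1 - r)} and r = tanh z):

    import math

    def limits_for_p(r, n, z_crit=2.576):      # z_crit = 2.576 gives 99% limits
        z = 0.5 * math.log((1 + r) / (1 - r))  # r to z (table A 12)
        se = 1 / math.sqrt(n - 3)              # standard error of z
        lo, hi = z - z_crit * se, z + z_crit * se
        return math.tanh(lo), math.tanh(hi)    # z back to r (table A 13)

    # r = -0.889, n = 9: transform |r|, then restore the sign
    lo, hi = limits_for_p(0.889, 9)
    print(-hi, -lo)   # about -0.986 and -0.350, as found above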
2. Occasionally, there is reason to test the hypothesis that p has some
particular value, other than zero, in the sampled population (p = 0, you re-
call, is tested by use of table A II). An example was given in section 7.5,
where r = 0.28 was observed in a sample of 50 pairs from p = 0.5. What
is the probability of a larger deviation? For r = 0.28, z = 0.288, and for
p = 0.5, z = 0.549. The difference, 0.549 - 0.288 = 0.261, has a standard error 1/√(n - 3) = 1/√47 = 0.1459. Hence, the normal deviate is 0.261/0.1459 = 1.80, which does not reach the 5% level: the sample is not as unusual as a 1-in-20 chance.
3. To test the hypothesis that two sample values of r are drawn at
random from the same population, convert each to z, then test the signifi-
cance of the difference between the two z's. For two lots of pigs the cor-
relations between gain in weight amount of feed eaten are recorded in
table 7.7.1. The difference between thez-values, 0.700, has the mean square
I I I I
- - + - - ' " -+ -=0.611
", - 3 ", - 3 2 9
The test is completed in the usual manner, calculating the ratio of the dif-
ference of the z's to the standard error of this difference. With P = 0.37
there is no reason to reject the hypothesis that the z's are from the same
population, and hence that the r's are from a common population cor-
relation.
4. To test the hypothesis that several r's are from the same p, and to
combine them into an
estimate of p. Several sample correlations may
possibly be drawn from a common p. If this null hypothesis is not re-
jected, we may wish to combine the r's into an estimate of p more reliable
than that afforded by any of the separate r's. Lush (14) was interested in
an average of the correlations between initial weight and gain in 6 lots
of steers. The computations are shown in table 7.7.2. Each z is weighted
(multiplied) by the reciprocal of its mean square, so that small samples

TABLE 7.7.1
TEST OF SIGNIFICANCE OF THE DIFFERENCE BETWEEN TWO CORRELATIONS OF GAIN
WITH FEED EATEN AMONG SWINE

    Lot    Pigs in Lot      r        z      1/(n - 3)

     1          5         0.870    1.333      0.500
     2         12         0.560    0.633      0.111

                    Difference = 0.700     Sum = 0.611

    σ_diff = √0.611 = 0.782,   0.700/0.782 = 0.895,   P = 0.37


TABLE 7.7.2
TEST OF HYPOTHESIS OF COMMON p AND ESTIMATION OF p. CORRELATION BETWEEN
INITIAL WEIGHT AND GAIN OF STEERS

                         No.                           Weighted z   Weighted Square   Corrected
    Samples              = n    n - 3     r       z    = (n - 3)z   = (n - 3)z²           z

    1927 Herefords         4      1     0.929   1.651     1.651          2.726          1.589
    1927 Brahmans         13     10     0.570   0.648     6.480          4.199          0.633
    1927 Backcrosses       9      6     0.455   0.491     2.946          1.446          0.468
    1928 Herefords         6      3    -0.092  -0.092    -0.276          0.025         -0.055
    1928 Brahmans         11      8     0.123   0.124     0.992          0.123          0.106
    1928 Backcrosses      14     11     0.323   0.335     3.685          1.234          0.321

    Total                 57     39              15.478                  9.753         14.941*

    Average z_w = 0.397       (Σwz)²/Σw = 6.145       corrected z̄ = 0.383
    Average r = 0.377         χ² = 3.608              corrected r = 0.365

    * Total of (n - 3) × (corrected z).

have little weight. The sum of the weighted z's, 15.478, is divided by the
sum of the weights, 39, to get the average z_w = 0.397.
The next column contains the calculations that lead to the test of the
hypothesis that the six sample correlations are drawn from a common
population correlation. The test is based on a general result: if the k normal variates z_i are all estimates of the same mean μ but have different variances σ_i², then

    Σw_i(z_i - z̄_w)² = Σw_i z_i² - (Σw_i z_i)²/Σw_i

is distributed as χ² with (k - 1) df, where w_i = 1/σ_i². In this application, w_i = n_i - 3 and

    χ² = Σ(n - 3)z² - [Σ(n - 3)z]²/Σ(n - 3)
       = 9.753 - (15.478)²/39 = 3.610,
with 5 degrees of freedom. From table A 5, p. 550, P = 0.61, so that H₀ is not rejected.
Since the six sample correlations may all have been drawn from the same population, we compute an estimate of the common p. This is got by reading from table A 13 the correlation 0.377 corresponding to the average z_w = 0.397. Don't fail to note the great variation in these small sample correlations. The S.D. of z_w is 1/√39.
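The computations of table 7.7.2 are easily verified by machine; a minimal sketch (our illustration, not from the book):

    import math

    ns = [4, 13, 9, 6, 11, 14]
    rs = [0.929, 0.570, 0.455, -0.092, 0.123, 0.323]
    zs = [0.5 * math.log((1 + r) / (1 - r)) for r in rs]
    ws = [n - 3 for n in ns]                       # weights = 1/variance

    swz = sum(w * z for w, z in zip(ws, zs))       # about 15.48
    swz2 = sum(w * z * z for w, z in zip(ws, zs))  # about 9.75
    chi2 = swz2 - swz**2 / sum(ws)                 # about 3.61, with 5 df
    z_w = swz / sum(ws)                            # about 0.397
    print(chi2, z_w, math.tanh(z_w))               # tanh(z_w) is about 0.377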
Fisher pointed out that there is a small bias in z, each being too large by

    p/{2(n - 1)}

The bias may usually be neglected. It might be serious if large numbers of correlations were averaged, because the bias accumulates, one bit being added with every z. If there is need to increase accuracy in the calculation of table 7.7.2, the average r = 0.377 may be substituted for p; then the approximate bias for each z may be deducted, and the calculation of
the average z repeated. Since this will decrease the estimated r, it is well
to guess p slightly less than the average r. For instance, it may be guessed
that p = 0.37, then the correction in the first z is 0.37/{2(4 - 1)} = 0.062,
and corrected z is 1.651 - 0.062 = 1.589. The other corrected z's are in
the last column of the table. The sum of the products,

    Σ(n - 3)(corrected z) = 14.941,

is divided by 39 to get the corrected mean value of z, 0.383. The corresponding correlation is 0.365.
For tables of the distribution of r when p ≠ 0, see reference (4).
EXAMPLE 7.7.1-To get an idea of how the selection of pairs affects correlation, try picking the five lowest values of test II (example 7.4.8) together with the six highest. The correlation between these 11 scores and the corresponding scores on test I turns out to be 0.89, as against r = 0.77 for the original sample.
EXAMPLE 7.7.2-Set 95% confidence limits to the correlation, 0.986, n = 533, between live and dressed weights of swine. Ans. 0.983 - 0.988.
What would have been the confidence limits if the number of swine had been 25? Ans. 0.968 - 0.994.
EXAMPLE 7.7.3-In four studies of the correlation between wing and tongue length in bees, Grout (10) found values of r = 0.731, 0.354, 0.690, and 0.740, each based on a sample of 44. Test the hypothesis that these are samples from a common p. Ans. χ² = 9.164, df = 3, P = 0.03. In only about three trials per 100 would you expect such disagreement among four correlations drawn from a common population. One would like to know more about the discordant correlation, 0.354, before drawing conclusions.
EXAMPLE 7.7.4-Estimate p in the population from which the three bee correlations, 0.731, 0.690, and 0.740, were drawn. Ans. 0.721.
EXAMPLE 7.7.5-Set 99% confidence limits on the foregoing bee correlation. Note: r = 0.721 is based on Σ(n - 3) = 3 × 41 = 123. The value of z is therefore equivalent to a single z from a sample of 123 + 3 = 126 bees. The confidence limits are: 0.590 - 0.815.

7.8-Practical utility of correlation and regression. Over the last forty years, investigators have tended to increase their use of regression techniques and decrease their use of correlation techniques. Several reasons can be suggested. The correlation coefficient r merely estimates the degree of closeness of linear relationship between Y and X, and the meaning of this concept is not easy to grasp. To ask whether the relation between Y and X is close or loose may be sufficient in an early stage of research. But more often the interesting questions are: How much does Y change for a given change in X? What is the shape of the curve connecting Y and X? How accurately can Y be predicted from X? These questions are handled by regression techniques.
Secondly, the standard results for the distribution of r as an estimate of a non-zero p require random sampling from a bivariate normal population. Selection of the values of X at which Y is measured, often done in-
tentionally or because of operational restrictions, can distort the frequency
distribution of r to a marked degree.
The correlation between two variables may be due to their common
relation to other variables. The organic correlations already mentioned
are examples. A big animal tends to be big all over, so that two parts are
correlated because of their participation in the general size. Over a period
of years, many apparently unrelated variables rise or fall together within
the same country or even in different countries. There is a correlation of
-0.98 between the annual birthrate in Great Britain, from 1875 to 1920,
and the annual production of pig iron in the United States. The matter
was discussed by Yule (19) as a question: Why do we sometimes get
nonsense-correlations between time series? Social, economic, and tech-
nological changes produce the time trends that lead to such examples.
In some problems the correlation coefficient enters naturally and use-
fully. Correlation has played an important part in biometrical genetics,
because many of the consequences of Mendelian inheritance, and later
developments from it, are expressed conveniently in terms of the correla-
tion between related persons or animals.
A second example occurs when we are trying to select persons with
high values of some skill Y by means of examination results X. If Y and X follow the bivariate normal distribution, the average Y value, say Ȳ, of candidates whose exam score is X is given by the equation

    (Ȳ - μ_Y)/σ_Y = p(X - μ_X)/σ_X

Suppose we select the top P% in the exam. For the normal curve, the average value of (X - μ_X)/σ_X for the selected men may be shown to be H/P when there are many candidates, where H is the ordinate of the normal curve at the point that separates the top P% from the remaining (1 - P)%. When P = 5%, the ordinate H = 0.1032, and H/P = 2.06. Thus the average Y value of the top 5% is 2.06p in standard units. If p = 0.5 this average is 1.03. From the normal tables we find that when H/P = 1.03, the corresponding P is 36%. This means that with p = 0.5, the 5% most successful performers in the exam have only the same average ability as the top 36% of the original candidates. The size of p is the key factor in determining how well we can select high values of Y by a screening process based on X.
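A minimal sketch of this screening calculation (our illustration, not the book's; NormalDist is in the Python standard library from version 3.8 on):

    from statistics import NormalDist

    nd = NormalDist()
    P = 0.05
    cutoff = nd.inv_cdf(1 - P)   # deviate separating the top 5%
    H = nd.pdf(cutoff)           # ordinate of the normal curve at the cutoff
    print(H / P)                 # about 2.06: mean standard score of selectees
    print(0.5 * H / P)           # about 1.03: their mean Y when p = 0.5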
In hydrology, suppose that there are annual records Y of the flow of a stream for a relatively short period of m years, and records X of a neighboring stream for a longer period of n years. Instead of using Ȳ_m as the estimate of the long-term mean μ_Y of Y, we might work out the regression of Y on X and predict μ_Y by the formula

    μ̂_Y = Ȳ_m + b(X̄_n - X̄_m)

The proportional reduction in variance due to this technique, known as stream extension, is approximately

    {V(Ȳ_m) - V(μ̂_Y)}/V(Ȳ_m) ≈ {(n - m)/n}[p² - (1 - p²)/(m - 3)]

Here again it is the value of p, along with the lengths of run available in the two streams, that determines whether this technique gives worthwhile gains in precision.

7.9-Variances of sums and differences of correlated variables. When X₁ and X₂ are independent, a result used previously is that the variance of their sum is the sum of their variances. When they are correlated, the more general result is

    σ²(X₁ + X₂) = σ₁² + σ₂² + 2pσ₁σ₂     (7.9.1)

Positive correlation increases the variance of a sum, negative correlation decreases it. The corresponding sample result is

    s²(X₁ + X₂) = s₁² + s₂² + 2r s₁s₂     (7.9.2)

This identity is occasionally used as a check on the computation of s₁, s₂, and r from a sample. For each member of the sample, X₁ + X₂ is written down and the sample variance of this quantity is obtained in the usual way.
For the difference D = X₁ - X₂, the variance is

    σ_D² = σ₁² + σ₂² - 2pσ₁σ₂     (7.9.3)

With differences, positive correlations decrease the variance. In paired experiments, the goal in pairing is to produce a positive correlation p between the members X₁, X₂ of a pair. The pairing does not affect the term (σ₁² + σ₂²) in (7.9.3), but brings in a negative term, -2pσ₁σ₂.
If we have k variates, with p_ij the correlation between the ith and jth variates, their sum S = X₁ + X₂ + ... + X_k has variance

    σ_S² = σ₁² + σ₂² + ... + σ_k² + 2p₁₂σ₁σ₂ + 2p₁₃σ₁σ₃ + ... + 2p_(k-1,k)σ_(k-1)σ_k     (7.9.4)

where the cross-product terms 2p_ijσ_iσ_j extend over every pair of variates.
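These identities are easily checked numerically; a minimal sketch with made-up data (our illustration, not the book's; statistics.correlation requires Python 3.10 or later):

    import statistics as st

    x1 = [71, 68, 66, 67, 70, 71, 70, 73, 72, 65, 66]   # made-up values
    x2 = [69, 64, 65, 63, 65, 62, 65, 64, 66, 59, 62]

    s1, s2 = st.stdev(x1), st.stdev(x2)
    r = st.correlation(x1, x2)
    direct = st.variance([a + b for a, b in zip(x1, x2)])
    by_identity = s1**2 + s2**2 + 2 * r * s1 * s2
    print(direct, by_identity)          # the two agree, verifying (7.9.2)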
EXAMPLE 7.9.1-To prove formula (7.9.1), note that by the definition of a variance, the variance of X₁ + X₂ is the average value of (X₁ + X₂ - μ₁ - μ₂)², taken over the population. Write this as

    E{(X₁ - μ₁) + (X₂ - μ₂)}² = E(X₁ - μ₁)² + E(X₂ - μ₂)² + 2E(X₁ - μ₁)(X₂ - μ₂)

where the symbol E (expected value) stands for "the average value of." This gives the result, since by equation (7.4.2) (p. 180), E(X₁ - μ₁)(X₂ - μ₂) = pσ₁σ₂. Formulas (7.9.3) and (7.9.4) are proved in the same way.
EXAMPLE 7.9.2-In a sample of 300 ears of corn (7), the weight of the grain, G, had a standard deviation s_G = 24.62 gms.; the weight of the cob, C, had a standard deviation s_C = 4.19 gms.; and r_GC was 0.6906. Show that the total ear weight W = G + C had s_W = 27.7 gms. and that r_GW = 0.994.
EXAMPLE 7.9.3-In table 7.1.1, subtract each sister's height from her brother's, then compute the corrected sum of squares of the differences. Verify by formula (7.9.3) that your result agrees with the values Σx₁² = 74, Σx₂² = 66, Σx₁x₂ = 39, given under table 7.1.1.
EXAMPLE 7.9.4-If r₁₂ = 1, show that s_D = s₁ - s₂, where s₁ ≥ s₂.

7.10-The calculation of r in a large sample. When the sample is large, the variates X and Y are often grouped into classes, as illustrated in table 7.10.1 for a sample of 327 ears of corn (20). The diameters X are in millimeter classes and the weights Y in 10-gram classes. The figures in the body of the table are the frequencies f_xy in each X and Y class. Looking at the class with diameter 48 and weight 300, we see that there were f_xy = 3 ears in this class, i.e., with diameters between 47.5 and 48.5 mm., and weights between 295 and 305 gms. Correlation in these data is evidenced by the tendency of high frequencies to lie along the diagonal of the table, leaving two corners blank: there are no very heavy ears with small diameters.
The steps in the calculation are as follows:
1. Add the frequencies in each row, giving the column of values f_x, and in each column, giving the row of values f_y.
2. Construct a convenient coding of the weights and diameters, writing down the coded Y and X values.
3. Write down a column of the values Yf_y and a row of the values Xf_x.
4. The quantities ΣXf_x, ΣYf_y, ΣX²f_x, and ΣY²f_y are now found on the calculating machine in the usual way, and are entered in table 7.10.2.
5. The device for finding Σxy is new. In each row, multiply the f_xy by the corresponding coded X, and add along the row. As examples:
(i) In the 4th row: (1)(2) + (1)(4) = 6
(ii) In the 7th row: (1)(-2) + (3)(-1) + (7)(1) + (3)(3) + (3)(4) = 23
These are entered in the right-hand column, ΣXf_xy. Then form the sum of products of this column with the coded Y column, giving ΣXYf_xy = 2,318. The correction term is subtracted as shown in table 7.10.2 to give Σxy = 2,323.20.
6. The value of r is now computed (table 7.10.2). No decoding is necessary for r.
As partial checks, the f_x and f_y values both add to the sample size, while the column ΣXf_xy in step 5 adds to the value ΣXf_x found in step 4.
A large sample provides a good opportunity for checking the assumptions required for the distribution of r. If each number ΣXf_xy in the right-hand column is divided by the corresponding f_y, we obtain the mean of X in each array (weight class). These may be plotted against Y to see whether the regression of X on Y appears linear. Similarly, by
TABLE 7.10.1
FREQUENCIES f_xy OF 327 EARS OF CORN, CLASSIFIED BY DIAMETER X (MILLIMETER
CLASSES) AND WEIGHT Y (10-GRAM CLASSES)
(The body of this two-way frequency table is garbled in this copy and is omitted.)
TABLE 7.10.2
CALCULATION OF CORRELATION COEFFICIENT IN TABLE 7.10.1

    ΣXf_x = 37                  ΣYf_y = -46
    ΣX²f_x = 2,279              ΣY²f_y = 7,264              ΣXYf_xy = 2,318
    (ΣXf_x)²/n = 4.19           (ΣYf_y)²/n = 6.47           (ΣXf_x)(ΣYf_y)/n = -5.20
    Σx² = 2,274.81              Σy² = 7,257.53              Σxy = 2,323.20

    r = Σxy/√{(Σx²)(Σy²)} = 2,323.20/√{(2,274.81)(7,257.53)} = 0.5718

extra calculation the values ΣYf_xy and the Y means may be obtained for each column and plotted against X. A test for linearity of regression is given in section 15.4. The model also assumes that the variances of Y in each column, s_y·x², are estimates of the same quantity, and similarly for the variances s_x·y² of X within each row. Section 10.21 supplies a test of homogeneity of variances.
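A minimal sketch of the whole grouped-data calculation for a small made-up two-way table (our illustration, not from the book; cells[i][j] is the frequency f_xy for the coded values ys[i] and xs[j]):

    xs = [-1, 0, 1, 2]                  # coded X values
    ys = [-1, 0, 1]                     # coded Y values
    cells = [[3, 2, 1, 0],
             [1, 4, 3, 1],
             [0, 2, 3, 2]]              # made-up frequencies f_xy

    n = sx = sy = sxx = syy = sxy = 0
    for y, row in zip(ys, cells):
        for x, f in zip(xs, row):
            n += f
            sx += f * x; sy += f * y
            sxx += f * x * x; syy += f * y * y; sxy += f * x * y

    # corrected sums of squares and products; the coding does not affect r
    cxx = sxx - sx * sx / n
    cyy = syy - sy * sy / n
    cxy = sxy - sx * sy / n
    print(cxy / (cxx * cyy) ** 0.5)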
EXAMPLE 7.10.1-Using the data in columns f_y and Y, table 7.10.1, calculate Σy² = 7,257.53, together with the sample mean and standard error, 198.6 ± 2.61.
EXAMPLE 7.10.2-Calculate the sample mean, 44.1, and standard deviation, 2.64, in the 42-millimeter array of weights, table 7.10.1.
EXAMPLE 7.10.3-In the 200-gram array of diameters, compute X̄ = 198.6 and s = 47.18.
EXAMPLE 7.10.4-Compute the sample regression coefficient of weight on diameter, 1.0213, together with the regression equation, Ŷ = 1.0213X + 154.8.
EXAMPLE 7.10.5-Calculate the mean diameter in each of the 28 weight arrays. Plot these means against the weight class marks. Does there seem to be any pronounced curvilinearity in the regression of these mean diameters on the weights? Can you write the regression equation giving estimated diameter for each weight?
EXAMPLE 7.10.6-Calculate the sample mean weight of the ears in each of the 16 diameter arrays of table 7.10.1. Present these means graphically as ordinates with the corresponding diameters as abscissas. Plot the graph of the regression equation on the same figure. Do you get a good fit? Is there any evidence of curvilinearity in the regression of means?

7.11-Non-parametric methods. Rank correlation. Often, a bivariate population is far from normal. In that event, the computation of r as an estimate of p is no longer valid. In some cases a transformation of the variables X₁ and X₂ brings their joint distribution close to the bivariate normal, making it possible to estimate p in the new scale. Failing this, methods of expressing the amount of correlation in non-normal data by means of a parameter like p have not proceeded very far.
Nevertheless, we may still want to examine whether two variables are independent, or whether they vary in the same or in opposite directions. For a test of the null hypothesis that there is no correlation, r may be used provided that one of the variables is normal. When neither variable seems
TABLE 7.11.1
RANKING OF SEVEN RATS BY TWO OBSERVERS OF THEIR CONDITION AFTER THREE WEEKS
ON A DEFICIENT DIET

                   Ranking by
    Rat                                     Difference,
    Number    Observer 1    Observer 2          d          d²

      1           4             4               0          0
      2           1             2              -1          1
      3           6             5               1          1
      4           5             6              -1          1
      5           3             1               2          4
      6           2             3              -1          1
      7           7             7               0          0

                                           Σd = 0      Σd² = 8

    r_S = 1 - 6Σd²/{n(n² - 1)} = 1 - (6 × 8)/{7(49 - 1)} = 0.857

normal, the best-known procedure is that in which X₁ and X₂ are both rankings. If two judges each rank 12 abstract paintings in order of at-
tractiveness, we may wish to know whether there is any degree of agree-
ment among the rankings. Table 7.11.1 shows similar rankings of the condition of 7 rats after a period of deficient feeding. With data that are not initially ranked, the first step is to rank X₁ and X₂ separately.
The rank correlation coefficient, due to Spearman (11) and usually denoted by r_S, is the ordinary correlation coefficient r between the ranked values X₁ and X₂. It can be calculated in the usual way as Σ(x₁x₂)/√{(Σx₁²)(Σx₂²)}. An easier method of computing r_S is given by the formula

    r_S = 1 - 6Σd²/{n(n² - 1)},

whose calculation is explained in table 7.11.1. Like r, the rank correlation


can range in samples from -1 (complete discordance) to +1 (complete
concordance).
For samples of 10 or fewer pairs, the significance levels of r_S, worked out by Kendall (12), (13), are given in table 7.11.2. In the rankings of the rats, r_S = 0.857 with 7 pairs. The correlation is significant at the 5% level but not at the 1%. For samples of more than 10 pairs, the null distribution of r_S is similar to that of r, and table A 11 is used for testing r_S. Remember that the degrees of freedom in table A 11 are two less than the number of pairs (size of sample).
Another measure of degree of concordance, closely related to r_S, is Kendall's τ (12). To compute this, rearrange the two rankings so that
TABLE 7.11.2
SIGNIFICANCE LEVELS OF r_S IN SMALL SAMPLES

    Size of Sample    5% Level    1% Level

    4 or less           none        none
    5                  1.000        none
    6                  0.886       1.000
    7                  0.750       0.893
    8                  0.714       0.857
    9                  0.683       0.833
    10                 0.648       0.794
    11 or more        Use table A 11 (p. 557)

one of them is in the order 1, 2, 3, ... n. For table 7.11.1, putting observer 1 in this order, we have:

    Rat No.        2   6   5   1   4   3   7

    Observer 1     1   2   3   4   5   6   7
    Observer 2     2   3   1   4   6   5   7

Taking each rank given by observer 2 in turn, count how many of the ranks to the right of it are smaller than it, and add these counts. For the rank 2 given to rat No. 2 the count is 1, since only rat 5 has a smaller rank. The six counts are 1, 1, 0, 0, 1, 0, there being no need to count the extreme right rank. The total is Q = 3. Kendall's τ is

    τ = 1 - 4Q/{n(n - 1)} = 1 - 12/42 = 5/7 = 0.714
Like r" t lies between + 1 (complete concordance) and -I (complete
disagreement). It takes a little longer to compute, but its frequency dis-
tribution on the nul) hypotheses is simpler and it can be extended to study
partial correlation. For details, See (12).
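Both coefficients are short computations; a minimal sketch for the rat data (our illustration, not from the book):

    obs1 = [4, 1, 6, 5, 3, 2, 7]
    obs2 = [4, 2, 5, 6, 1, 3, 7]
    n = len(obs1)

    d2 = sum((a - b) ** 2 for a, b in zip(obs1, obs2))
    r_s = 1 - 6 * d2 / (n * (n * n - 1))          # 0.857

    # tau: sort one ranking into natural order, then count Q, the number
    # of pairs in which a smaller rank lies to the right
    y = [b for _, b in sorted(zip(obs1, obs2))]
    Q = sum(1 for i in range(n) for j in range(i + 1, n) if y[j] < y[i])
    tau = 1 - 4 * Q / (n * (n - 1))               # 0.714
    print(r_s, tau)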
The quantities r_S and τ can be used as a measure of ability to appraise or detect something by ranking. For instance, a group of subjects might each be given bottles containing four different strengths of a delicate perfume and asked to place the bottles in order of the concentration of perfume. If X₁ represents the correct ranking of the strengths and X₂ a subject's ranking, the value of r_S or τ for this subject measures, although rather crudely, his success at this task. From the results for a sample of men and women, we could investigate whether women are better at this task than men. The difference between τ or r_S for women and men could be compared, approximately, by an ordinary t-test.
7.12-The comparison of two correlated variances. In section 4.15 (p. 116) we showed how to test the null hypothesis that two independent
estimates of variance, s₁² and s₂², are each estimates of the same unknown population variance σ². The procedure was to calculate F = s₁²/s₂², where s₁² is the larger of the two, and refer to table 4.15.1 or table A 14.
This problem arises also when the two estimates s₁² and s₂² are correlated. For instance, in the sample of pairs of brothers and sisters (section 7.1), we might wish to test whether brother heights, X₁, are more or less variable than sister heights, X₂. We can calculate s₁² and s₂², the variances of the two heights between families. But in our sample of 11 families the correlation between X₁ and X₂ was found to be r = 0.558. Although this did not reach the 5% level of r (0.602 for 9 df), the presence of a correlation was confirmed by Pearson and Lee's value of r = 0.553 for the sample of 1,401 families from which our data were drawn. In another application, a specimen may be sent to two laboratories that make estimates X₁, X₂ of the concentration of a rare element contained in it. If a number of specimens are sent, we might wish to examine whether one laboratory gives more variability in results than the other.
The test to be described is valid for a sample of pairs of values X₁, X₂ that follows a bivariate normal. It holds for any value p of the population correlation between X₁ and X₂. If you are confident that p is zero, the ordinary F-test should be used, since it is slightly more powerful. When p is not zero, the F-test is invalid.
The test is derived by an ingenious approach due to Pitman (15). Suppose that X₁ and X₂ have variances σ₁² and σ₂² and correlation p. The null hypothesis states that σ₁² = σ₂²; for the moment, we are not assuming that the null hypothesis is necessarily true. Since X₁ and X₂ follow a bivariate normal, it is known that D = X₁ - X₂ and S = X₁ + X₂ also follow a bivariate normal. Let us calculate the correlation p_DS between D and S. From section 7.9,

    σ_D² = σ₁² + σ₂² - 2pσ₁σ₂
    σ_S² = σ₁² + σ₂² + 2pσ₁σ₂
    Cov.(DS) = Cov.(X₁ - X₂)(X₁ + X₂) = σ₁² - σ₂²

since the two terms in Cov.(X₁X₂) cancel. Hence

    p_DS = (σ₁² - σ₂²)/√{(σ₁² + σ₂²)² - 4p²σ₁²σ₂²}

If φ = σ₁²/σ₂² is the variance-ratio of σ₁² to σ₂², this may be written

    p_DS = (φ - 1)/√{(φ + 1)² - 4p²φ}     (7.12.1)

Under the null hypothesis, φ = 1, so that p_DS = 0. If σ₁² > σ₂², then φ > 1 and p_DS is positive, while if σ₁² < σ₂², p_DS is negative.
Thus, the null hypothesis can be tested by finding D and S for each pair, computing the sample correlation coefficient r_DS, and referring to table A 11. A significantly positive value of r_DS indicates σ₁² > σ₂², while a significantly negative one indicates σ₁² < σ₂².
Alternatively, by the same method that led to equation (7.12.1), r_DS can be computed as
    r_DS = (F - 1)/√{(F + 1)² - 4r²F},     (7.12.2)

where F = s₁²/s₂² and r is the correlation between X₁ and X₂.
In a sample of 173 boys, aged 13-14, height had a standard deviation s₁ = 5.299, while leg length gave s₂ = 4.766, both figures being expressed as percentages of the sample means (16). The correlation between height and leg length was r = 0.878, a high value, as would be expected. To test whether height is relatively more variable than leg length, we have

    F = (5.299/4.766)² = 1.237

and from equation (7.12.2),

    r_DS = (0.237)/√{(2.237)² - 4(0.878)²(1.237)} = 0.237/1.136 = 0.209

with df = 173 - 2 = 171. This value of r_DS is significant at the 1% level, since table A 11 gives the 1% level as 0.208 for 150 df.
The above test is two-tailed: for a one-tailed test, use the 10% and 2% levels in table A 11.
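A minimal sketch of this calculation (our illustration, not from the book):

    import math

    def r_ds(F, r):
        # relation (7.12.2): the correlation of D and S from F and r
        return (F - 1) / math.sqrt((F + 1) ** 2 - 4 * r * r * F)

    F = (5.299 / 4.766) ** 2    # about 1.237
    print(r_ds(F, 0.878))       # about 0.21; refer to table A 11 with n - 2 df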
This approach also provides confidence limits for φ from a knowledge of F and r. The variates D' = (X₁/σ₁ - X₂/σ₂) and S' = (X₁/σ₁ + X₂/σ₂) are uncorrelated whether σ₁ equals σ₂ or not. The sample correlation coefficient between these variates, say R, therefore follows the usual distribution of a sample correlation when p = 0. As a generalization of formula 7.12.2, the value of R may be shown to be

    R = (F - φ)/√{(F + φ)² - 4r²Fφ}

In applying this result, it is easier to use the t-table than that of r. The value of t is

    t = (F - φ)√(n - 2)/[2√{(1 - r²)Fφ}]     (7.12.3)

If φ is much smaller than F, t becomes large and positive; if φ is much larger than F, t becomes large and negative. Values of φ that make t lie between the limits ± t₀.₀₅ form a 95% confidence interval. The limits found by solving (7.12.3) for φ are computed as

    φ = F{K ± √(K² - 1)},

where

    K = 1 + 2(1 - r²)t₀.₀₅²/(n - 2),     df for t₀.₀₅ = n - 2
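A minimal sketch of these confidence limits for the height and leg-length example (our illustration, not from the book; the value 1.97 for t₀.₀₅ with 171 df is an assumption read from a t-table):

    import math

    def phi_limits(F, r, n, t05):
        K = 1 + 2 * (1 - r * r) * t05 * t05 / (n - 2)
        root = math.sqrt(K * K - 1)
        return F * (K - root), F * (K + root)

    print(phi_limits(1.237, 0.878, 173, 1.97))   # roughly 1.07 to 1.43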

REFERENCES
1. K. PEARSON and A. LEE. Biometrika, 2:357 (1902-3).
2. A. E. BRANDT. Ph.D. Thesis, Iowa State University (1932).
3. A. T. CRATHORNE. Reorganization of Mathematics in Secondary Education, p. 105. Math. Assoc. of America, Inc. (1923).
4. F. N. DAVID. Tables of the Correlation Coefficient. Cambridge University Press (1938).
5. T. EDEN. J. Agric. Sci., 21:547 (1931).
6. J. M. EVVARD, M. G. SNELL, C. C. CULBERTSON, and G. W. SNEDECOR. Proc. Amer. Soc. Animal Production, p. 2 (1927).
7. E. S. HABER. Data from the Iowa Agric. Exp. Sta.
8. R. A. FISHER. Biometrika, 10:507 (1915).
9. R. A. FISHER. Metron, 1:3 (1921).
10. R. A. GROUT. Iowa Agric. Exp. Sta. Bul. 218 (1937).
11. C. SPEARMAN. Amer. J. Psych., 15:88 (1904).
12. M. G. KENDALL. Rank Correlation Methods, 2nd ed., Charles Griffin, London (1955).
13. S. T. DAVID, M. G. KENDALL, and A. STUART. Biometrika, 38:131 (1951).
14. J. L. LUSH. J. Agric. Res., 42:853 (1931).
15. E. J. G. PITMAN. Biometrika, 31:9 (1939).
16. A. A. MUMFORD and M. YOUNG. Biometrika, 15:108 (1923).
17. C. H. FISHER. Ann. Math. Statist., 4:103 (1933).
18. F. GALTON. Proc. Roy. Soc. London, 45:135 (1888).
19. G. UDNY YULE. J. Roy. Statist. Soc., 89:1 (1926).
20. E. W. LINDSTROM. Amer. Nat., 49:311 (1935).
CHAPTER EIGHT

Sampling from the binomial distribution
8.1-Introduction. In chapter 1 the sampling of attributes was used to introduce some common statistical terms and techniques: estimators, confidence intervals, the binomial distribution, tests of significance, and the chi-square test as applied to a simple proportion. We return to the sampling of attributes in order to fill in the mathematical background of these techniques. The binomial distribution and its relation to the normal distribution will be examined more thoroughly. Further, just as you learned how to compare the means of two normal samples, independent or paired, we shall study the comparison of two proportions from independent samples and from paired samples.
Suppose that an attribute is possessed by a proportion p of the mem-
bers of a population. A random sample of size n is drawn. The binomial
distribution gives a formula for the probability that the sample contains
exactly r members having the attribute. The formula is derived from some
rules in the theory of probability, now to be explained.
8.2-Some simple rules of probability. The study of probability
began around three hundred years ago. At that time, gambling and games
of chance had become a fashionable pastime, and there was much interest
in questions about the chance that a certain type of card would be drawn
from a pack, or that a die would fall in a certain way.
In a problem in probability, we are dealing with a trial, about to be made, that can have a number of different outcomes. A six-sided die, when thrown, may show any of the numbers 1, 2, 3, 4, 5, 6 face upward; these are the outcomes. Simpler problems in probability can often be solved by writing down all the different possible outcomes of the trial, and recognizing that these are equally likely. Suppose that the letters a, b, c, d, e, f, g are written on identical balls which are placed in a bag and mixed thoroughly. One ball is drawn out blindly. Most people would say without hesitation that the probability that an a is drawn is 1/7, because there are 7 balls, one of them is certain to be drawn, and all are equally likely. In general terms, this result may be stated as follows.
Rule 1. If a trial has k equally likely outcomes, of which one and only one will happen, the probability of any individual outcome is 1/k.

The claim that the outcomes are equally likely must be justified by
knowledge of the exact nature of the trial. For instance, dice to be used
in gambling for stakes are manufactured with care to ensure that they are
cubes of even density. They are discarded by gambling establishments
after a period of use, in case the wear, though not detectable by the naked
eye, has made the six outcomes no longer equally likely. The statement
that the probability is 1/52 of drawing the ace of spades from an ordinary
pack of cards assumes a thorough shuffling that is difficult to attain, par-
ticularly when the cards are at all worn.
In some problems the event in which we are interested will happen if any one of a specific group of outcomes turns up when the trial is made.
With the letters a, b, c, d, e, f, g, suppose we ask "what is the probability of drawing a vowel?" The event is now "A vowel is drawn." This will happen if either an a or an e is the outcome. Most people would say that the probability is 2/7, because there are 2 vowels present out of seven competing letters, and each letter is equally likely. Similarly, the probability that the letter drawn is one of the first four letters is 4/7. These
results are an application of a second rule of probability.

Rule 2. (The Addition Rule). If an event is satisfied by any one of a group of mutually exclusive outcomes, the probability of the event is the sum of the probabilities of the outcomes in the group.
In mathematical terminology, this rule is sometimes stated as:
    P(E) = P(O₁ or O₂ or ... or O_m) = P(O₁) + P(O₂) + ... + P(O_m),

where P(O_i) denotes the probability of the ith outcome.
Rule 2 contains one condition: the outcomes in the group must be mutually exclusive. This phrase means that if any one of the outcomes happens, all the others fail to happen. The outcomes "a is drawn" and "e is drawn" are mutually exclusive. But the outcomes "a vowel is drawn" and "one of the first four letters is drawn" are not mutually exclusive, because if a vowel is drawn, it might be an a, in which case the event "one of the first four letters is drawn" has also happened.
The condition of mutual exclusiveness is essential. If it does not
hold, Rule 2 gives the wrong answer. To illustrate, consider the prob-
ability that the letter drawn is either one of the first four letters or is a
vowel. Of the seven original outcomes, a, b, c, d, e, f, g, five satisfy the
event in question, namely a, b, c, d, e. The probability is given correctly
by Rule 2 as 5/7, because these five outcomes are mutually exclusive. But we might try to shortcut the solution by saying "The probability that one of the first four letters is drawn is 4/7 and the probability that a vowel is drawn is 2/7. Therefore, by Rule 2, the probability that one or the other of these happens is 6/7." This, you will note, is the wrong answer.
In leading up to the binomial distribution we have to consider the results of repeated drawings from a population. The successive trials or drawings are assumed independent of one another. This term means that the outcome of a trial does not depend in any way on what happens in the other trials.
With a series of trials the easier problems can again be solved by Rules 1 and 2. For example, a bag contains the letters a, b, c. In trial 1 a ball is drawn after thorough mixing. The ball is replaced, and in trial 2 a ball is again drawn after thorough mixing. What is the probability that both balls are a? First, we list all possible outcomes of the two trials. These are (a, a), (a, b), (a, c), (b, a), (b, b), (b, c), (c, a), (c, b), (c, c), where the first letter in a pair is the result of trial 1 and the second that of trial 2. Then we claim that these nine outcomes of the pair of trials are equally likely. Challenged to support this claim, we might say: (i) a, b, and c are equally likely at the first draw, because of the thorough mixing, and, (ii), at the second draw, the conditions of thorough mixing and of independence make all nine outcomes equally likely. The probability of (a, a) is therefore 1/9.
Similarly, suppose we are asked the probability that the two drawings contain no c's. This event is satisfied by four mutually exclusive outcomes: (a, a), (a, b), (b, a), and (b, b). Consequently, the probability (by Rule 2) is 4/9.
Both the previous results can be obtained more quickly by noticing that the probability of the combined event is the product of the probabilities of the desired events in the individual trials. In the first problem the probability of an a is 1/3 in the first trial and also 1/3 in the second trial. The probability that both events happen is 1/3 × 1/3 = 1/9. In the second problem, the probability of not drawing a c is 2/3 in each individual trial. The probability of the combined event (no c at either trial) is 2/3 × 2/3 = 4/9. A little reflection will show that the numerator of this product (1 or 4) is the number of equally likely outcomes of the two drawings that satisfy the desired combined event. The denominator, 9, is the total number of equally likely outcomes in the combined trials. The probabilities need not be equal at the two drawings. For example, the probability of getting an a at the first trial but not at the second is 1/3 × 2/3 = 2/9, the outcomes that produce this event being (a, b) and (a, c).
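These enumerations can be checked mechanically; a minimal sketch (our illustration, not part of the original text):

    from itertools import product

    outcomes = list(product("abc", repeat=2))   # the nine equally likely pairs
    p_both_a = sum(o == ("a", "a") for o in outcomes) / len(outcomes)
    p_no_c = sum("c" not in o for o in outcomes) / len(outcomes)
    print(p_both_a, p_no_c)                     # 1/9 and 4/9, as found above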

Rule 3. (The Multiplication Rule). In a series of independent trials, the probability that each of a specified series of events happens is the product of the probabilities of the individual events.
In mathematical terms,

    P(E₁ and E₂ and ... and E_m) = P(E₁)P(E₂) ... P(E_m)
In practice, the assumption that trials are independent, like the as-
sumption that outcomes are equally likely. must be justified by knowledge
of the circumstances of the trials. In complex probability problems there
have been disputes about the validity of these assumptions in particular
applications. and some interesting historical errors have occurred.
This account of probability provides only the minimum background
needed for working out the binomial distribution. Reference (1) is recom-
mended as a more thorough introduction to this important subject at an
elementary mathematical level.
EXAMPLE 8.2.1-A bag contains the letters A, b, c, D, e, f, G, h, I. If each letter is equally likely to be drawn, what is the probability of drawing: (i) a capital letter, (ii) a vowel, (iii) either a capital or a vowel. Ans. (i) 4/9, (ii) 1/3, (iii) 5/9. Does Rule 2 apply to the two events mentioned in (iii)?
EXAMPLE 8.2.2-Three bags contain, respectively, the letters a, b; c, d, e; f, g, h, i. A letter is drawn independently from each bag. Write down all 24 equally likely outcomes of the three drawings. Show that six of them give a consonant from each bag. Verify that Rule 3 gives the correct probability of drawing a consonant from each bag (1/4).
EXAMPLE 8.2.3-Two six-sided dice are thrown independently. Find the probability: (i) that the first die gives a 6 and the second at least a 3, (ii) that one die gives a 6 and the other at least a 3, (iii) that both give at least a 3, (iv) that the sum of the two scores is not more than 5. Ans. (i) 1/9, (ii) 2/9, (iii) 4/9, (iv) 5/18.
EXAMPLE 8.2.4-From a bag with the letters a, b, c, d, e a letter is drawn and laid aside, then a second is drawn. By writing down all equally likely pairs of outcomes, show that the probability that both letters are vowels is 1/10. This is a problem to which Rule 3 does not apply. Why not?
EXAMPLE 8.2.5-If two trials are not independent, the probability that event E₁ happens at the first trial and E₂ at the second is obtained (1) by a generalization of Rule 3: P(E₁ and E₂) = P(E₁)P(E₂, given that E₁ has happened). This last factor is called the conditional probability of E₂ given E₁, and is usually written P(E₂|E₁). Show that this rule gives the answer, 1/10, in example 8.2.4, where E₁, E₂ are the events of drawing a vowel at the first and second trials, respectively.

In many applications, the probability of a particular outcome must


be determined by a statistical study. For instance, insurance companies
are interested in the probability that a man aged sixty will live for the next
ten years. This quantity is calculated from national statistics of the age
distribution of males and of the age distribution of deaths of males, and is published in actuarial tables. Provided that the conditions of independence and of mutually exclusive outcomes hold where necessary, Rules 2 and 3 are applied to probabilities of this type also. Thus, the probability that three men aged sixty, selected at random from a population, will all survive for ten years would be taken as p³, where p is the probability that an individual sixty-year-old man will survive for ten years.

8.3-The binomial distribution. A proportion p of the members of a


population possess some attribute. A sample of size n = 2 is drawn. The
result of a trial is denoted by S (success) if the member drawn has the
attribute and by F (failure) if it does not. In a single drawing, p is the
TABLE 8.3.1
THE BINOMIAL DISTRIBUTION FOR n = 2

        (1)               (2)              (3)              (4)
    Outcomes of Trial
     1        2        Probability   No. of Successes   Probability

     F        F            qq               0                q²
     F        S            qp       }
     S        F            pq       }       1                2pq
     S        S            pp               2                p²

    Total                                                     1

probability of obtaining an S, while q = 1 - p is the probability of obtaining an F. Table 8.3.1 shows the four mutually exclusive outcomes of the two drawings, in terms of successes and failures.
The probabilities given in column (2) are obtained by applying Rule 3 to the two trials. For example, the probability of two successive F's is qq, or q². This assumes, of course, that the two trials are independent, as is necessary if the binomial distribution is to hold. Coming to the third column, we are now counting the number of successes. Since the two middle outcomes, FS and SF, both give 1 success, the probability of 1 success is 2pq by Rule 2. The third and fourth columns present the binomial distribution for n = 2. As a check, the probabilities in columns 2 and 4 each add to unity, since

    q² + 2pq + p² = (q + p)² = (1)² = 1

TABLE 8.3.2
THE BINOMIAL DISTRIBUTION FOR n = 3

        (1)                   (2)              (3)              (4)
    Outcomes of Trial
     1      2      3       Probability   No. of Successes   Probability

     F      F      F           qqq              0                q³
     F      F      S           qqp      }
     F      S      F           qpq      }       1                3pq²
     S      F      F           pqq      }
     F      S      S           qpp      }
     S      F      S           pqp      }       2                3p²q
     S      S      F           ppq      }
     S      S      S           ppp              3                p³
In the same way, table 8.3.2 lists the eight relevant outcomes for n = 3. The probabilities in the second and fourth columns are obtained by Rules 3 and 2 as before. Three outcomes provide 1 success, with total probability 3pq², while three provide 2 successes with total probability 3p²q. Check that the eight outcomes in the first column are mutually exclusive.
The general structure of the binomial formula is now apparent. The formula for the probability of r successes in n trials has two parts. One part is the term p^r q^(n-r). This follows from Rule 3, since any outcome of this type must have r S's and (n - r) F's in the set of n draws. The other part is the number of mutually exclusive ways in which the r S's and the (n - r) F's can be arranged. In algebra this term is called the number of combinations of r letters out of n letters. It is denoted by the symbol C(n, r). The formula is

    C(n, r) = n(n - 1)(n - 2) ... (n - r + 1) / {r(r - 1)(r - 2) ... (2)(1)}

For small samples these quantities, the binomial coefficients, can be


written down by an old device known as Pascal's triangle, shown in table
8.3.3.
Each coefficient is the sum of the two just above it to the right and
the left. Thus, for n = 8, the number 56 = 21 + 35. Note that for any
n the coefficients are symmetrical, rising to a peak in the middle.
Putting the two parts together, the probability of r successes in a sample of size n is

    C(n, r) p^r q^(n-r) = [n(n - 1)(n - 2) ... (n - r + 1)/{r(r - 1)(r - 2) ... (2)(1)}] p^r q^(n-r)

These probabilities are the successive terms in the expansion of the binomial expression (q + p)^n. This fact explains why the distribution is called binomial, and also verifies that the sum of the probabilities is 1, since (q + p)^n = (1)^n = 1.
TABLE 8.3.3
BINOMIAL COEFFICIENTS GIVEN BY PASCAL'S TRIANGLE

    Size of Sample n          Binomial Coefficients

    1            1   1
    2            1   2   1
    3            1   3   3   1
    4            1   4   6   4   1
    5            1   5  10  10   5   1
    6            1   6  15  20  15   6   1
    7            1   7  21  35  35  21   7   1
    8            1   8  28  56  70  56  28   8   1
    etc.

FIG. 8.3.1-Binomial distributions for n = 8. Top: p = 0.2. Middle: p = 0.5. Bottom: p = 0.9. (Each panel plots the probability of each number of successes, 0 to 8.)

For n = 8, figure 8.3.1 shows these distributions for p = 0.2, 0.5, and 0.9. The distribution is positively skew for p less than 0.5 and negatively skew for p greater than 0.5. For p = 0.5 the general shape, despite the discreteness, bears some resemblance to a normal distribution.
Reference (16) contains extensive tables of individual and cumulative terms of the binomial distribution for n up to 49; reference (17) has cumulative terms up to n = 1,000.
8.4-Sampling the binomial distribution. As usual, you will find it instructive to verify the preceding theory by sampling. The table of random digits (table A 1, p. 543) is very convenient for drawing samples from the binomial with n = 5, since the digits in a row are arranged in groups of 5. For instance, to sample the binomial with p = 0.2, let the digits 0 and 1 represent a success, and all other digits a failure. By recording the total number of 0's and 1's in each group of 5, many samples from n = 5, p = 0.2 can be drawn quickly. Table 8.4.1 shows the results of 100 drawings of this type, and illustrates a common method of tallying the results. A slanting line is used at every fifth tally, so that a completed group of tally marks represents 5 drawings of a particular number of successes.
To fit the corresponding theoretical distribution, first calculate the terms p^r q^(n-r). For r = 0 (no successes) this is q⁵ = (0.8)⁵ = 0.32768. For r = 1, it is pq⁴ = (0.2)(0.8)⁴. To obtain a shortcut, notice that this term
TABLE 8.4.1
TALLYING OF 100 DRAWINGS FROM THE BINOMIAL WITH n = 5, p = 0.2

    No. of Successes    Total

    0                     32
    1                     44
    2                     17
    3                      6
    4                      1
    5                      0

                         100

(In the original table, each count is displayed as tally marks grouped in fives.)

cal] be written: (q")(P/q). It is computed from the previous term by


multiplying by p/q = 0.2/0.8 = 1/4. Thus for r = 1 the term is
(0.32768)/4 = 0.08192. Similarly, the term for r = 2, p'q"-', is found by
multiplying the term for r = 1 by (P/q), and so on for each successive term.
The details appear in table 8.4.2. The binomial coefficients are read
from Pascal's triangle. These coefficients and the terms in p'q"-' are
multiplied to give the theoretical probabilities of 0, I, 2, ... 5 successes.
Finally, since N = 100 samples were drawn, we multiply each probability
by 100 to give the expected frequencies of 0, I, 2, ... 5 successes.

TABLE 8.4.2
FITTING THE THEORETICAL BINOMIAL FOR n = 5, p = 0.2

    No. of          Term          Binomial     C(n, r) p^r q^(n-r)   Expected    Observed
    Successes (r)   p^r q^(n-r)   Coefficient                       Frequency   Frequency

    0               0.32768           1             0.32768           32.77        32
    1               0.08192           5             0.40960           40.96        44
    2               0.02048          10             0.20480           20.48        17
    3               0.00512          10             0.05120            5.12         6
    4               0.00128           5             0.00640            0.64         1
    5               0.00032           1             0.00032            0.03         0

                                                    1.00000          100.00       100
Because of sampling variation, the expected and observed frequencies
do not agree exactly, but their closeness is reassuring. Later (section 9.4)
a method is given for testing whether the observed and expected fre-
quencies differ by no more than is usual from sampling variation. In the
present example, the agreement is in fact better than is usually found in
such sampling experiments (example 9.4.1).
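Readers with a computer can repeat the experiment directly; a minimal sketch (our illustration, not from the book, with pseudo-random numbers standing in for table A 1):

    import random
    from math import comb

    random.seed(1)                 # any seed; results vary between runs
    n, p, N = 5, 0.2, 100
    observed = [0] * (n + 1)
    for _ in range(N):
        r = sum(random.random() < p for _ in range(n))
        observed[r] += 1

    for r in range(n + 1):
        expected = N * comb(n, r) * p**r * (1 - p)**(n - r)
        print(r, observed[r], round(expected, 2))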
EXAMPLE 8.4.1-With n = 2, p = 1/2, show that the probability of one success is 1/2. If p differs from 1/2, does the probability of one success increase or decrease?
EXAMPLE 8.4.2-A railway company claims that 95% of its trains arrive on time. If a man travels on three of these trains, what is the probability that: (i) all three arrive on time, (ii) one of the three is late, assuming that the claim is correct. Ans. (i) 0.857, (ii) 0.135.
EXAMPLE 8.4.3-Assuming that the probability that a child is male is 1/2, find the probability that in a family of 6 children there are: (i) no boys, (ii) exactly 3 boys, (iii) at least 2 girls, (iv) at least one girl and 1 boy. Ans. (i) 1/64, (ii) 5/16, (iii) 57/64, (iv) 31/32.
EXAMPLE 8.4.4-Work out the terms of the binomial distribution for n = 4, p = 0.4. Verify that: (i) the sum of the terms is unity, (ii) 1 and 2 successes are equally probable, (iii) 0 successes is about five times as probable as 4 successes.
EXAMPLE 8.4.5-By extending Pascal's triangle, obtain the binomial coefficients for n = 10. Hence compute and graph the binomial distribution for n = 10, p = 1/2. Does the shape appear similar to a normal distribution? Hint: when p = 1/2, the term p^r q^(n-r) = 1/2^n for any r. Since 2¹⁰ = 1,024 ≈ 1,000, the distribution is given accurately enough for graphing by simply dividing the binomial coefficients by 1,000.

8.5-Mean and standard deviation of the binomial distribution. If

    f_r = [n(n - 1) ... (n - r + 1)/{r(r - 1) ... (2)(1)}] p^r q^(n-r)

denotes the binomial probability of r successes in a sample of size n, the mean and variance of the distribution of the number of successes r are defined by the equations

    μ = Σ r f_r ,    σ² = Σ (r - μ)² f_r ,     (8.5.1)

the sums extending over r = 0, 1, ... n. Note the formula for σ². In a theoretical distribution, σ² is the average value of the squared deviation from the population mean. Each squared deviation, (r - μ)², is multiplied by its relative frequency of occurrence f_r. The concept of number of degrees of freedom does not come in.
By algebra, it is found from (8.5.1) that

    μ = np ;    σ² = npq ;    σ = √(npq)     (8.5.2)

These results apply to the number of successes. Often, interest centers in the proportion of successes, r/n. For this,

    μ = p ;    σ² = pq/n ;    σ = √(pq/n)     (8.5.3)

Sometimes results are presented in terms of the percentage of successes, 100r/n. Formulas (8.5.3) also hold for the percentage of successes if p now stands for the percentage in the population and q = 100 - p.
As illustrations, the formulas work out as follows for n = 64, p = 0.2:

    Number:     μ = (64)(0.2) = 12.8 ;   σ = √{(64)(0.2)(0.8)} = √10.24 = 3.2
    Proportion: μ = 0.2 ;                σ = √{(0.2)(0.8)/64} = √0.0025 = 0.05
    Percentage: μ = 20 ;                 σ = √{(20)(80)/64} = √25 = 5
For a sample of fixed size n, the standard deviations √(npq) for the number of successes and √(pq/n) for the proportion of successes are greatest when p = 1/2. As p moves towards either 0 or 1, the standard deviation declines, though quite slowly at first, as the following table of √(pq) shows.

    p        0.5    0.4 or 0.6    0.3 or 0.7    0.2 or 0.8    0.1 or 0.9

    √(pq)   0.500     0.490         0.458         0.400         0.300

EXAMPLE 8.5.1-For the binomial distribution of the number of successes with n = 2 (given in table 8.3.1, p. 203), verify from formulas (8.5.1) that μ = 2p, σ² = 2pq.
EXAMPLE 8.5.2-For the binomial distribution with n = 5, p = 0.2, given in table 8.4.2, compute Σrf_r and Σ(r - μ)²f_r and verify that the results are μ = 1 and σ² = 0.80.
EXAMPLE 8.5.3-For n = 96, p = 0.4, calculate the S.D.'s of: (i) the number, (ii) the percentage of successes. Ans. (i) 4.8, (ii) 5.
EXAMPLE 8.5.4-An investigator intends to estimate, by random sampling from a large file of house records, the percentage of houses in a town that have been sold in the last year. He thinks that p is about 10% and would like the standard deviation of his estimated percentage to be about 1%. How large should n be? Ans. 900 houses.
There is an easy way of obtaining the results μ = p and σ² = pq/n for the distribution of the proportion of successes r/n in a sample of size n. Attach the number 1 to every success in the population and the number 0 to every failure. Instead of thinking of the population as a large collection of the letters S and F, we think of it as a large collection of 1's and 0's. It is the population distribution of a variable X that takes only two values: 1 with relative frequency p and 0 with relative frequency q. The population mean and variance of the new variate X are easily found by working out the definitions (8.5.1),

    μ_X = ΣXf_X ,    σ² = Σ(X - μ)²f_X

where the sum extends only over the two values X = 0 and X = 1, as shown below:

    X     f_X     Xf_X     X - μ     (X - μ)²     (X - μ)²f_X

    0      q        0       -p          p²            p²q
    1      p        p      1 - p        q²            q²p

                  μ = p                      σ² = p²q + q²p = pq

The variate X has population mean p and population variance pq.
Now draw a random sample of size n. If the sample contains r successes, then ΣX, taken over the sample, is r, so that X̄ = ΣX/n is r/n, the sample proportion of successes. But we know that the mean of a random sample from any distribution is an unbiased estimate of the population mean, and has variance σ²/n (section 2.11). Hence X̄ = r/n is an unbiased estimate of p, with variance σ²/n = pq/n.

FIG. 8.6.1-The solid vertical lines show the binomial distribution of the number of successes for n = 10, p = 0.5. The curve is the normal approximation to this distribution, which has mean np = 5 and S.D. √(npq) = 1.581.

Further, since X̄ = r/n is the mean of a sample from a population that has a finite variance pq, we can quote the Central Limit Theorem (section 2.12). This states that the mean X̄ of a random sample from any population with finite variance tends to normality. Hence, as n increases, the binomial distribution of r/n or of r approaches the normal distribution. For p = 0.5 the normal is a good approximation when n is as low as 10. As p approaches 0 or 1, some skewness remains in the binomial distribution until n is large.
8.6-The normal approximation and the correction for continuity. The solid vertical lines in figure 8.6.1 show the binomial distribution of r for n = 10, p = 0.5. Also shown is the approximating normal curve, with mean np = 5 and S.D. √(npq) = 1.581. The normal seems a good approximation to the shape of the binomial.
One difference, however, is that the binomial is discrete, having probability only at the values r = 0, 1, 2, ... 10, while the normal has probability in any interval from -∞ to +∞. This raises a problem: in estimating the binomial probability of, say, 4 successes, what part of the normal curve do we use as an approximation? We need to set up a correspondence between the set of binomial ordinates and the areas under the normal curve.
The simplest way of doing this is to regard the binomial as a grouping of the normal into unit class intervals. Under this rule the binomial ordinate at 4 corresponds to the area under the normal curve from 3½ to 4½. The ordinate at 5 corresponds to the area from 4½ to 5½, and so on. The ordinate at 10 corresponds to the normal area from 9½ to ∞. These class boundaries are the dotted lines in figure 8.6.1.
In the commonest binomial problems we wish to calculate the probabilities at the ends of the distribution; for instance, the probability of 8 or more successes. The exact result, found by adding the binomial probabilities for r = 8, 9, 10, is 56/1024 = 0.0547. Under our rule, the corresponding area under the normal curve is the area from 7½ to ∞, not the area from 8 to ∞. The normal deviate is therefore z = (7.5 - 5)/1.581, which by a coincidence is also 1.581. The approximate probability from the normal table is P = 0.0570, close enough to 0.0547. Use of z = (8 - 5)/1.581 gives P = 0.0288, a poor result.
Similarly, the probability of 4 or fewer successes is approximated by the area of the normal curve from -∞ to 4½. The general rule is to decrease the absolute value of (r - np) by ½. Thus,

    z_c = (|r - np| - ½)/√(npq)

The subtraction of ½ is called the correction for continuity. It is simple to apply and usually improves the accuracy of the normal approximation, although when n is large it has only a minor effect.
If you are working in terms of r/n instead of r, then

    z_c = (|r/n - p| - 1/2n)/√(pq/n)
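A minimal sketch of the 8-or-more calculation (our illustration, not from the book; NormalDist requires Python 3.8 or later):

    from math import comb
    from statistics import NormalDist

    n, p = 10, 0.5
    exact = sum(comb(n, r) * p**r * (1 - p)**(n - r) for r in range(8, n + 1))

    mu, sd = n * p, (n * p * (1 - p)) ** 0.5
    z = (7.5 - mu) / sd                    # 7.5, not 8: the continuity correction
    print(exact, 1 - NormalDist().cdf(z))  # 0.0547 and about 0.0570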
EXAMPLE 8.6.1-For n = 10, p = 1/2, calculate: (i) the exact probability of 4 or fewer successes, and the normal approximation, (ii) corrected for continuity, (iii) uncorrected. Ans. (i) 0.377, (ii) 0.376, (iii) 0.263.
EXAMPLE 8.6.2-In a sample of size 49 with p = 0.2, the expected number of successes is 9.8. An investigator is interested in the probability that the observed number of successes will be (i) 15 or more, or (ii) 5 or less. Estimate these two probabilities by the corrected normal approximation. Ans. (i) 0.0466, (ii) 0.0623. The exact answers by summing the binomial are: (i) 0.0517, (ii) 0.0547. Because of the skewness (p = 0.2), the normal curve underestimates in the long tail and overestimates in the short tail. For the sum of the two tails the normal curve does better, giving 0.1089 as against the exact 0.1064.
EXAMPLE 8.6.3-With n = 16, p = 0.9, estimate by the normal curve the probability that 16 successes are obtained. The exact result is, of course, (0.9)¹⁶ = 0.185. Ans. 0.180.

8.7-Confide";;. limits for. proportion. If, members out of a sample


of size n are found to possess some attribute. the sample estimate oC the
proportion in the population possessing this attribute is p = ,111. In large
samples, as we have seen, 'the binomial estimate p is approximately
normally distributed about the population proportion p with standard
deviation ,j(pq/n). For the true but unknown standard deviation j(pq/n)
we substitute the sample estimate J(pqln). Hence, the probability is
approximately 0.95 that p lies between the limits
p - 1.96.J(Ptl/II) and p + 1.96J(pqln)
But this statement is equivalent to saying that p lies between
fJ - 1.96J(pqfn) and p + 1.96.J(Mln) (8.7.1)
unless we were unfortunate in drawing one of the extreme samples that
turns up once in twenty times. The limits 8.7.1 are therefore the ap-
proximate 95% confidence limits for p.
For example, suppose that 200 individuals in a sample of 1,000
possess the attribute. The 95% confidence limits for p are

0.2 ± 1.96√{(0.2)(0.8)/1000} = 0.2 ± 0.025
The confidence interval extends from 0.175 to 0.225; that is, from 17.5%
to 22.5%. Limits corresponding to other confidence probabilities are of
course obtained by inserting the appropriate values of the normal deviate
z. For 99% limits, we replace 1.96 by 2.576.
If the above reasoning is repeated with the correction for continuity
included, the 95% limits for p become

p̂ ± {1.96√(p̂q̂/n) + 1/2n}
The correction is easily applied. It amounts to widening the limits a
little. We recommend that the correction be used as a standard practice,
although it makes little difference when n is large. To illustrate the cor-
rection in a smaller sample, suppose that 10 families out of 50 report
ownership of more than one car, giving p̂ = 0.2. The 95% confidence
limits for p are

0.2 ± {1.96√(0.16/50) + 0.01} = 0.2 ± 0.12,
or .08 and .32. More exact limits for this problem, computed from the
binomial distribution itself, were presented in table 1.4.1 (p. 6) as 0.10
and 0.34. The normal approximation gives the correct width of the
interval, 0.24, but the normal limits are symmetrical about p̂, whereas the
correct limits are displaced upwards because an appreciable amount of
skewness still remains in the binomial when n = 50 and p is not near 1/2.
If you prefer to express p̂ and p in percentages, the 95% limits are

p̂ ± {1.96√[p̂(100 − p̂)/n] + 50/n}

You may verify that this formula gives 8% and 32% as the limits in the
above problem.
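
The corrected limits are easy to script. The sketch below is ours (the function name is an assumption, not the authors'); it reproduces the two worked cases of this section:

```python
import math

def binomial_limits(r, n, z=1.96):
    # 95% limits for p, widened by the 1/2n correction for continuity
    p_hat = r / n
    h = z * math.sqrt(p_hat * (1 - p_hat) / n) + 1 / (2 * n)
    return p_hat - h, p_hat + h

print(binomial_limits(10, 50))     # about (0.08, 0.32)
print(binomial_limits(200, 1000))  # about (0.175, 0.225)
```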
8.8-Test of significance of a binomial proportion. The normal ap-
proximation is useful also in testing the null hypothesis that the population
proportion of successes has a known value p. If the null hypothesis is
true, p̂ is distributed approximately normally with mean p and S.D.
√(pq/n). With the correction for continuity, the normal deviate is

z_c = (|p̂ − p| − 1/2n)/√(pq/n) = (|r − np| − ½)/√(npq)
This can be referred to the normal tables to compute the probability of
getting a sample proportion as divergent as the observed one.
To take an example considered in chapter 1, a physician found 480
men and 420 women among 900 admitted to a hospital with a certain

disease. Is this result consistent with the hypothesis that in the population
of hospital patients, half the cases are male? Taking r as the number of
males,

z_c = (|480 − 450| − ½)/√{(900)(½)(½)} = 29.5/15 = 1.967
Since the probability is just on the 5% level, the null hypothesis is rejected
at this level.
If the alternative hypothesis is one-tailed, for instance that more than
half the hospital patients are male, only one tail of the normal distribution
is used. For this alternative the null hypothesis in the example is rejected
at the 2½% level.
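
A short sketch (ours; phi and test_proportion are assumed names) verifies the hospital calculation:

```python
import math

def phi(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def test_proportion(r, n, p):
    # Continuity-corrected normal deviate for testing a binomial proportion
    z_c = (abs(r - n * p) - 0.5) / math.sqrt(n * p * (1 - p))
    return z_c, 2 * (1 - phi(z_c))   # two-tailed significance probability

z_c, p_val = test_proportion(480, 900, 0.5)
print(f"z_c = {z_c:.3f}, two-tailed P = {p_val:.4f}")  # z_c = 1.967, P = 0.0492
```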
In sections 1.10-1.12 you were given another method of testing a null
hypothesis about p by means of chi-square with 1 degree of freedom. In
the notation of chapter 1,

χ² = Σ (Obs. − Exp.)²/Exp. = Σ (f − F)²/F,

the sum being taken over the two classes, male and female. The χ² test
is exactly the same as the two-tailed z test, except that the above formula
for χ² contains no correction for continuity. To show the relationship,
we need to translate the notation of chapter 1 into the present notation, as
follows:
                     Notation of Chapter 1        Present Notation

Class                                             Males        Females

Observed nos.:       f                            r            n − r
Expected nos.:       F                            np           nq
Obs. − Exp.:         f − F                        r − np       −(r − np)

Hence,

χ² = Σ (f − F)²/F = (r − np)²/np + (r − np)²/nq

   = {(r − np)²/npq}(q + p) = (r − np)²/npq = z²,
since the normal deviate z = (r − np)/√(npq) if no correction for continu-
ity is used. Further, the χ² distribution, with 1 d.f., is the distribution of
the square of a normal deviate: the 5% significance level of χ², 3.84, is
simply the square of 1.96. Thus, the two tests are identical.
To correct χ² for continuity, we use the square of z, corrected for
continuity,

χ_c² = (|r − np| − ½)²/npq

As with z, we recommend that the correction be applied routinely. For
one-sided alternatives the z method is preferable, since χ² takes no ac-
count of the sign of (r − np) and is basically two-sided.
EXAMPLE 8.8.1-Two workers A and B perform a task in which carelessness leads to
minor accidents. In the first 20 accidents, 13 happened to A and 7 to B. In a previous ex-
ample (1.15.1) you were asked to calculate χ² for testing the null hypothesis that A and B
are equally likely to have accidents, the answer being χ² = 1.8, with P about 0.18. Re-
calculate χ² and P, corrected for continuity. Ans. χ_c² = 1.25, P slightly greater than 0.25.
EXAMPLE 8.8.2-A question that is asked occasionally is whether the 1/2 correction
should be applied in χ_c² if |r − np| is less than 1/2. This happens, for instance, if r = 6,
n = 25 and the null hypothesis is p = 1/4, because np = 6.25 and |r − np| = 0.25. Strictly,
the answer in such cases is that the corrected value of χ² is zero. When n = 25, the result
r = 6 is the sample result that gives the closest possible agreement with the null hypothesis,
np = 6.25. Hence, all possible samples with n = 25 give results at least as divergent from the
null hypothesis. The significance P is therefore 1, corresponding to χ² = 0.

8.9-The comparison of proportions in paired samples. A comparison
of two sample proportions may arise either in paired or in independent
samples. To illustrate paired samples, suppose that a lecture method is
being compared with a method that uses a machine for programmed
learning but no lecture, the objective being to teach workers how to per-
form a rather complicated operation. The workers are first grouped into
pairs by means of an initial estimate of their aptitudes for this kind of task.
One member of each pair is assigned at random to each method. At the
end, each student is tested to see whether he succeeds or fails in a test on
the operation.
With 100 pairs, the results might be presented as follows:
        Result for Method
        A       B               No. of Pairs

        S       S               52
        S       F               21
        F       S                9
        F       F               18

        Total                  100

In 52 pairs, both workers succeeded in the test; in 21 pairs, the


worker taught by method A succeeded, but his partner taught by method
B failed, and so on.
As a second illustration (2), different media for growing diphtheria
bacilli were compared. Swabs were taken from the throats of a large
number of patients with symptoms suggestive of the presence of diphtheria
bacilli. From each swab, a sample was grown on each medium. After
allowing time for growth, each culture was examined for the presence or
absence of the bacilli. A successful medium is one favorable to the
growth of the bacilli so that they are detected. This is an example of self-
pairing, since each medium is tested on every patient. It is also an example
in which a large number of FFs would be expected, because diphtheria
is now rare and many patients would actually have no diphtheria bacilli in
their throats.
Consider first the test of significance of the null hypothesis that the
proportion of successes is the same for the two methods or media. The
SS and FF pairs are ignored in the test of significance, since they give no
indication in favor of either A or B. We concentrate on the SF and FS
pairs. If the null hypothesis is true, the population must contain as many
SF as FS pairs. In the numerical example there are 21 + 9 = 30 pairs of
the SF or FS types. Under the null hypothesis we expect 15 of each type
as against 21 and 9 observed.
Hence, the null hypothesis is tested by either the χ² or the z test of
the preceding section. (In the z test we take n = 30, r = 21, p = 1/2.)
When p = 1/2, χ_c² takes the particularly simple form (section 5.4),

χ_c² = (|21 − 9| − 1)²/30 = 121/30 = 4.03,

with 1 d.f. The null hypothesis is rejected at the 5% level (3.84). Method
A has given a significantly higher proportion of successes. Remember
that in this test, the denominator of χ² is always the total number of SF
and FS pairs. This test is the same as the sign test (section 5.4).
The investigator will also be interested in the actual percentages of
successes given by the two methods. These were: 52 + 21 = 73% for A
and 52 + 9 = 61% for B. If the task is exceptionally difficult, he might
conclude that although A is significantly better than B, both methods are
successful enough to be useful. In other circumstances, he might report
that neither method is satisfactory. This might be the case if A and B
were two new techniques for predicting some feature of the weather, and
if standard techniques were known to give more than 85% successes.
When there is clearly a difference between the performances of the
two methods, we may wish to report this difference, (73% − 61%) = 12%,
along with its standard error. Let

p_SF = proportion of SF pairs = 21/100 = 0.21

p_FS = proportion of FS pairs = 9/100 = 0.09

When the difference is expressed in percentages (12%), a simple formula for
its standard error is

S.E. = 100 √{[p_SF + p_FS − (p_SF − p_FS)²]/n}

     = 100 √{[0.21 + 0.09 − (0.21 − 0.09)²]/100}

     = 10√0.2856 = 5.3
If the difference is expressed in proportions, the factor 100 is omitted.
Note: If you record only that A gave 73 successes and B gave 61
successes out of 100, the test of significance in paired data cannot be made
from this information alone. The classification of the results for the in-
dividual pairs must be available.
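
Both calculations of this section fit in a few lines of Python. This sketch is ours (the function name is an assumption); it takes the counts of SF and FS pairs and the total number of pairs:

```python
import math

def paired_comparison(n_sf, n_fs, n_pairs):
    # Continuity-corrected chi-square on the SF and FS pairs only
    chi_c = (abs(n_sf - n_fs) - 1) ** 2 / (n_sf + n_fs)
    # Standard error (in percent) of the difference between the
    # two percentages of success
    p_sf, p_fs = n_sf / n_pairs, n_fs / n_pairs
    se = 100 * math.sqrt((p_sf + p_fs - (p_sf - p_fs) ** 2) / n_pairs)
    return chi_c, se

chi_c, se = paired_comparison(21, 9, 100)
print(f"chi-square = {chi_c:.2f}, S.E. of difference = {se:.1f}%")  # 4.03, 5.3%
```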
8.10-Comparison of proportions in two independent samples: the
2 × 2 table. This problem occurs very often in investigative work. Many
controlled experiments which compare two procedures or treatments are
carried out with independent samples, because no effective way of pairing
the subjects or animals is known to the investigator. Comparison of
proportions in different groups is also common in non-experimental
studies. A manufacturer compares the proportions of defective articles
found in two separate sources of supply from which he buys these articles,
or a safety engineer compares the proportions of head injuries sustained
in automobile accidents by passengers with seat belts and those without
seat belts.
Alternatively, a single sample may be classified according to two dif-
ferent attributes. The data used to illustrate the calculations come from
a large Canadian study (3) of the relation between smoking and mor-
tality. By an initial questionnaire in 1956, male recipients of war pensions
were classified according to their smoking habits. We shall consider two
classes: (i) non-smokers and (ii) those who reported that they smoked
pipes only. For any pensioner who died during the succeeding six years,
a report of the death was obtained. Thus, the pensioners were classified
also according to their status (dead or alive) at the end of six years. Since
the probability of dying depends greatly on age, the comparison given
here is confined to men of a single age group at the beginning of the study.
The numbers of men falling in the four classes are given in table 8.10.1, called
a 2 x 2 contingency table.
It will be noted that 11.0% of the non-smokers had died, as against
13.4% of the pipe smokers. Can this difference be attributed to sampling
error, or does it indicate a real difference in the death rates in the two
groups? The null hypothesis is that the proportions dead, 117/1067 and
54/402, are estimates of the same quantity.
The test can be performed by χ². As usual,

χ² = Σ (f − F)²/F,

TABLE 8.10.1
MEN CLASSIFIED BY SMOK.ING HABIT AND MORTALITY IN SIX YEARS

Non-smokers Pipe Smokers Total

Dead 117 54 171


Alive 950 348 1,298

Total 1,067 402 1,469

% dead 11.0 13.4

where the f's are the observed numbers 117, 950, 54, 348 in the four cells.
The F's are the numbers that would be expected in the four cells if the
null hypothesis were true.
The F's are computed as follows. If the proportions dead are the
same for the two smoking classes, our best estimate of this proportion is
the proportion, 171/1469, found in the combined sample. Since there are
1067 non-smokers, the expected number dead, on the null hypothesis, is
(1067)(171)/1469 = 124.2

The rule is: to find the expected number in any cell, multiply the cor-
responding column and row totals and divide by the grand total. The
expected number of non-smokers who are alive is
(1067)(1298)/1469 = 942.8,
and so on. Alternatively, having calculated 124.2 as the expected number
of non-smokers who are dead, the expected number alive is found more
easily as 1067 - 124.2 = 942.8. Similarly, the expected number of pipe
smokers who are dead is 171 - 124.2 = 46.8. Finally, the expected num-
ber of pipe smokers who are alive is 402 - 46.8 = 355.2. Thus, only one
expected number need be calculated; the others are found by subtraction.
The observed numbers, expected numbers, and the differences (/ - F)
appear in table 8.10.2.
Except for their signs, all four deviations (f - F) are equal. This result
holds in any 2 x 2 table.
TABLE 8.10.2
VALUES OF f (OBSERVED), F (EXPECTED), AND (f − F) IN THE FOUR CELLS

        f                 F                  f − F
  117       54      124.2     46.8       −7.2     +7.2
  950      348      942.8    355.2       +7.2     −7.2

Since (f − F)² is the same in all cells, χ² may be written

χ² = (f − F)² Σ (1/F_i)                                  (8.10.1)

   = (7.2)²(1/124.2 + 1/46.8 + 1/942.8 + 1/355.2)

   = (51.84)(0.0333) = 1.73

A table of reciprocals is useful in this calculation, since the four reciprocals
can be added directly.
How many degrees of freedom has χ²? Since all four deviations are
the same except for sign, this suggests that χ² has only 1 d.f., as was
proved by Fisher. With 1 d.f., table A 5 shows that a value of χ² greater
than 1.73 occurs with probability about 0.20. The observed difference
in proportion dead between the non-smokers and pipe smokers may well
be due to sampling errors.
The above χ² has not been corrected for continuity. A correction is
appropriate because the exact distribution of χ² in a 2 × 2 table is discrete.
With the same four marginal totals, the two sets of results that are closest
to our observed results are as follows:
          (i)                           (ii)
  118     53     171            116     55     171
  949    349    1298            951    347    1298
 1067    402                   1067    402

f − F = ±6.2                   f − F = ±8.2

Since the expected values do not change, the values (f − F) are ±6.2 in
(i) and ±8.2 in (ii), as against ±7.2 in our data. Thus, in the exact dis-
tribution of χ² the values of |f − F| jump by unity. The correction for
continuity is made by deducting 0.5 from |f − F|. The formula for cor-
rected χ² is

χ_c² = (|f − F| − 0.5)² Σ (1/F_i)                        (8.10.2)

     = (6.7)²(0.0333) = 1.49
The corrected P is about 0.22, little changed in this example because the
samples are large. In small samples the correction makes a substantial
difference.
Some workers prefer an alternative formula for computing χ². The
2 × 2 table may be represented in this way:

   a        b        a + b
   c        d        c + d
 a + c    b + d      N = a + b + c + d

χ_c² = N(|ad − bc| − N/2)²/{(a + b)(c + d)(a + c)(b + d)}        (8.10.3)
The subtraction of N/2 represents the correction for continuity.
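
Formula 8.10.3 is convenient to program. The sketch below is ours (the function name is an assumption); on the data of table 8.10.1 it reproduces both the corrected and the uncorrected values:

```python
def chi_square_2x2(a, b, c, d, corrected=True):
    # Formula 8.10.3; subtracting N/2 is the correction for continuity
    n = a + b + c + d
    diff = abs(a * d - b * c) - (n / 2 if corrected else 0)
    return n * diff ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Table 8.10.1: dead/alive by non-smokers/pipe smokers
print(f"{chi_square_2x2(117, 54, 950, 348, corrected=False):.2f}")  # 1.73
# Corrected value: 1.50 here; the text's 1.49 comes from rounded reciprocals
print(f"{chi_square_2x2(117, 54, 950, 348):.2f}")
```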
In interpreting the results of these χ² tests in non-experimental
studies, caution is necessary, particularly when χ² is significant. The two
groups being compared may differ in numerous ways, some of which
may be wholly or partly responsible for an observed significant difference.
For instance, pipe smokers and non-smokers may differ to some extent
in their economic levels, residence (urban or rural), and eating and drink-
ing habits, and these variables may be related to the risk of dying. Before
the investigator can claim that a significant difference is caused by the
variable under study, it is his responsibility to produce evidence that
disturbing variables of this type could not have produced the difference.
Of course, the same responsibility rests with the investigator who has done
a controlled experiment. But the device of randomization, and the greater
flexibility which usually prevails in controlled experimentation, make it
easier to ensure against misleading conclusions from disturbing influences.
EXAMPLE 8.10.1-In a study as to whether cancer of the breast tends to "run in
families," Murphy and Abbey (4) investigated the frequency of breast cancer found in rela-
tives of (i) women with breast cancer, (ii) a comparison group of women without breast
cancer. The data below, slightly altered for easy calculation, refer to the mothers of the
subjects.

Breast Cancer in Subject


Yes No Total

Breast Cancer Yes 7 3 10


in Mother No 193 197 390
Total 200 200 400

Calculate χ² and P (i) without correction, (ii) with correction for continuity, for testing the
null hypothesis that the frequency of cancer in mothers is the same in the two classes of
subjects. Ans. (i) χ² = 1.64, P = 0.20; (ii) χ_c² = 0.92, P = 0.34. Note that the correction
for continuity always increases P, that is, makes the difference less significant.

EXAMPLE 8.10.2-In the previous example, verify that the alternative formula
8.10.3 for χ_c² gives the same result, by showing that χ_c² in 8.10.3 comes out as 12/13 = 0.92.
EXAMPLE 8.10.3-Dr. C. H. Richardson has furnished the following numbers of
aphids (Aphis rumicis L.) dead and alive after spraying with two concentrations of solutions
of sodium oleate:

Concentration of Sodium Oleate


(percentage)
0.65 1.10 Total

Dead 55 62 117
Alive 13 3 16
Total 68 65 133
Per Cent Dead 80.9 95.4

Has the higher concentration given a significantly different per cent kill? Ans. χ_c² = 5.31,
P < 0.025.
EXAMPLE 8.10.4-In examining the effects of sprays in the control of codling moth
injury to apples, Hansberry and Richardson (5) counted the wormy apples on each of 48
trees. Two trees sprayed with the same amount of lead arsenate yielded:
A: 2,130 apples, 1,299 or 61% of which were injured
B: 2,190 apples, 1,183 or 54% of which were injured
χ² = 21.16 is conclusive evidence that the chance of injury was different in these two trees.
This result is characteristic of spray experiments. For some unknown reasons, injuries
under identical experimental treatments differ significantly. Hence it is undesirable to
compare sprays on single trees, because a difference in percentage of injured apples might be
due to these unknown sources rather than to the treatments. A statistical determination of
the homogeneity or heterogeneity of experimental material under identical conditions,
sometimes called a test of technique, is often worthwhile, particularly in new fields of research.
EXAMPLE 8.10.5-Prove that formulas 8.10.2 and 8.10.3 for χ_c² are the same, by
showing that

|f − F| = |ad − bc|/N

Σ(1/F) = N³/{(a + b)(c + d)(a + c)(b + d)}
8.11-Test of the independence of two attributes. The preceding test
is sometimes described as a test of the independence of two attributes. A
sample of people of a particular ethnic type might be classified into two
classes according to hair color and also into two classes according to color
of eyes. We might ask: are color of hair and color of eyes independent?
Similarly, the numerical example in the previous section might be re-
ferred to as a test of the question: Is the risk of dying independent of
smoking habit?
In this way of speaking, the word "independent" carries the same
meaning as it does in Rule 3 in the theory of probability. Let p_A be the
probability that a member of a population possesses attribute A, and p_B
the probability that he possesses attribute B. If the attributes are inde-
pendent, the probability that he possesses both attributes is p_A p_B. Thus,
on the null hypothesis of independence, the probabilities in the four cells
of the 2 × 2 contingency table are as follows:

                              Attribute A
                        (1)             (2)
                      Present          Absent          Total

Attribute B
  (1) Present         p_A p_B          q_A p_B         p_B
  (2) Absent          p_A q_B          q_A q_B         q_B

      Total           p_A              q_A             1

Two points emerge from this table. The null hypothesis can be
tested either by comparing the proportions of cases in which B is present
in columns (1) and (2), or by comparing the proportions of cases in which
A is present in rows (1) and (2). These two χ² tests are exactly the same.
This is not obvious from the original expressions (8.10.1) and (8.10.2) given
for χ² and χ_c², but expression (8.10.3) makes it clear that the statement
holds.
Secondly, the table provides a check on the rule given for calculating
the expected number in any cell. In a single sample of size N, we expect
to find Np_A p_B members possessing both A and B. The sample total in
column (1) will be our best estimate of Np_A, while that in row (1) similarly
estimates Np_B. Thus the rule, (column total)(row total)/(grand total),
gives (Np̂_A)(Np̂_B)/N = Np̂_A p̂_B as required.
8.12-A test by means of the normal deviate z. The null hypothesis
can also be tested by computing a normal deviate z, derived from the
normal approximation to the binomial. The z and χ² tests are identical.
Many investigators prefer the z form, because they are primarily interested
in the size of the difference p̂₁ − p̂₂ between the proportions found in two
independent samples. For illustration, we repeat the data from table
8.10.1.

TABLE 8.12.1
MEN CLASSIFIED BY SMOKING HABIT AND MORTALITY IN SIX YEARS

                        Sample (1)          Sample (2)
                        Non-smokers         Pipe Smokers        Total

Dead                    117                 54                  171
Alive                   950                 348                 1,298

Total                   n₁ = 1,067          n₂ = 402            1,469

Proportion dead         p̂₁ = 0.1097         p̂₂ = 0.1343         p̂ = 0.1164

Since p̂₁ = 0.1097 and p̂₂ = 0.1343 are approximately normally dis-
tributed, their difference p̂₁ − p̂₂ is also approximately normally dis-
tributed. The variance of this difference is the sum of the two variances
(section 4.7),

V(p̂₁ − p̂₂) = σ²_{p̂₁} + σ²_{p̂₂} = p₁q₁/n₁ + p₂q₂/n₂

Under the null hypothesis, p₁ = p₂ = p, so that p̂₁ − p̂₂ is approximately
normally distributed with mean 0 and standard error

√{pq(1/n₁ + 1/n₂)}

The null hypothesis does not specify the value of p. As an estimate,
we naturally use p̂ = 0.1164 as given by the combined samples. Hence,
the normal deviate z is

z = (p̂₁ − p̂₂)/√{p̂q̂(1/n₁ + 1/n₂)}

  = (0.1097 − 0.1343)/√{(0.1164)(0.8836)(1/1067 + 1/402)}

  = −0.0246/0.01877 = −1.31
In the normal table, ignoring the sign of z, we find P = 0.19, in agreement
with the value found by the original χ² test.
To correct z for continuity, subtract ½ from the numerator of the
larger proportion (in this case p̂₂) and add ½ to the numerator of the
smaller proportion. Thus, instead of p̂₂ we use 53.5/402 = 0.1331
and instead of p̂₁ we use 117.5/1067 = 0.1101. The denominator of
z_c remains the same, giving z_c = (0.1101 − 0.1331)/0.01877 = −1.225.
You may verify that, apart from rounding errors, z² = χ² and z_c² = χ_c².
If the null hypothesis has been rejected and you wish to find confidence
limits for the population difference p₁ − p₂, the standard error of p̂₁ − p̂₂
should be computed as

√(p̂₁q̂₁/n₁ + p̂₂q̂₂/n₂)

The s.e. given by the null hypothesis is no longer valid. Often the change
is small, but it can be material if n₁ and n₂ are very unequal.
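
The z computation for table 8.12.1 is easy to check by script. This sketch is ours (the helper phi is an assumed name):

```python
import math

def phi(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

r1, n1 = 117, 1067    # dead among non-smokers
r2, n2 = 54, 402      # dead among pipe smokers
p1, p2 = r1 / n1, r2 / n2
p = (r1 + r2) / (n1 + n2)    # pooled estimate under the null hypothesis
se = math.sqrt(p * (1 - p) * (1 / n1 + 1 / n2))

z = (p1 - p2) / se
print(f"z = {z:.2f}, two-tailed P = {2 * phi(-abs(z)):.2f}")  # z = -1.31, P = 0.19

# Continuity correction: move each numerator half a unit toward the other,
# here adding 1/2 to the smaller count r1 and subtracting 1/2 from r2
z_c = ((r1 + 0.5) / n1 - (r2 - 0.5) / n2) / se
print(f"z_c = {z_c:.3f}")    # -1.224 (text: -1.225, from a rounded denominator)
```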
EXAMPLE 8.12.1-Apply the z test and the z_c test to the data on breast cancer given
in example 8.10.1 and verify that z² = χ² and z_c² = χ_c². Note: when calculating z or z_c
it is often more convenient to express p̂₁, p̂₂, and p̂ as percentages. Just remember that in
this event, q = 100 − p.
EXAMPLE 8.12.2-In 1943 a sample of about 1 in 1,000 families in Iowa was asked
about the canning of fruits or vegetables during the preceding season. Of the 392 rural
families, 378 had done canning, while of the 300 urban families, 274 had canned. Calculate
95% confidence limits for the difference in the percentages of rural and urban families who
had canned. Ans. 1.42% and 8.78%.

The preceding χ² and z methods are approximate, the approximation
becoming poorer as the sample size decreases. Fisher (14) has shown
how to compute an exact test of significance. For accurate work the exact
test should be used if (i) the total sample size N is less than 20, or (ii) if N
lies between 20 and 40 and the smallest expected number is less than 5.
For those who encounter these conditions frequently, reference (15),
which gives tables of the exact tests covering these cases, is recommended.

8.13-Sample size for comparing two proportions. The question:
How large a sample do I need? is naturally of great interest to investigators.
For comparing two means, an approach that is often helpful was given
in section 4.13, p. 111. This should be reviewed carefully, since the same
principle applies to the comparison of two proportions. The approach
assumes that it is planned to make a test of significance of the difference
between the two proportions, and that future actions will depend on
whether the test shows a significant difference or not. Consequently, if the
true difference p₂ − p₁ is as large as some amount δ chosen by the in-
vestigator, he would like the test to have a high probability P' of declar-
ing a significant result.
For two independent samples, formula (4.13.1) (p. 113) for n, the size
of each sample, can be applied. Put δ = p₂ − p₁ and σ² = p₁q₁ + p₂q₂.
This gives

n = (Z_α + Z_β)²(p₁q₁ + p₂q₂)/δ²                         (8.13.1)

where Z_α is the normal deviate corresponding to the significance level to
be used in the test, β = 2(1 − P′), and Z_β is the normal deviate correspond-
ing to the two-tailed probability β. Table 4.13.1 gives (Z_α + Z_β)² for the
commonest values of α and β. In using this formula, we substitute the
best advance estimate of (p₁q₁ + p₂q₂) in the numerator.
For instance, suppose that a standard antibiotic has been found to
protect about 50% of experimental animals against a certain disease.
Some new antibiotics become available that seem likely to be superior.
In comparing a new antibiotic with the standard, we would like a prob-
ability P' = 0.9 of finding a significant difference in a one-tailed test
at the 5% level if the new antibiotic will protect 80% of the animals in
the population. For these conditions, table 4.13.1 gives (Z_α + Z_β)² as 8.6.
Hence

n = (8.6){(50)(50) + (80)(20)}/(30)² = 39.2
Thus, 40 animals should be used for each antibiotic.
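
A short sketch of formula 8.13.1 (ours, not the authors'; the deviates 1.645 and 1.282 are the standard normal values for a one-tailed 5% test and P′ = 0.9, whose squared sum, 8.57, is the unrounded version of the 8.6 read from table 4.13.1):

```python
import math

def sample_size(p1, p2, z_alpha=1.645, z_beta=1.282):
    # Formula 8.13.1 with p1, p2 in percent; n is the size of each sample
    delta = p2 - p1
    return (z_alpha + z_beta) ** 2 * (p1 * (100 - p1) + p2 * (100 - p2)) / delta ** 2

n = sample_size(50, 80)
print(math.ceil(n))   # 40, in agreement with the worked example
```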
Some calculations of this type will soon convince you of the sad fact
that large samples are necessary to detect small differences between two
percentages. When resources are limited, it is sometimes wise, before
going ahead with the experiment, to calculate the probability that a sig-
nificant result will be found. Suppose that an experimenter is interested
in the values p₁ = 0.8, p₂ = 0.9, but cannot make n > 100. If formula
(8.13.1) is solved for Z_β, we find

Z_β = (p₂ − p₁)√n/√(p₁q₁ + p₂q₂) − Z_α = (0.1)(10)/0.5 − Z_α = 2 − Z_α

If he intends a two-tailed 5% test, Z_α ≈ 2, so that Z_β ≈ 0. This gives
β = 1 and P′ = 1 − β/2 = 0.5. The proposed experiment has only a
50-50 chance of finding a significant difference in this situation.
Formula (8.13.1), although a large-sample approximation, should be
accurate enough for practical use, since there is usually some uncertainty
about the values of p₁ and p₂ to insert in the formula. Reference (6) gives
tables of n based on a more accurate approximation.
223
EXAMPLE 8.13.1-One difficulty in estimating sample size in biological work is that
the proportions given by a standard treatment may vary over time. An experimenter has
found that his standard treatment has a failure rate lying between p₁ = 30% and p₁ = 40%.
With a new treatment whose failure rate is 20% lower than the standard, what sample sizes
are needed to make P′ = 0.9 in a two-tailed 5% test? Ans. n = 79 when p₁ = 30% and
n = 105 when p₁ = 40%.
EXAMPLE 8.13.2-In planning the 1954 trial of the Salk poliomyelitis vaccine (7),
the question of sample size was critical, since it was unlikely that the trial could be repeated
and since an extremely large sample of children would obviously be necessary. Various esti-
mates of sample size were therefore made. In one of these it was assumed that the probability
that an unprotected child would contract paralytic polio was 0.0003, or 0.03%. If the vaccine
was 50% effective (that is, decreased this probability to 0.00015, or 0.015%), it was desired
to have a 90% chance of finding a 5% significant difference in a two-tailed test. How many
children are required? Ans. 210,000 in each group (vaccinated and unprotected).
EXAMPLE 8.13.3-An investigator has p₁ = 0.4 and usually conducts experiments
with n = 25. In a one-tailed test at the 5% level, what is the chance of obtaining a significant
result if (i) p₂ = 0.5, (ii) p₂ = 0.6? Ans. (i) 0.18, (ii) 0.42.

8.14-The Poisson distribution. As we have seen, the binomial dis-
tribution tends to the normal distribution as n increases for any fixed value
of p. The value of n needed to make the normal approximation a good
one depends on the value of p, this value being smallest when p = 1/2.
For p < 1/2, a general rule, usually conservative, is that the normal ap-
proximation is adequate if the mean μ = np is greater than 15.
In many applications, however, we are studying rare events, so that
even if n is large, the mean np is much less than 15. The binomial distribu-
tion then remains noticeably skew and the normal approximation is un-
satisfactory. A different approximation for such cases was developed by
S. D. Poisson (8). He worked out the limiting form of the binomial dis-
tribution when n tends to infinity and p tends to zero at the same time, in
such a way that μ = np is constant. The binomial expression for the
probability of r successes tends to the simpler form,

P(r) = μ^r e^(−μ)/r!,     r = 0, 1, 2, ...,

where e = 2.71828 is the base of natural logarithms. The initial terms in
the Poisson distribution are:

P(0) = e^(−μ) : P(1) = μe^(−μ) : P(2) = (μ²/2!)e^(−μ) : P(3) = (μ³/3!)e^(−μ)
Table 8.14.1 shows in column (1) the Poisson distribution for μ = 1.
The distribution is markedly skew. The mode (highest frequency) is at
either 0 or 1, these two having the same probability when μ = 1. To give
an idea of the way in which the binomial tends to approach the Poisson,
column (2) shows the binomial distribution for n = 100, p = 0.01, and
column (3) the binomial for n = 25, p = 0.04, both of these having
np = 1. The agreement with the Poisson is very close for n = 100 and
TABLE 8.14.1
THE POISSON DISTRIBUTION FOR μ = 1 COMPARED WITH THE BINOMIAL
DISTRIBUTIONS FOR n = 100, p = 0.01 AND n = 25, p = 0.04

                          Relative Frequencies

          (1)             (2)                     (3)
r         Poisson         Binomial                Binomial
          μ = 1           n = 100, p = 0.01       n = 25, p = 0.04

0         0.3679          0.3660                  0.3604
1         0.3679          0.3697                  0.3754
2         0.1839          0.1849                  0.1877
3         0.0613          0.0610                  0.0600
4         0.0153          0.0149                  0.0137
5         0.0031          0.0029                  0.0024
6         0.0005          0.0005                  0.0003
≥7        0.0001          0.0001                  0.0000

Total     1.0000          1.0000                  0.9999

quite close for n = 25. Tables of individual and cumulative terms of the
Poisson are given in (9) and of individual terms up to μ = 15 in (10).
The fitting of a Poisson distribution to a sample will be illustrated by
the data (11) in table 8.14.2. These show the number of noxious weed
seeds in 98 sub-samples of Phleum pratense (meadow grass). Each sub-
sample weighed 1/4 ounce, and of course contained many seeds, of which
only a small percentage were noxious. The first step is to compute the
sample mean,

μ̂ = (Σfr)/(Σf) = 296/98 = 3.0204 noxious seeds per sub-sample
TABLE 8.14.2
DISTRIBUTION OF NUMBER OF NOXIOUS WEED SEEDS FOUND IN N = 98
SUB-SAMPLES, WITH FITTED POISSON DISTRIBUTION

Number of
Noxious Seeds      Frequency      Poisson              Expected
r                  f              Multipliers          Frequency

0                   3             1     = 1.0000        4.781
1                  17             μ̂     = 3.0204       14.440
2                  26             μ̂/2   = 1.5102       21.807
3                  16             μ̂/3   = 1.0068       21.955
4                  18             μ̂/4   = 0.7551       16.578
5                   9             μ̂/5   = 0.6041       10.015
6                   3             μ̂/6   = 0.5034        5.042
7                   5             μ̂/7   = 0.4315        2.176
8                   0             μ̂/8   = 0.3776        0.822
9                   1             μ̂/9   = 0.3356        0.276
10                  0             μ̂/10  = 0.3020        0.083
11 or more          0             μ̂/11  = 0.2746        0.023

Total              98                                  97.998
225
Next, calculate the successive terms of the Poisson distribution with
mean μ̂. The expected number of sub-samples with 0 seeds is
Ne^(−μ̂) = (98)(e^(−3.0204)). A table of natural logs gives e^(−3.0204) = 1/20.5,
and 98/20.5 = 4.781. Next, form a column of the successive multipliers
1, μ̂, μ̂/2, ... as shown in table 8.14.2, recording each to at least four
significant digits. The expected number of sub-samples with r = 1 is
(4.781)(μ̂) = 14.440. Similarly, the expected number with r = 2 is
(14.440)(μ̂/2) = (14.440)(1.5102) = 21.807, and so on. The agreement be-
tween observed and expected frequencies seems good except perhaps for
r = 2 and r = 3, which have almost equal expected numbers but have ob-
served numbers 26 and 16. A test of the discrepancies between observed
and expected numbers (section 9.6) shows that these can well be accounted
for by sampling errors.
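
The multiplier method is equally convenient in code. The sketch below is ours (the data are those of table 8.14.2, with the sample frequencies entered directly):

```python
import math

counts = {0: 3, 1: 17, 2: 26, 3: 16, 4: 18, 5: 9, 6: 3, 7: 5, 8: 0, 9: 1}
n = sum(counts.values())                          # 98 sub-samples
mu = sum(r * f for r, f in counts.items()) / n    # 296/98 = 3.0204

expected = n * math.exp(-mu)    # expected frequency for r = 0, about 4.781
for r in sorted(counts):
    print(f"r = {r:2d}: observed {counts[r]:2d}, expected {expected:6.3f}")
    expected *= mu / (r + 1)    # successive multiplier, as in table 8.14.2
```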
Two important properties hold for a Poisson variate. The variance
of the distribution is equal to its mean, μ. This would be expected, since
the binomial variance, npq, tends to np when q tends to 1. Secondly, if a
series of independent variates X₁, X₂, X₃, ... each follow Poisson distribu-
tions with means μ₁, μ₂, μ₃, ..., their sum follows a Poisson distribution
with mean (μ₁ + μ₂ + μ₃ + ...).
In the inspection and quality control of manufactured goods, the
proportion of defective articles in a large lot should be small. Conse-
quently, the number of defectives in the lot might be expected to follow a
Poisson distribution. For this reason, the Poisson distribution plays an
important role in the development of plans for inspection and quality
control. Further, the Poisson is often found to serve remarkably well as
an approximation when μ is small, even if the value of n is ill-defined and
if both n and p presumably vary from one sample to another. A much-
quoted example of a good fit of a Poisson distribution, due to Bortke-
witch, is the number of men in a Prussian army corps who were killed
during a year by the kick of a horse. He had N = 200 observations, one
for each of 10 corps for each of 20 years. On any given day, some men
were exposed to a small probability of being kicked, but it is not clear what
value n has, nor that p would be constant.
The Poisson distribution can also be developed by reasoning quite
unrelated to the binomial. Suppose that signals are being transmitted,
and that the probability that a signal reaches a given point in a tiny time-
interval τ is λτ, irrespective of whether previous signals have arrived
recently or not. Then the number of signals arriving in a finite time-
interval of length T may be shown to follow a Poisson distribution with
mean λT (example 8.14.4). Similarly, if particles are distributed at
random in a liquid with density λ per unit volume, the number found in a
sample of volume V is a Poisson variable with mean λV. From these
illustrations it is not surprising that the Poisson distribution has found
applications in many fields, including communications theory and the
estimation of bacterial densities.
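
The signal-arrival result is easy to illustrate by simulation. The sketch below is ours (λ = 2, T = 3 are arbitrary choices); it exploits the fact that in such a process the gaps between successive signals are exponentially distributed with mean 1/λ, and checks that the counts have the mean and variance, both λT, expected of a Poisson variable:

```python
import random

random.seed(1)
lam, T, trials = 2.0, 3.0, 20000   # rate per unit time, interval length

def count_arrivals(lam, T):
    # Gaps between successive signals are exponential with mean 1/lam
    t, count = random.expovariate(lam), 0
    while t <= T:
        count += 1
        t += random.expovariate(lam)
    return count

counts = [count_arrivals(lam, T) for _ in range(trials)]
mean = sum(counts) / trials
var = sum((c - mean) ** 2 for c in counts) / (trials - 1)
# For a Poisson variable, mean and variance should both be close to lam*T = 6
print(f"mean = {mean:.2f}, variance = {var:.2f}")
```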
EXAMPLE 8.14.1-n = 1,000 independent trials are made of an event with probability
0.001 at each trial. Give approximate results for the chances that (i) the event does not
happen, (ii) the event happens twice, (iii) the event happens at least five times. Ans. (i) 0.368,
(ii) 0.184, (iii) 0.0037.
EXAMPLE 8.14.2-A. G. Arbous and J. E. Kerrich (12) report the numbers of acci-
dents sustained during their first year by 155 engine shunters aged 31-35, as follows:

No. of accidents        0       1       2       3       4 or more
No. of men              80      61      13      1       0

Fit a Poisson distribution to these data. Note: the data were obtained as part of a study
of accident proneness. If some men are particularly liable to accidents, this would imply
that the Poisson would not be a good fit, since μ would vary from man to man.
EXAMPLE 8.14.3-Student (13) counted the number of yeast cells on each of 400
squares of a hemacytometer. In two independent samples, each of which gave a satisfactory
fit to a Poisson distribution, the total numbers of cells were 529 and 720. (i) Test whether
these totals are estimates of the same quantity, or in other words whether the density of
yeast cells per square is the same in the two populations. (ii) Compute 95% limits for the
difference in density per square. Ans. (i) z = 5.41, P very small. (ii) 0.30 to 0.65. Note: the
normal approximation to the Poisson distribution, or to the difference between two inde-
pendent Poisson variates, may be used when the observed numbers exceed 15.
EXAMPLE 8.14.4-The Poisson process formula for the number of signals arriving in a
finite time-interval T requires one result in calculus, but is otherwise a simple application of
probability rules. Let P(r, T + τ) denote the probability that exactly r signals have arrived
in the interval from time 0 to the end of time (T + τ). This event can happen in one of two
mutually exclusive ways: (i) (r − 1) signals have arrived by time T, and one arrives in the
small interval τ. The probability of these two events is λτP(r − 1, T). (ii) r signals have
already arrived by time T, and none arrives in the subsequent interval τ. The probability
of these two events is (1 − λτ)P(r, T). The interval τ is assumed so small that more than one
signal cannot arrive in this interval. Hence,

P(r, T + τ) = λτP(r − 1, T) + (1 − λτ)P(r, T)

Rearranging, we have

{P(r, T + τ) − P(r, T)}/τ = λ{P(r − 1, T) − P(r, T)}

Letting τ tend to zero, we get ∂P(r, T)/∂T = λ{P(r − 1, T) − P(r, T)}. By differentiating,
it will be found that P(r, T) = e^(−λT)(λT)^r/r! satisfies this equation.

REFERENCES
1. F. MOSTELLER, R. E. K. ROURKE, and G. B. THOMAS, JR. Probability With Statistical
   Applications. Addison-Wesley, Reading, Mass. (1961).
2. Data made available by Dr. Martin Frobisher.
3. E. W. R. BEST, C. B. WALKER, P. M. BAKER, et al. A Canadian Study on Smoking
   and Health (Final Report). Dept. of National Health and Welfare, Canada (1966).
4. D. P. MURPHY and H. ABBEY. Cancer in Families. Harvard University Press, Cam-
   bridge (1959).
5. T. R. HANSBERRY and C. H. RICHARDSON. Iowa State Coll. J. Sci., 10:27 (1935).
6. W. G. COCHRAN and G. M. COX. Experimental Designs. Wiley, New York, 2nd ed., p.
   17 (1957).
7. T. J. FRANCIS, et al. Evaluation of the 1954 Field Trial of Poliomyelitis Vaccine. Ed-
   wards Bros., Inc., Ann Arbor (1957).
8. S. D. POISSON. Recherches sur la probabilité des jugements. Paris (1837).
9. E. C. MOLINA. Poisson's Exponential Binomial Limit. Van Nostrand, New York
   (1942).
10. E. S. PEARSON and H. O. HARTLEY. Biometrika Tables for Statisticians, Vol. I. Cam-
    bridge University Press, Cambridge, England, 2nd ed. (1966).
11. C. W. LEGGATT. Comptes rendus de l'association internationale d'essais de semences,
    5:27 (1935).
12. A. G. ARBOUS and J. E. KERRICH. Biometrics, 7:340 (1951).
13. "Student." Biometrika, 5:351 (1907).
14. R. A. FISHER. Statistical Methods for Research Workers, §21.02. Oliver and Boyd,
    Edinburgh.
15. D. J. FINNEY, R. LATSCHA, B. M. BENNETT, and P. HSU. Tables for Testing Significance
    in a 2 × 2 Contingency Table. Cambridge University Press, New York (1963).
16. National Bureau of Standards. Tables of the Binomial Probability Distribution. Appl.
    Math. Series 6 (1950).
17. Annals of the Computation Laboratory. Tables of the Cumulative Binomial Probability
    Distribution. Harvard University, Vol. 35 (1955).

CHAPTER NINE

Attribute data with more than
one degree of freedom

9.1-Introduction. In chapter 8 the discussion of attribute data was
confined to the cases in which the population contains only two classes of
individuals and in which only one or two populations have been sampled.
We now extend the discussion to populations classified into more than
two classes, and to samples drawn from more than two populations.
Section 9.2 considers the simplest situation in which the expected numbers
in the classes are completely specified by the null hypothesis.
9.2-Single classifications with more than two classes. In crosses
between two types of maize, Lindstrom (1) found four distinct types of
plants in the second generation. In a sample of 1,301 plants, there were

f₁ = 773 green
f₂ = 231 golden
f₃ = 238 green-striped
f₄ =  59 golden-green-striped
     1301

According to a simple type of Mendelian inheritance, the probabilities
of obtaining these four types of plants are 9/16, 3/16, 3/16, and 1/16,
respectively. We select this as the null hypothesis.
The χ² test in chapter 8 is applicable to any number of classes. Ac-
cordingly, we calculate the numbers of plants that would be expected in
the four classes if the null hypothesis were true. These numbers, and the
deviations (f − F), are shown below.

F₁ = (9/16)(1301) = 731.9        f₁ − F₁ = +41.1
F₂ = (3/16)(1301) = 243.9        f₂ − F₂ = −12.9
F₃ = (3/16)(1301) = 243.9        f₃ − F₃ =  −5.9
F₄ = (1/16)(1301) =  81.3        f₄ − F₄ = −22.3
                   1301.0                    0.0
Substituting in the formula for chi-square,
χ² = Σ(f − F)²/F

χ² = (41.1)²/731.9 + (−12.9)²/243.9 + (−5.9)²/243.9 + (−22.3)²/81.3
   = 2.31 + 0.68 + 0.14 + 6.12
   = 9.25
In a test of this type, the number of degrees of freedom in χ² = (num-
ber of classes) − 1 = 4 − 1 = 3. To remember this rule, note that there
are four deviations, one for each class. However, the sum of the four
deviations, 41.1 − 12.9 − 5.9 − 22.3, is zero. Only three of the devia-
tions can vary at will, the fourth being fixed as zero minus the sum of the
first three.
Is χ² as large as 9.25, with d.f. = 3, a common event in sampling from
the population specified by the null hypothesis 9:3:3:1, or is it a rare one?
For the answer, refer to the χ² table (table A 5, p. 550), in the line for
3 d.f. You will find that 9.25 is beyond the 5% point, near the 2.5% point.
On this evidence the null hypothesis would be rejected.
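
The goodness-of-fit computation takes only a few lines of code. This sketch is ours, applied to the maize data:

```python
observed = [773, 231, 238, 59]
probs = [9/16, 3/16, 3/16, 1/16]
n = sum(observed)   # 1301

expected = [p * n for p in probs]
chi_sq = sum((f - F) ** 2 / F for f, F in zip(observed, expected))
# 9.27 with 3 d.f.; the text's 9.25 uses expected numbers rounded to one decimal
print(f"chi-square = {chi_sq:.2f} with {len(observed) - 1} d.f.")
```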
When there are more than two classes, this χ² test is usually only a
first step in the examination of the data. From the test we have learned
that the deviations between observed and expected numbers are too large
to be reasonably attributed to sampling fluctuations. But the χ² test does
not tell us in what way the observed and expected numbers differ. For
this, we must look at the individual deviations and their contributions to
χ². Note that the first class (green) gives a large positive deviation, +41.1,
and is the only class giving a positive deviation. Among the other classes,
the last class (golden-green-striped) gives the largest deviation, −22.3,
and the largest contribution to χ², 6.12 out of a total of 9.25. Lindstrom
commented that the deviations could be largely explained by a physio-
logical cause, namely the weakened condition of the last three classes due
to their chlorophyll abnormality. He pointed out in particular that the
last class (golden-green-striped) was not very vigorous.
To illustrate the type of subsequent analysis that is often necessary
with more than two classes, let us examine whether the data are consistent
with the weaker hypothesis that the numbers in the first three classes are
in the predicted Mendelian ratios 9:3:3. If so, one interpretation of the
results is that the significant value of χ² can be attributed to poor survivor-
ship of the golden-green-striped class.
The 9:3:3 hypothesis is tested by a χ² test applied to the first three
classes. The calculations appear in table 9.2.1.
In the first class, F₁ = (0.6)(1242) = 745.2, and so on. The value of
χ² is now 2.70, with 3 − 1 = 2 d.f. Table A 5 shows that the probability
is about 0.25 of obtaining a χ² as large as this when there are 2 d.f.
We can also test whether the last class (golden-green-striped) has a
frequency of occurrence significantly less than would be expected from
its Mendelian probability 1/16. For this we observe that 1242 plants fell
TABLE 9.2.1
TEST OF THE MENDELIAN HYPOTHESIS IN THE FIRST THREE CLASSES

                                 Hypothetical
Class             f              Probability        F          f − F      (f − F)²/F

green             773            9/15 = 0.6         745.2      +27.8      1.04
golden            231            3/15 = 0.2         248.4      −17.4      1.22
green-striped     238            3/15 = 0.2         248.4      −10.4      0.44

Total             1242           15/15 = 1          1242.0       0.0      2.70

into the first three classes, which have total probability 15/16, as against
59 plants in the fourth class, with probability 1/16. The corresponding
expected numbers are 1219.7 and 81.3. In this case the χ² test reduces to
that given in section 8.8 for testing a theoretical binomial proportion. We
have

χ² = (1242 − 1219.7)²/1219.7 + (59 − 81.3)²/81.3

   = (+22.3)²/1219.7 + (−22.3)²/81.3 = 6.53,

with 1 d.f. The significance probability is close to the 1% level.
To summarize, the high value of χ² obtained initially, 9.25 with 3
d.f., can be ascribed to a deficiency in the number of golden-green-striped
plants, the other three classes not deviating abnormally from the Men-
delian probabilities. (There may be also, as Lindstrom suggests, some de-
ficiencies in the second and third classes relative to the first class, which
would show up more definitely in a larger sample.)
This device of making comparisons among sub-groups of the classes
is useful in two situations. Sometimes, especially in exploratory work, the
investigator has no clear ideas about the way in which the numbers in the
classes will deviate from the initial null hypothesis; indeed, he may con-
sider it likely that his first χ² test will support the null hypothesis. The
finding of a significant χ² should be followed, as in the above example,
by inspection of the deviations to see what can be learned from them.
This process may lead to the construction of new hypotheses that are
tested by further χ² tests among sub-groups of the classes. Conclusions
drawn from this analysis must be regarded as tentative, because the new
hypotheses were constructed after seeing the data and should be strictly
tested by gathering new data.
In the second situation the investigator has some ideas about the
types of departure that the data are likely to show from the initial null
hypothesis; in other words, about the nature of the alternative hypothesis.
The best procedure is then to construct tests aimed specifically at these
types of departure. Often, the initial χ² test is omitted in this situation.
This approach will be illustrated in later sections.
When calculating χ² with more than 1 d.f., it is not worthwhile to
make a correction for continuity. The exact distribution of χ² is still
discrete, but the number of different possible values of χ² is usually large,
so that the correction, when properly made, produces only a small change
in the significance probability.
EXAMPLE 9.2.1-In 193 pairs of Swedish twins (2), 56 were of type MM (both male),
72 of the type MF (one male, one female), and 65 of the type FF. On the hypothesis that a
twin is equally likely to be a boy or a girl and that the sexes of the two members of a twin
pair are determined independently, the probabilities of MM, MF, and FF pairs are 1/4, 1/2,
1/4, respectively. Compute the value of χ² and the significance probability. Ans. χ² = 13.27,
with 2 d.f. P < 0.005.
EXAMPLE 9.2.2-In the preceding example we would expect the null hypothesis to
be false for two reasons. The probability that a twin is male is not exactly 1/2. This dis-
crepancy produces only minor effects in a sample of size 193. Secondly, identical twins are
always of the same sex. The presence of identical twins decreases the probability of MF
pairs and increases the probabilities of MM and FF pairs. Construct χ² tests to answer the
questions: (i) Are the relative numbers of MM and FF pairs (ignoring the MF pairs) in
agreement with the null hypothesis? (ii) Are the relative numbers of twins of like sex (MM
and FF combined) and unlike sex (MF) in agreement with the null hypothesis? Ans. (i) χ²
(uncorrected) = 0.67, with 1 d.f., P > 0.25; (ii) χ² = 12.44, with 1 d.f., P very small. The
failure of the null hypothesis is due, as anticipated, to an excess of twins of like sex.
EXAMPLE 9.2.3-In section 1.14, 230 samples from binomial distributions with known
p were drawn, and χ² was computed from each sample. The observed and expected numbers
of χ² values in each of seven classes (taken from table 1.14.1) are as follows:

Obs.    57      59      62      32      14      3       3       230
Exp.    57.5    57.5    57.5    34.5    11.5    9.2     2.3     230.0

Test whether the deviations of observed from expected numbers are of a size that occurs
frequently by chance. Ans. χ² = 5.50, d.f. = 6. P about 0.5.

EXAMPLE 9.2.4-In the Lindstrom example in the text, we had χ² (3 d.f.) = 9.25.
This was followed by χ² (2 d.f.) = 2.70, which compared the first three classes, and χ² = 6.53,
which compared the combined first three classes with the fourth class. Note that 2.70 + 6.53
= 9.23, while the initial χ² = 9.25. In examples 9.2.1 and 9.2.2, χ² = 13.27, while the sum of
the two 1-d.f. chi-squares is 0.67 + 12.44 = 13.11. When a classification is divided into sub-groups
and a χ² is computed within each sub-group, plus a χ² which compares the total frequencies
in the sub-groups, the d.f. add up to the d.f. in the initial χ², but the values of χ² do not add
up exactly to the initial χ². They usually add to a value that is fairly close, and worth noting
as a clue to mistakes in calculation.

9.3-Single classifications with equal expectations. Often, the null
hypothesis specifies that all the classes have equal probabilities. In this
case, χ² has a particularly simple form. As before, let f_i denote the ob-
served frequency in the ith class, and let n = Σf_i be the total size of sample.
If there are k classes, the null hypothesis probability that a member of the
population falls into any class is p = 1/k. Consequently, the expected
frequency F_i in any class is np = n/k = f̄, the mean of the f_i. Thus,

χ² = Σ(f_i − f̄)²/f̄,

with (k − 1) d.f.
This test is applied to any new table of random numbers. The basic
property of such a table is that each digit has a probability 1/10 of being
chosen at each draw. To illustrate the test, the frequencies of the first
250 digits in the random number table A 1 are as follows:

Digit       0    1    2    3    4    5    6    7    8    9    Total
            22   24   28   23   18   33   29   17   31   25   250

Only 17 sevens and 18 fours have appeared, as against 31 eights and
33 fives. The mean frequency f̄ = 25. Thus, by the usual shortcut
method of computing the sum of squares of deviations, Σ(f_i − f̄)², given
in section 2.10,

χ² = (1/25)[(22)² + (24)² + ... + (25)² − (250)²/10] = 10.08,

with 9 d.f. Table A 5 shows that the probability of a χ² as large as this
lies between 0.5 and 0.3: χ² is not unusually large.
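
The same arithmetic in code, as a sketch of ours on the digit frequencies above:

```python
freqs = [22, 24, 28, 23, 18, 33, 29, 17, 31, 25]
mean = sum(freqs) / len(freqs)    # 25

chi_sq = sum((f - mean) ** 2 for f in freqs) / mean
print(f"chi-square = {chi_sq:.2f} with {len(freqs) - 1} d.f.")  # 10.08 with 9 d.f.
```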
This test can be related to the Poisson distribution. Suppose that
the f_i are the numbers of occurrences of some rare event in a series of k
independent samples. The null hypothesis is that the f_i all follow Poisson
distributions with the same mean μ. Then, as shown by Fisher, the
quantity Σ(f_i − f̄)²/f̄ is distributed approximately as χ² with (k − 1) d.f.
To go a step further, the test can be interpreted as a comparison of the
observed variance of the f_i with the variance that would be expected from
the Poisson distribution. In the Poisson distribution, the variance equals
the mean μ, of which the sample estimate is f̄. The observed variance
among the f_i is s² = Σ(f_i − f̄)²/(k − 1). Hence

χ² = (k − 1)(observed variance)/(Poisson variance)
This χ² test is sensitive in detecting the alternative hypothesis that
the f_i follow independent Poisson distributions with different means μ_i.
Under this alternative, the expected value of χ² may be shown to be,
approximately,

E(χ²) ≈ (k − 1) + Σ (μ_i − μ̄)²/μ̄,

where μ̄ is the mean of the μ_i. If the null hypothesis holds, μ_i = μ̄ and χ²
has its usual average value (k − 1). But any differences among the μ_i
increase the expected value of χ² and tend to make it large. The test is
sometimes called a variance test of the homogeneity of the Poisson dis-
tribution.
Sometimes the number of Poisson samples k is large. When com-
puting the variance, time may be saved by grouping the observations,
particularly if they take only a limited number of distinct values. To avoid
confusion in our notation, denote the numbers of occurrences by y, in-
stead of f, since we have used f's in previous chapters to denote the fre-
quencies found in a grouped sample. In this notation,

χ² = Σ(y_i − ȳ)²/ȳ = Σf_j(y_j − ȳ)²/ȳ = {Σf_j y_j² − (Σf_j y_j)²/Σf_j}/ȳ

where the second sum is over the m distinct values of y, and f_j is the fre-
quency with which the jth value of y appears in the sample. The d.f. are,
as before, (k − 1).
If the d.f. in χ² lie beyond the range covered in table A 5, calculate
the approximate normal deviate

Z = √(2χ²) − √{2(d.f.) − 1}                              (9.3.1)

The significance probability is read from the normal table, using
only one tail. For an illustration of this case, see examples 9.3.2 and 9.3.3.
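
A one-line check of equation 9.3.1, in a sketch of ours (the chi-square of 145.4 with 97 d.f. is the case met in example 9.3.2 below; phi is an assumed helper name):

```python
import math

def phi(x):
    # Standard normal cumulative distribution function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

chi_sq, df = 145.4, 97
z = math.sqrt(2 * chi_sq) - math.sqrt(2 * df - 1)
print(f"Z = {z:.2f}, P = {1 - phi(z):.4f}")   # Z = 3.16, P = 0.0008
```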
EXAMPLE 9.3.1-In 1951, the numbers of babies born with a harelip in Birmingham,
England, are quoted by Edwards (3) as follows:

Month    Jan. Feb. Mar. Apr. May June July Aug. Sept. Oct. Nov. Dec.
Number    8    19   11   12   16   8    7    5    8     3    8    8

Test the null hypothesis that the probability of a baby with harelip is the same in each month.
Ans. χ² = 23.5, d.f. = 11. P between 0.025 and 0.01. Strictly, the variable that should
be examined in studies of this type is the ratio: (number of babies with harelip)/(total number
of babies born), because even if this ratio is constant from month to month, the actual
number of babies with harelip will vary if the total number born varies. Edwards points out
that in these data the total number varies little and shows no relation to the variation in
number with harelip. He proceeds to fit the above data by a periodic (cosine) curve, which
indicates a maximum in March.
EXAMPLE 9.3.2-Leggatt (4) counted the number of seeds of the weed potentilla
found in 98 quarter-ounce batches of the grass Phleum pratense. The 98 numbers varied
from 0 to 7, and were grouped into the following frequency distribution.

Number of seeds       0    1    2    3    4    5    6    7    Total
Number of batches     37   32   16   9    2    0    1    1    98

Calculate χ² = Σf_j(y_j − ȳ)²/ȳ. Ans. χ² = 145.4, with 97 d.f. From table A 5, with
100 d.f., P is clearly less than 0.005. The high value of χ² is due to the batches with six
and seven seeds.
EXAMPLE 9.3.3-Compute the significance probability in the preceding example by
finding the normal deviate Z given by equation 9.3.1. Ans. Z = 3.16, P = 0.0008. The cor-
rect probability, found from a larger table of χ², is P = 0.0010.

9.4-Additional tests. As in section 9.2, the χ² test for the Poisson
distribution can be supplemented or replaced by other tests directed more
specifically against the type of alternative hypothesis that the investigator
has in mind. If it is desired to examine whether a rare meteorological
event occurs more frequently in the summer months, we might compare
the total frequency in June, July, and August with the total frequency in
the rest of the year, the null hypothesis probabilities being very close to
1/4 and 3/4. If a likely alternative hypothesis is that an event shows a
slow but steady increase or decrease in frequency over a period of nine
years, construct a variate X_i = 1, 2, 3, ..., 9 or alternatively −4, −3,
−2, ..., +3, +4 (making X̄ = 0), to represent the years. The average
change in the f_i per year is estimated by the regression coefficient
Σf_i x_i/Σx_i², where as usual x_i = X_i − X̄. The value of χ² for testing this
coefficient, against the null hypothesis that there is no change, is

χ² = (Σf_i x_i)²/(f̄ Σx_i²),

with 1 d.f.
Another example is found in an experiment designed to investigate
various treatments for the control of cabbage loopers (insect larvae) (5).
Each treatment was tested on four plots. Table 9.4.1 shows, for five of
the treatments, the numbers of loopers counted on each plot. The objec-
tive of the analysis is to examine whether the treatments produced dif-
ferences in the average number of loopers per plot.
TABLE 9.4.1
NUMBER OF LOOPERS ON 50 CABBAGE PLANTS IN A PLOT
(Four plots treated alike; five treatments)

                No. of Loopers      Plot        Plot
Treatment       Per Plot            Total       Mean        χ²          d.f.

1               11, 4, 4, 5         24           6.00        5.67       3
2                6, 4, 3, 6         19           4.75        1.42       3
3                8, 6, 4, 11        29           7.25        3.69       3
4               14, 27, 8, 18       67          16.75       11.39       3
5                7, 4, 9, 14        34           8.50        6.24       3

Total                               173                     28.41       15

Since the sum of a number of independent Poisson variables also
follows a Poisson distribution (section 8.14), we can compare the treat-
ment totals by the Poisson variance test, provided we can adopt the
assumption that the counts on plots treated alike follow the same Poisson
distribution. To test this assumption, the χ² values for each treatment are
computed in table 9.4.1 (second column from the right). Although only
one of the five χ² values is significant at the 5% level, their total, 28.41,
d.f. = 15, gives P of about 0.02. This finding invalidates the use of the
Poisson variance test for the comparison of treatment totals. Some addi-
tional source of variation is present, which must be taken into account
when investigating whether plot means differ from treatment to treat-
ment. Problems of this type, which are common, are handled by the
technique known as the analysis of variance. The analysis of these data
is completed in example 10.3.3, p. 263.
Incidentally, the Poisson variance χ² for comparing the treatment
totals would be computed as

χ² = Σ(Yᵢ − Ȳ)²/Ȳ
   = {(24)² + (19)² + ... + (34)² − (173)²/5}/34.6 = 41.5,

with 4 d.f. The high value of this χ² suggests that the variation between
treatments is substantially greater than the variation within treatments,
the point to be examined in the analysis of variance test.
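
Both variance tests are easily verified by machine. A Python sketch for the counts of table 9.4.1:

    # Within-treatment chi-squares, then the between-treatment chi-square.
    plots = [[11, 4, 4, 5], [6, 4, 3, 6], [8, 6, 4, 11],
             [14, 27, 8, 18], [7, 4, 9, 14]]

    def dispersion_chi2(y):
        m = sum(y) / len(y)
        return sum((v - m) ** 2 for v in y) / m     # len(y) - 1 d.f.

    within = [dispersion_chi2(p) for p in plots]    # 5.67, 1.42, 3.69, 11.39, 6.24
    print(round(sum(within), 2))    # 28.40 with 15 d.f. (28.41 in the table,
                                    # which sums the rounded entries)

    totals = [sum(p) for p in plots]                # 24, 19, 29, 67, 34
    print(round(dispersion_chi2(totals), 1))        # 41.5, 4 d.f.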
EXAMPLE 9.4.1-In section 8.4, random numbers were used to draw 100 samples
from the binomial n = 5, p = 0.2. The observed and expected frequencies (taken from
table 8.4.1) are as follows:

No. of Successes        0       1       2       3      4      5     Total

Observed frequency     32      44      17       6      1      0     100
Expected frequency     32.77   40.96   20.48    5.12   0.64   0.03  100.00

Compute χ² and test whether the deviations can be accounted for by sampling errors.
Ans. χ² = 1.09, d.f. = 3, P about 0.75. (Combine classes 3, 4, 5 before computing χ².)

9.5-The χ² test when the expectations are small. The χ² test is a
large-sample approximation, based on the assumption that the distribu-
tions of the observed numbers fᵢ (or yᵢ) in the classes are not far from
normal. This assumption fails when some or all of the observed numbers
are very small. Historically, the advice most often given was that the
expected number in any class should not be less than 5, and that, if neces-
sary, neighboring classes should be combined to meet this requirement.
Later research, described in (6), showed that this restriction is too strict.
Moreover, the combination of classes weakens the sensitivity of the χ²
test.

We suggest that the χ² test is accurate enough if the smallest expecta-
tion is at least 1, and that classes be combined only to ensure this condition.
This recommendation applies to the χ² tests of single classifications de-
scribed in sections 9.2, 9.3, and 9.4. When counting the d.f. in χ², the
number of classes is the number after any necessary combinations have
been made.

In more extreme cases it is possible to work out the exact distribution
of χ². The probability that fᵢ observations fall in the ith class is given
by the multinomial distribution

{n!/(f₁! f₂! ... fₖ!)} p₁^f₁ p₂^f₂ ... pₖ^fₖ,

where the pᵢ are the probabilities specified by the null hypothesis. This
distribution reduces to the binomial distribution when there are only two
classes. This probability is evaluated, along with the value of χ², for
every possible set of fᵢ with Σfᵢ = n.
When the expectations are equal (section 9.3), Chakravarti and Rao
(7) have tabulated the exact 5% levels of χ² for samples in which n = Σfᵢ
≤ 12 and the number of classes, k, ≤ 100. Our Σfᵢ is their T and our k
is their f. Their tabulated criterion (in their table 1) is our Σfᵢ², which is
equivalent to χ² and quicker to compute.
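
As a sketch of the exact approach, the following Python program handles the equal-expectations case: it enumerates every possible set of fᵢ with Σfᵢ = n and accumulates the multinomial probability of the sets whose χ² equals or exceeds the observed value. The data are a toy example; the enumeration is practical only when n and k are small.

    # Exact P for the chi-square test with k equiprobable classes.
    from itertools import product
    from math import factorial

    def exact_equal_cell_p(obs):
        n, k = sum(obs), len(obs)
        e = n / k                                    # equal expectations
        def chi2(f):
            return sum((fi - e) ** 2 for fi in f) / e
        crit, p = chi2(obs), 0.0
        for f in product(range(n + 1), repeat=k):    # all candidate samples
            if sum(f) == n and chi2(f) >= crit - 1e-9:
                coef = factorial(n)
                for fi in f:
                    coef //= factorial(fi)           # multinomial coefficient
                p += coef / k ** n                   # its probability
        return p

    print(round(exact_equal_cell_p([5, 1, 0]), 4))   # 0.0535 for this toy sample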
EXAMPLE 9.5.1-When 5 dice were tossed 100 times, the observed and expected
numbers of 2's out of 5 were as follows (data from example 1.9.8):

Number of 2's          f           F

     5                 2         0.013
     4                 3         0.322
     3                 3         3.214
     2                18        16.075
     1                42        40.188
     0                32        40.188

   Total             100       100.000

Applying the rule that the smallest expectation should be at least 1, we would combine
classes 5, 4, 3. Verify that this gives χ² = 7.56, d.f. = 3, P slightly above 0.05. Note that
if we combined only the first two classes, this would give χ² = 66.45, d.f. = 4.

9.6-Single classifications with estimated expectations. In sections
9.2 and 9.3, the null hypothesis specified the actual numerical values of
the expectations in the classes. Often the null hypothesis gives these ex-
pectations in terms of one or more population parameters that must be
estimated from the sample. This is so, for instance, in testing whether
the observed frequencies of 0, 1, 2, ... occurrences will fit the successive
terms of a Poisson distribution. Unless the null hypothesis provides the
value of μ, this must be estimated from the sample in order to calculate
the expected frequencies. The estimate of μ is, of course, the sample
mean.
The data of table 8.14.2, to which we have already fitted a Poisson
distribution, serve as an example of the test of goodness of fit. The data
and subsequent calculations appear in table 9.6.1. Having obtained the
expected frequencies, we combine the last four classes (8 or more) so as
to reach an expectation of at least 1. The deviations (f − F) and the
contributions (f − F)²/F to χ² are calculated as usual and given in the
last two columns. We find χ² = 8.26.

The only new step is the rule for counting the number of d.f. in χ²:

d.f. = (No. of classes) − (No. of estimated parameters) − 1

In applying this rule, the number of classes is counted after mak-
ing any combination of classes that is necessary because of small ex-
pectations. Each estimated parameter places one additional restriction on
the sizes of the deviations (f − F). The condition that Σ(f − F) = 0
also reduces the likely size of χ². In this example the number of classes
(after combining) is 9, and one parameter, μ, was estimated in fitting the
TABLE 9.6.1
χ² TEST OF GOODNESS OF FIT OF THE POISSON DISTRIBUTION, APPLIED TO THE NUMBERS
OF NOXIOUS WEED SEEDS FOUND IN 98 BATCHES

No. of          Observed        Expected        Observed − Expected    Contribution
Noxious Seeds   Frequency (f)   Frequency (F)   (f − F)                to χ², (f − F)²/F

0                    3               4.78            −1.78                 0.66
1                   17              14.44            +2.56                 0.45
2                   26              21.81            +4.19                 0.80
3                   16              21.96            −5.96                 1.62
4                   18              16.58            +1.42                 0.12
5                    9              10.02            −1.02                 0.10
6                    3               5.04            −2.04                 0.83
7                    5               2.18            +2.82                 3.65
8 or more            1               1.20            −0.20                 0.03
  (8: 0.82; 9: 0.27; 10: 0.08; 11 or more: 0.03)

Total               98              98.01            −0.01                 8.26

distribution. Hence, there are 9 − 1 − 1 = 7 d.f. The P value lies be-
tween 0.50 and 0.25. The fit is satisfactory.
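
The fitting and testing can be sketched in Python as follows. The mean 296/98 assumes that the single batch with more than 7 seeds contained 9, a value consistent with the expectations of table 9.6.1 and with the answer to example 9.6.4.

    # Goodness of fit of the Poisson distribution (table 9.6.1).
    from math import exp, factorial

    f = [3, 17, 26, 16, 18, 9, 3, 5]      # observed frequencies, 0-7 seeds
    f_tail = 1                            # observed with 8 or more
    n = sum(f) + f_tail                   # 98 batches
    mean = 296 / 98                       # estimated Poisson mean (see above)

    F = [n * exp(-mean) * mean ** k / factorial(k) for k in range(8)]
    F_tail = n - sum(F)                   # expectation for "8 or more"

    chi2 = sum((o - e) ** 2 / e for o, e in zip(f + [f_tail], F + [F_tail]))
    df = 9 - 1 - 1                        # classes - parameters - 1
    print(round(chi2, 2), df)             # 8.28 with 7 d.f. (8.26 in the
                                          # table, which rounds the F's)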
Tests of this kind, in which we compare an observed frequency dis-
tribution with a theoretical distribution like the Poisson, the binomial,
or the normal, are called goodness of fit tests. For the binomial, the d.f.
are 2 less than the number of classes if p is estimated from the data, and 1
less than the number of classes if p is given in advance. With the normal,
both parameters μ and σ are usually estimated, so that we subtract 3 from
the number of classes.
You now have two methods of testing whether a sample follows the
Poisson distribution, the goodness of fit test of this section and the vari-
ance test of section 9.3. If the members of the population actually follow
Poisson distributions with different means, the variance test is more sensi-
tive in detecting this than the goodness of fit test. The goodness of fit
test is a general-purpose test, since any type of difference between the
observed and expected numbers, if present in sufficient force, makes χ²
large. But if something is known about the nature of the alternative
hypothesis, we can often construct a different test that is more powerful
for this type of alternative. The same remarks apply to the binomial
distribution. A variance test for the binomial is given in section 9.8.
EXAMPLE 9.6.1-The numbers of tomato plants attacked by spotted wilt disease
were counted in each of 160 areas of 9 plants (8). In all, 261 plants were diseased out of
9 × 160 = 1440 plants. A binomial distribution with n = 9, p = 261/1440, was fitted to the
distribution of numbers of diseased plants out of 9. The observed and expected numbers are
as follows.

No. of Diseased
Plants                  0      1      2      3      4     5     6     7    Total

Observed frequency     36     48     38     23     10     3     1     1     160
Expected frequency     26.45  52.70  46.67  24.11   8.00  1.77  0.25  0.03  159.98

Perform the χ² goodness of fit test. Ans. χ² = 10.28, with 4 d.f. after combining,
P < 0.05.
EXAMPLE 9.6.2-In a series of trials a set of r successes, preceded and followed by a
failure, is called a run of length r. Thus the series FSFSSSF contains one run of successes
of length 1 and one of length 3. If the probability of a success is p at each trial, the prob-
ability of a run of length r may be shown to be p^(r−1)q. In 207 runs of diseased plants in a field,
the frequency distribution of lengths of run was as follows:

Length of run r            1      2      3      4      5     Total

Observed frequency f_r   164     33      9      1      0       207

The estimate of p from these data is p̂ = (T − N)/T, where N = Σf_r = 207 is the total number
of runs and T = Σrf_r is the total number of successes in these runs. Estimate p; fit the dis-
tribution, called the geometric distribution; and test the fit by χ². Ans. χ² = 0.96 with 2 d.f.,
P > 0.50. Note: the expression (T − N)/T, used for estimating p, is derived from a general
method of estimation known as the method of maximum likelihood, and is not meant to
be obvious. The expected frequency of runs of length r is Np̂^(r−1)q̂.
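
A Python sketch of this fit, taking the expectation of the last cell as the whole tail Np̂³ so that the expectations sum to N:

    # Fitting the geometric distribution to run lengths (example 9.6.2).
    f = [164, 33, 9, 1, 0]                              # runs of length 1..5
    N = sum(f)                                          # 207 runs
    T = sum(r * fr for r, fr in enumerate(f, start=1))  # 261 successes
    p = (T - N) / T                                     # 0.2069
    q = 1 - p

    F = [N * q * p ** r for r in range(3)]              # lengths 1, 2, 3
    F.append(N * p ** 3)                                # length 4 or more
    obs = f[:3] + [f[3] + f[4]]                         # combine so F >= 1

    chi2 = sum((o - e) ** 2 / e for o, e in zip(obs, F))
    print(round(chi2, 2))                               # 0.96, 4 - 1 - 1 = 2 d.f.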

EXAMPLE 9.6.3-In table 3.4.1 (p. 71) a normal distribution was fitted to 511 means
of samples of pig weight gains. Indicate how you would combine classes in making a good-
ness of fit test. How many d.f. does your χ² have? Ans. 17 d.f.

EXAMPLE 9.6.4-Apply the variance test for the Poisson distribution to the data in
table 9.6.1. Ans. χ² = 105.3 with 97 d.f., P > 0.25.

9.7-Two-way classifications. The 2 × C contingency table. We come
now to data classified by two different criteria. The simplest case (the
2 × 2 table), in which each classification has only two classes, was dis-
cussed in chapter 8. The next simplest case occurs when one classifica-
tion has only two classes, the other having C > 2 classes. In the example
in table 9.7.1, leprosy patients were classified at the start of an experiment
according to whether they exhibited little or much infiltration (a mea-
sure of a certain type of skin damage). They were also classified into five
TABLE 9.7.1
196 PATIENTS CLASSIFIED ACCORDING TO CHANGE IN HEALTH AND DEGREE OF INFILTRATION

                                  Change in Health
                          Improvement
Degree of        Marked   Moderate   Slight   Stationary   Worse    Total
Infiltration

Little             11        27        42         53         11       144
Much                7        15        16         13          1        52

Total              18        42        58         66         12       196
classes according to the change in their general health during a subse-
quent 48-week period of treatment (9). The patients did not all receive
the same drugs, but since no differences in the effects of these drugs could
be detected, the data were combined for this analysis. The table is called
a 2 × 5 contingency table.

The question at issue is whether the change in health is related to the
initial degree of infiltration. The χ² test extends naturally to 2 × C tables.
The overall proportion of patients with little infiltration is 144/196. On
the null hypothesis of no relationship between degree of infiltration and
change in health, we expect to find (18)(144)/196 = 13.22 patients with
little infiltration and marked improvement, as against 11 observed. As
before, the rule for finding an expected number is (row total)(column
total)/(grand total). The expected numbers F and the deviations (f − F)
are shown in table 9.7.2. Note that only four expected numbers need be
calculated; the rest can be found by subtraction.

TABLE 9.7.2
EXPECTED NUMBERS AND DEVIATIONS CALCULATED FROM TABLE 9.7.1

                                  Change in Health
                          Improvement
Degree of        Marked   Moderate   Slight   Stationary   Worse    Total
Infiltration

                            Expected numbers, F
Little            13.22     30.86     42.61      48.49      8.82    144.00
Much               4.78     11.14     15.39      17.51      3.18     52.00

Total             18.00     42.00     58.00      66.00     12.00    196.00

                            Deviations, (f − F)
Little            −2.22     −3.86     −0.61      +4.51     +2.18      0.00
Much              +2.22     +3.86     +0.61      −4.51     −2.18      0.00

The value of χ² is

χ² = Σ(f − F)²/F
   = (−2.22)²/13.22 + (+2.22)²/4.78 + ... + (−2.18)²/3.18 = 6.87,

taken over the ten cells in the table. The number of d.f. is (R − 1)(C − 1),
where R, C are the numbers of rows and columns, respectively. In this
example R = 2, C = 5 and we have 4 d.f. This rule for d.f. is in line with
the fact that when four of the deviations in a row are known, all the rest
can be found. With χ² = 6.87, d.f. = 4, the probability lies between 0.25
and 0.10.
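
The expected numbers and χ² can be verified with a few lines of Python:

    # Chi-square for the 2 x 5 contingency table 9.7.1.
    little = [11, 27, 42, 53, 11]
    much   = [ 7, 15, 16, 13,  1]

    rows = [little, much]
    col_tot = [a + b for a, b in zip(little, much)]
    row_tot = [sum(r) for r in rows]
    n = sum(row_tot)                               # 196

    chi2 = 0.0
    for i, r in enumerate(rows):
        for j, f in enumerate(r):
            F = row_tot[i] * col_tot[j] / n        # (row)(column)/(grand total)
            chi2 += (f - F) ** 2 / F

    df = (len(rows) - 1) * (len(col_tot) - 1)      # (R - 1)(C - 1) = 4
    print(round(chi2, 2), df)                      # 6.88 (6.87 above, from
                                                   # rounded deviations)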
Although this test has not rejected the null hypothesis, the devia-
tions show a systematic pattern. In the "much infiltration" class, the ob-
served numbers are higher than expected for patients showing any degree
of improvement, and lower than expected for patients classified as sta-
tionary or worse. The reverse is, of course, true for the "little infiltration"
class. Contrary to the null hypothesis, these deviations suggest that
patients with much infiltration progressed on the whole better than those
with little infiltration. This suggestion will be studied further in section
9.10.
9.8-The variance test for homogeneity of the binomial distribution.
In the preceding example we obtained a 2 × C contingency table because
the data were classified into 2 classes by one criterion and into C classes by
a second criterion. Alternatively, we may have recorded some binomial
variate pᵢ = aᵢ/nᵢ in each of C independent samples, where i goes from
1 to C and nᵢ is the size of the ith sample. The objective now is to examine
whether the true pᵢ vary from sample to sample. Data of this type occur
very frequently.

A quicker method of computing χ² which is particularly appropriate
in this situation was devised by Snedecor and Irwin (10). It will be illus-
trated by the preceding example. Think of the columns in table 9.8.1
as representing C = 5 samples.
TABLE 9.8.1
ALTERNATIVE CALCULATION OF χ² FOR THE DATA IN TABLE 9.7.1

Degree of
Infiltration    Marked   Moderate   Slight   Stationary   Worse     Total

Little            11        27        42         53         11       144
Much (aᵢ)          7        15        16         13          1        52   (A)

Total (nᵢ)        18        42        58         66         12       196   (N)

pᵢ = aᵢ/nᵢ      0.3889    0.3571    0.2759     0.1970     0.0833   0.26531 (p̄)

First calculate the proportion pᵢ = aᵢ/nᵢ of "much infiltration" pa-
tients in each column, and the corresponding overall proportion p̄ = A/N
= 52/196 = 0.26531. Then,

χ² = (Σpᵢaᵢ − p̄A)/p̄q̄
   = {(0.3889)(7) + ... + (0.0833)(1)
     − (0.26531)(52)}/(0.26531)(0.73469)
   = 6.87,                                              (9.8.1)

as before, with 4 d.f.

If pᵢ is the variable of interest, you will want to calculate these values
anyway in order to examine the results. Extra decimals should be carried
to ensure accuracy in computing χ², particularly when the aᵢ are large.
The computations are a little simpler when the pᵢ are derived from the row
with the smaller numbers.

This formula for χ² can be written, alternatively,

χ² = Σnᵢ(pᵢ − p̄)²/p̄q̄                                    (9.8.2)
If the binomial estimates pᵢ are all based on the same sample size n,
χ² becomes

χ² = Σ(pᵢ − p̄)²/(p̄q̄/n) = (C − 1)s_p²/(p̄q̄/n),            (9.8.3)

the sum being taken over the C samples. In this form, χ² is essentially
a comparison of the observed variance s_p² among the pᵢ with the variance
p̄q̄/n that the pᵢ would have if they were independent samples from the
same binomial distribution. The same interpretation can be shown to
apply to expression (9.8.2) for χ². A high value of χ² denotes that the
true proportions differ from sample to sample.
This test, sometimes called the variance test for homogeneity of the
binomial distribution, has many applications. Different investigators
may have estimated the same proportion in different samples, and we
wish to test whether the estimates agree, apart from sampling errors. In
a study of an attribute in human families, where each sample is a family,
a high value of χ² indicates that members of the same family tend to be
alike with regard to this attribute.
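
A Python sketch of expression 9.8.2, applied to the columns of table 9.8.1; it reproduces the contingency-table χ² to within the rounding of the hand computation.

    # Variance test for homogeneity of the binomial (leprosy data).
    a = [7, 15, 16, 13, 1]        # "much infiltration" patients per column
    n = [18, 42, 58, 66, 12]      # column totals

    A, N = sum(a), sum(n)
    pbar = A / N                  # 0.26531
    chi2 = sum(ni * (ai / ni - pbar) ** 2 for ai, ni in zip(a, n)) \
           / (pbar * (1 - pbar))
    print(round(chi2, 2))         # 6.88, with C - 1 = 4 d.f.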
When some of the sample sizes nᵢ are small, some of the expectations
nᵢp̄ and nᵢq̄ will be small. The χ² test can still be used with some expecta-
tions as low as 1, provided that most of the expectations (say 4 out of 5)
are substantially larger. (Recent results (11) suggest that this advice is
conservative.) In some genetic and family studies, all the nᵢ are small.
For this case a good approximation to the significance levels of the exact
χ² distribution has been given by Haldane (12), though the computations
are laborious. When χ² has more than 30 d.f. and the nᵢ are all equal
(= n), the exact χ² is approximately normally distributed with

Mean = (C − 1)N/(N − 1),

where C is the number of samples and N = Cn, and with a variance for
which Haldane (12) gives an exact expression, close to the large-sample
value 2(C − 1) unless n is very small.
When the pᵢ vary from column to column, as indicated by a high
value of χ², the binomial formula √(p̄q̄/N) underestimates the standard
error of the overall proportion p̄ for the combined sample. A more
nearly correct formula (section 17.5) for the standard error of p̄ in this
situation is

s.e.(p̄) = (1/n̄)√{(Σaᵢ² − 2p̄Σaᵢnᵢ + p̄²Σnᵢ²)/C(C − 1)},      (9.8.4)

where C is the number of samples and

pᵢ = aᵢ/nᵢ,   n̄ = N/C
EXAMPLE 9.8.1-Ten samples of 5 mice from the same laboratory were injected with
the same dose of Bact. typhimurium (13). The numbers of mice dying (out of 5) were as
follows: 3, 1, 5, 5, 3, 2, 4, 2, 3, 5. Test whether the proportion dying can be regarded as
constant from sample to sample. Ans. χ² = 18.1, d.f. = 9, P < 0.05. Since the death rate
is found so often to vary within the same laboratory, a standard agent is usually tested along
with each new agent, because comparisons made over time cannot be trusted.
EXAMPLE 9.8.2-Uniform doses of Danysz bacillus were injected into rats, the sizes
of the samples being dictated by the numbers of animals available at the dates of injection.
These sizes, the numbers of surviving rats, and the proportions surviving, are as follows:

Number in sample        40      12      22      11      37      20
Number surviving         9       2       3       1       2       3

Proportion surviving  0.2250  0.1667  0.1364  0.0909  0.0541  0.1500

Test the null hypothesis that the probability of survival is the same in all samples. Ans.
χ² = 4.91, d.f. = 5, P = 0.43.
EXAMPLE 9.8.3-In another test with four samples of inoculated rats, χ² was 6.69,
P = 0.086. Combine the values of χ² for the two tests. Ans. χ² = 11.60, d.f. = 8, P = 0.17.

EXAMPLE 9.8.4-Burnett (14) tried the effect of five storage locations on the viability
of seed corn. In the kitchen garret, 111 kernels germinated among 120 tested; in a closed
toolshed, 55 out of 60; in an open toolshed, 55 out of 60; outdoors, 41 out of 48; and in a
dry garret, 50 out of 60. Calculate χ² = 5.09, d.f. = 4, P = 0.28.
EXAMPLE 9.8.5-In 13 families in Baltimore, the numbers of persons (nᵢ) and the
numbers (aᵢ) who had consulted a doctor during the previous 12 months were as follows:
7, 0; 6, 0; 5, 2; 5, 5; 4, 1; 4, 2; 4, 2; 4, 2; 4, 0; 4, 0; 4, 4; 4, 0; 4, 0. Compute the overall per-
centage who had consulted a doctor and the standard error of the percentage. Note: One
would expect the proportion who had seen a doctor to vary from family to family. Verify
this by finding χ² = 35.6, d.f. = 12, P < 0.005. Consequently, formula 9.8.4 is used to
estimate the s.e. of p̄. Ans. Percentage = 100p̄ = 30.5%, s.e. = 10.5%. (These data were
selected from a large sample for illustration.)

9.9-Further examination of the data. When the initial χ² test shows
a significant value, the remarks made in section 9.2 about further examina-
tion of the data apply here also. Subsequent tests are made that may help
to explain the high value of χ². Frequently, as already remarked, the in-
vestigator proceeds at once to these tests, omitting the initial χ² test as not
informative.

Decker and Andre (15) investigated the effect of a short, sudden ex-
posure to cold on the adult chinch bug. Since experimental insects had
to be gathered in the field, the degree of heterogeneity in the insects was
unknown, and the investigators faced the problem as to whether they
could reproduce their results. Ten adult bugs were placed in each of 50
tubes and exposed for 15 minutes at −8°C. For this illustration the counts
of the numbers dead in the individual tubes were combined at random
into 5 lots of 10 tubes each; that is, into lots of 100 chinch bugs. The
numbers dead were 14, 14, 23, 17, and 20 insects. From these data,
χ² = 4.22, d.f. = 4, P = 0.39. The results are in accord with the hy-
pothesis that every adult bug was subject to the same chance of being
killed by the exposure.
In a second sample of 500 adults, handled in the same manner except
that they were exposed at −9°C., the numbers dead in groups of 100 were
38, 30, 30, 40, 27. The χ² value of 5.79 again verifies the technique,
showing only sampling variation from the estimated mortality of 33%.
The gratifying uniformity in the results leads one to place some con-
fidence in the surprising finding that the death rates at −8°C. and −9°C.
were markedly different. The total numbers dead in the two samples
of 500 were 88 and 165. The result, χ² = 31.37 with d.f. = 1, P less than
0.0002, provides convincing evidence that a rise in mortality with the
lowering of temperature from −8°C. to −9°C. is a characteristic of the
population, not merely an accident of sampling.

The ease of applying a test of experimental technique makes its use
almost a routine procedure except in highly standardized processes. It is
necessary merely to collect the data in several small groups, chosen with
regard to the types of experimental variation thought likely to be present,
instead of in one mass. The additional information may modify conclu-
sions and subsequent procedures profoundly.
In this example the sum of the three values of χ² is 4.22 + 5.79 + 31.37
= 41.38, with 9 d.f. If the initial χ² is calculated from the 2 × 10 con-
tingency table formed by the complete data, its value is also found to be
41.38, with 9 d.f. This agreement between the two values is a fluke, which
does not hold generally in 2 × C tables. For 2 × C and R × C tables, a
method of computing the component parts so that they add to the initial
total χ² is available (16). In these data this method amounts to using the
same denominator p̄q̄ = (0.253)(0.747), calculated from the total mortal-
ity, in finding all χ² values. Instead, for the 4 d.f. χ² at −8°C. we used
p̄q̄ = (0.176)(0.824), appropriate to that part of the data, and at −9°C.
we used p̄q̄ = (0.330)(0.670). The additive χ² values give 3.24 + 6.77
+ 31.37 = 41.38. However, when it has been shown that the mortality
differs at −8°C. and −9°C., use of a pooled p̄ for the individual homo-
geneity tests at −8°C. and −9°C. is invalid. The non-additive method
is recommended, except in a quick preliminary look at the data.
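
A Python sketch of the three χ² values of this section, each homogeneity test using its own p̄q̄ as recommended:

    # Chinch bug mortality: homogeneity within each temperature, then a
    # 2 x 2 comparison of the total mortalities.
    dead_8 = [14, 14, 23, 17, 20]        # dead out of 100, five lots at -8 C
    dead_9 = [38, 30, 30, 40, 27]        # dead out of 100, five lots at -9 C

    def homogeneity(dead, n=100):
        p = sum(dead) / (n * len(dead))
        return sum(n * (d / n - p) ** 2 for d in dead) / (p * (1 - p))

    print(round(homogeneity(dead_8), 2))     # 4.22, 4 d.f.
    print(round(homogeneity(dead_9), 2))     # 5.79, 4 d.f.

    a, b = sum(dead_8), sum(dead_9)          # 88 and 165 dead out of 500 each
    n = 1000
    chi2 = n * (a * (500 - b) - b * (500 - a)) ** 2 \
           / (500 * 500 * (a + b) * (n - a - b))
    print(round(chi2, 2))                    # 31.37, 1 d.f.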

9.10-Ordered classifications. In the leprosy example of section 9.7,
the classes (marked improvement, moderate improvement, slight im-
provement, stationary, worse) are an example of an ordered classification.
Such classifications are common in the study of human behavior and
preferences, and more generally whenever different degrees of some phe-
nomenon can be recognized. The problem of utilizing the knowledge
that we possess about this ordering has attracted considerable attention
in recent years.

With a single classification of Poisson variables, the ordering might
lead us to expect that if the null hypothesis μᵢ = μ does not hold, an alterna-
tive μ₁ ≤ μ₂ ≤ μ₃ ≤ ... should hold, where the subscripts represent the
order. For instance, if working conditions in a factory have been classi-
fied as Excellent, Good, Fair, we might expect that if the number of defec-
tive articles per worker varies with working conditions, the order should
be μ₁ ≤ μ₂ ≤ μ₃. Similarly, with ordered columns in a 2 × C contingency
table, the alternative p₁ ≤ p₂ ≤ p₃ ≤ ... might be expected. χ² tests designed
to detect this type of alternative have been developed by Bartholomew
(17). The computations are quite simple.
Another approach, used by numerous workers (9), (18), (19), is to
attach a score to each class so that an ordered scale is created. To illus-
trate from the leprosy example, we assigned scores of 3, 2, 1, respectively,
to the Marked, Moderate, and Slight Improvement classes, 0 to the Sta-
tionary class, and −1 to the Worse class. These scores are based on the
judgment that the five classes constructed by the expert represent equal
gradations on a continuous scale. We considered giving a score of +4
to the Marked Improvement class and −2 to the Worse class, since the
expert seemed to examine a patient at greater length before assigning him
to one of these extreme classes, but rejected this since our impression may
have been erroneous.
Having assigned the scores we may think of the leprosy data as
consisting of two independent samples of 144 and 52 patients, respec-
tively. (See table 9.10.1.) For each patient we have a discrete measure
X of his change in health, where X takes only the values 3, 2, 1, 0, −1.
We can estimate the average change in health for each sample, with its
standard error, and can test the null hypothesis that this average change is
the same in the two populations. For this test we use the ordinary two-
sample t-test as applied to grouped data. The calculations appear in
table 9.10.1. On the X scale the average change in health is +1.269 for
patients with much infiltration and +0.819 for those with little infiltration.
The difference, D, is 0.450, with standard error ±0.172 (194 d.f.), com-
puted in the usual way. The value of t is 0.450/0.172 = 2.616, with
P < 0.01. Contrary to the initial χ² test, this test reveals a significantly
greater amount of progress for the patients with much infiltration.
The assignment of scores is appropriate when (i) the phenomenon
in question is one that could be measured on a continuous scale if the
instruments of measurement were good enough, and (ii) the ordered classi-
fication can be regarded as a kind of grouping of this continuous scale, or
as an attempt to approximate the continuous scale by a cruder scale that is
the best we can do in the present state of knowledge. The process is
similar to that which occurs in many surveys. The householder is shown
five specific income classes and asked to indicate the class within which
his income falls, without naming his actual income. Some householders
name an incorrect class, just as an expert makes some mistakes in classi-
fication when this is difficult.

The advantage in assigning scores is that the more flexible and power-
ful methods of analysis that have been developed for continuous variables
become available. One can begin to think of the sizes of the average
differences between different groups in a study, and compare the dif-
ference between groups A and B with that between groups E and F.
Regressions of the group means X̄ on a further variable Z can be worked
TABLE 9.10.1
ANALYSIS OF THE LEPROSY DATA BY ASSIGNED SCORES
(Data with assigned scores)

Change in            Infiltration
Health            Little       Much

                   No. of patients
   X                 f           f
   3                11           7
   2                27          15
   1                42          16
   0                53          13
  −1                11           1

Total: Σf          144          52

(Computations)
                  Little       Much
ΣfX                 118          66
X̄ = ΣfX/Σf        0.819       1.269
ΣfX²                260         140
(ΣfX)²/Σf          96.7        83.8

Σfx²              163.3        56.2

d.f.                143          51
s²                1.142       1.102
Pooled s²         1.131

s_D² = (1.131)(1/144 + 1/52) = 0.0296
s_D = 0.172
t = D/s_D = (1.269 − 0.819)/0.172 = 2.616
d.f. = 194, P < 0.01

out. The relative variability of different groups can be examined by
computing s for each group.
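
The computations of table 9.10.1 can be sketched in Python; the result differs from 2.616 only because the hand computation rounds D and s_D to three decimals.

    # Two-sample t-test on assigned scores (grouped data).
    scores = [3, 2, 1, 0, -1]
    little = [11, 27, 42, 53, 11]     # frequencies, n = 144
    much   = [ 7, 15, 16, 13,  1]     # frequencies, n = 52

    def moments(freq):
        n  = sum(freq)
        sx = sum(f * x for f, x in zip(freq, scores))
        ss = sum(f * x * x for f, x in zip(freq, scores)) - sx ** 2 / n
        return n, sx / n, ss          # size, mean, sum of squares about mean

    n1, m1, ss1 = moments(little)     # mean 0.819
    n2, m2, ss2 = moments(much)       # mean 1.269
    s2 = (ss1 + ss2) / (n1 + n2 - 2)  # pooled variance 1.131, 194 d.f.
    t  = (m2 - m1) / (s2 * (1 / n1 + 1 / n2)) ** 0.5
    print(round(t, 3))                # 2.613 (2.616 in the table)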
This approach assumes that the standard methods of analysis of
continuous variables, like the t-test, can be used with an X variable that
is discrete and takes only a few values. As noted in section 5.8 on scales
with limited values, the standard methods appear to work well enough for
practical use. However, heterogeneity of variance and correlation be-
tween s and X̄ are more frequently encountered because of the discrete
scale. If most of the patients in a group show marked improvement,
most of their X's will be 3, and s² will be small. Pooling of variances
should not be undertaken without examining the individual s². In the
leprosy example the two s² were 1.142 and 1.102 (table 9.10.1), and this
difficulty was not present.
The chief objection to the assignment of scores is that the method
is more or less arbitrary. Two investigators may assign different scores
to the same set of data. In our experience, however, moderate differences
between two scoring systems seldom produce marked differences in the
conclusions drawn from the analysis. In the leprosy example, the alterna-
tive scores 4, 2, 1, 0, −2 give t = 2.549 as against t = 2.616 in the analysis
in table 9.10.1. Some classifications present particular difficulty. If the
degrees of injury to persons in accidents are recorded as slight, moderate,
severe, disabling, and fatal, there seems no entirely satisfactory way of
placing the last two classes on the same scale as the first three.

Several alternative principles have been used to construct scores. In
studies of different populations of school children, K. Pearson (20) as-
sumed that the underlying continuous variate was normally distributed
in a standard population of school children. If the classes are regarded as
a grouping of this normal distribution, the class boundaries for the normal
variate are easily found. The score assigned to a class is the mean of the
normal variate within the class. A related approach due to Bross (21) also
uses a standard population but does not assume normality. The score
(ridit) given to a class is the relative frequency up to the midpoint of that
class in the standard population. When the experimental treatments are
different doses of a toxic or protective agent in biological assay, Ipsen (22)
shows how to assign scores so that the resulting variate has a linear regres-
sion on some chosen function of the dose, the ratio of the variance due
to regression to the total variance being maximized. Fisher (23) assigns
scores so as to maximize the F-ratio of treatments to experimental error
as defined in section 10.5. The maximin method of Abelson and Tukey
(24) maximizes the square of the correlation coefficient r between the
assigned scores and the set of true scores, consistent with the investigator's
knowledge about the ordering of the classes, that gives a minimum cor-
relation with the assigned scores. This approach, like Bartholomew's,
avoids any arbitrary assumptions about the nature of the true scale.

EXAMPLE 9.10.1-In the leprosy data, verify the value of t = 2.549 quoted for the
scoring 4, 2, 1, 0, −2.

9.11-Test for a linear trend in proportions. When interest is centered
on the proportions pᵢ in a 2 × C contingency table, there is another way
of viewing the data. Table 9.11.1 shows the leprosy data with the assigned
scores Xᵢ, but in this case the variable that we analyze is pᵢ, the proportion
of patients with much infiltration. The contention now is that if these
patients have fared better than patients with little infiltration, the values
of pᵢ should increase as we move from the Worse class (X = −1) towards
the Marked Improvement class (X = 3).

If this is so, the regression coefficient of pᵢ on Xᵢ should be a good test
criterion. On the null hypothesis (no relation between pᵢ and Xᵢ) each pᵢ
is distributed about the same mean, estimated by p̄, with variance p̄q̄/nᵢ.
The regression coefficient b is calculated as usual, except that each pᵢ
must be weighted by the sample size nᵢ on which it is
TABLE 9.11.1
TESTING A LINEAR REGRESSION OF pᵢ ON THE SCORE (LEPROSY DATA)

                          Improvement
Degree of        Marked   Moderate   Slight   Stationary   Worse     Total
Infiltration

Little             11        27        42         53         11       144
Much (aᵢ)           7        15        16         13          1        52

Total (nᵢ)         18        42        58         66         12       196  (N)

pᵢ = aᵢ/nᵢ       0.3889    0.3571    0.2759     0.1970     0.0833   0.26531 (p̄)

Score Xᵢ            3         2         1          0         −1

based. The numerator and denominator of b are computed as follows:

Num. = Σnᵢ(pᵢ − p̄)(Xᵢ − X̄)
     = ΣnᵢpᵢXᵢ − (Σnᵢpᵢ)(ΣnᵢXᵢ)/Σnᵢ
     = ΣaᵢXᵢ − (Σaᵢ)(ΣnᵢXᵢ)/N
     = 66 − (52)(184)/196 = 66 − 48.82 = 17.18

Den. = ΣnᵢXᵢ² − (ΣnᵢXᵢ)²/N
     = 400 − (184)²/196 = 400 − 172.8 = 227.2

This gives b = 17.18/227.2 = 0.0756. Its standard error is

s_b = √(p̄q̄/Den.) = √{(0.2653)(0.7347)/(227.2)} = 0.0293

The normal deviate for testing the null hypothesis β = 0 is

Z = b/s_b = 0.0756/0.0293 = 2.580,   P = 0.0098.
Although it is not obvious at first sight, it may be shown that this
regression test is essentially the same as the t-test in section 9.10 of the
difference between the mean scores in the Little and Much infiltration
classes. In this example the regression test gave Z = 2.580 while the
t-test gave t = 2.616 (194 d.f.). The difference in results arises because the
two approaches use slightly different large-sample approximations to the
exact distributions of Z and t with these discrete data.
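
A Python sketch of the regression test for these data:

    # Test for a linear trend in proportions (table 9.11.1).
    from math import sqrt

    a = [7, 15, 16, 13, 1]       # "much" cases in each class
    n = [18, 42, 58, 66, 12]     # class totals
    X = [3, 2, 1, 0, -1]         # assigned scores

    A, N = sum(a), sum(n)
    pbar = A / N
    snx  = sum(ni * xi for ni, xi in zip(n, X))                       # 184
    num  = sum(ai * xi for ai, xi in zip(a, X)) - A * snx / N         # 17.18
    den  = sum(ni * xi * xi for ni, xi in zip(n, X)) - snx ** 2 / N   # 227.2
    b  = num / den                                                    # 0.0756
    sb = sqrt(pbar * (1 - pbar) / den)                                # 0.0293
    print(round(b / sb, 2))                                           # Z = 2.58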
EXAMPLE 9.11.1-Armitage (19) quotes the following data by Holmes and Williams
for the relation in children between size of tonsils and the proportion of children who are
carriers of Streptococcus pyogenes in the nose.

                        X = Score Given to Size of Tonsils
Types of Children           0        1        2       Total Children

Carriers (aᵢ)              19       29       24            72  (A)
Non-carriers              497      560      269          1326

Total (nᵢ)                516      589      293          1398  (N)

Carrier-rate (pᵢ)       0.0368   0.0492   0.0819        0.0515 (p̄)

Calculate: (i) the normal deviate Z for testing the linear regression of the proportion of car-
riers on size of tonsils, (ii) the value of t for comparing the difference between the mean size
of tonsils in carriers and non-carriers. Ans. (i) Z = 2.681, (ii) t = 2.686, with 1396 d.f.
EXAMPLE 9.11.2-When the regression of pᵢ on Xᵢ is used as a test criterion, it is of
interest to examine whether the regression is linear. Armitage (19) shows that this can be
done by first computing χ² = Σnᵢ(pᵢ − p̄)²/p̄q̄ = {Σaᵢpᵢ − A²/N}/p̄q̄. This χ², with (C − 1)
d.f., measures the total variation among the C values of pᵢ. The χ² for linear regression, with
1 d.f., is found by squaring Z, since the square of a normal deviate has a χ² distribution with
1 d.f. The difference, χ²(C − 1 d.f.) − χ²(1 d.f.), is a χ² with (C − 2) d.f. for testing the deviations of the
pᵢ from their linear regression on the Xᵢ. Compute this χ² for the data in example 9.11.1.
Ans. The total χ² is 7.85 with 2 d.f., while Z² is 7.19 with 1 d.f. Thus the χ² for the devia-
tions is 0.66 with 1 d.f., in agreement with the hypothesis of linearity.

9.12-Heterogeneity χ² in testing Mendelian ratios. It is often ad-
visable to collect data in several small samples rather than in a single large
one. An example is furnished by some experiments on chlorophyll in-
heritance in maize (1), reported in table 9.12.1. The series consisted of
11 samples of progenies of heterozygous green plants, self-fertilized, segre-
gating into dominant green plants and recessive yellow plants. The hypo-
thetical ratio is 3 green to 1 yellow. We shall study the proportion of
yellow, theoretically 1/4.
TABLE 9.12.1
NUMBER OF YELLOW SEEDLINGS IN 11 SAMPLES OF MAIZE

No. in Sample     No. Yellow     Proportion Yellow
     nᵢ               aᵢ                pᵢ

    122               24              0.1967
    149               39              0.2617
     86               18              0.2093
     55               13              0.2364
     71               17              0.2394
    179               38              0.2123
    150               30              0.2000
     36                9              0.2500
     91               21              0.2308
     53               14              0.2642
    111               26              0.2342

 N = 1103          A = 249         p̄ = 0.22575

Heterogeneity χ² (10 d.f.):
  χ² = (Σaᵢpᵢ − Ap̄)/p̄q̄ = (0.5779)/(0.2258)(0.7742) = 3.31
Pooled χ² (1 d.f.):
  χ²_p = (|A − Np| − ½)²/Npq
       = {|249 − 275.75| − ½}²/(1103)(0.25)(0.75) = 3.33

The data may fail to satisfy the simple Mendelian hypothesis in two
ways. First, there may be real differences among the pᵢ (proportion of
yellow) in different samples. This finding points to some additional source
of variability that must be explained before the data can be used as a
crucial test of the Mendelian ratio. Second, the pᵢ may agree with one
another (apart from sampling errors) but their overall proportion p̄ may
disagree with the Mendelian proportion p. The reason may be linkage or
crossing-over, or differential robustness in the dominant and recessive
plants.

The first point is examined by applying to the pᵢ the variance test for
homogeneity of the binomial distribution (section 9.8). The value of χ²,
shown under table 9.12.1, is 3.31, with 10 d.f., P about 0.97. The test
gives no reason to suspect real differences among the pᵢ. We therefore
pool the samples and compare the overall ratio, p̄ = 0.22575, with the
hypothetical p = 0.25, by the χ² test for a binomial proportion (section
8.8). We find χ² (corrected for continuity) = 3.33, P about 0.07. There
is a hint of a deficiency of the recessive yellows.
In showing the relation between these two tests, the following alge-
braic identity is of interest:

Σnᵢ(pᵢ − p)²/pq = (Σnᵢ)(p̄ − p)²/pq + Σnᵢ(pᵢ − p̄)²/pq,     (9.12.1)

each sum being taken over the C samples. The quantity nᵢ(pᵢ − p)²/pq
measures the discrepancy between the ob-
served pᵢ in the ith sample and the theoretical value p. If the null hypothe-
sis is true, this quantity is distributed as χ² with 1 d.f., and the sum of these
quantities over the C samples (left side of equation 9.12.1) is distributed
as χ² with C d.f. The first term on the right of (9.12.1) compares the pooled
ratio p̄ with p, and is distributed as χ² with 1 d.f. The second term on the
right measures the deviations of the pᵢ from their own pooled mean p̄,
and is distributed as χ² with (C − 1) d.f. To sum up, the total χ² on the
left, with C d.f., splits into a χ² with 1 d.f., which compares the pooled
sample p̄ and the theoretical p, and a heterogeneity χ², with (C − 1) d.f.,
which compares the pᵢ among themselves. These χ² distributions are of
course followed only approximately unless the nᵢ are large.

In practice, this additive feature is less useful. Unless the pooled
sample is large, a correction for continuity in the 1 d.f. for the pooled χ²
is advisable. This destroys the additivity. Secondly, the expression for
the heterogeneity χ² assumes that the theoretical ratio p applies in these
data. If there is doubt on this point, the heterogeneity χ² should be
calculated, as in table 9.12.1, with p̄q̄ in the denominator instead of pq.
In this form the heterogeneity χ² involves no assumption that p̄ = p
(apart from sampling errors).
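
Both tests of table 9.12.1 can be sketched in Python; with full decimals the heterogeneity value comes out 3.32 against the table's 3.31, which uses four-decimal pᵢ.

    # Heterogeneity and pooled chi-squares for the maize ratios.
    n = [122, 149, 86, 55, 71, 179, 150, 36, 91, 53, 111]
    a = [24, 39, 18, 13, 17, 38, 30, 9, 21, 14, 26]

    A, N = sum(a), sum(n)                        # 249, 1103
    pbar = A / N                                 # 0.22575
    het = sum(ni * (ai / ni - pbar) ** 2 for ai, ni in zip(a, n)) \
          / (pbar * (1 - pbar))                  # 10 d.f.

    p = 0.25                                     # hypothetical proportion
    pooled = (abs(A - N * p) - 0.5) ** 2 / (N * p * (1 - p))   # 1 d.f.
    print(round(het, 2), round(pooled, 2))       # 3.32  3.33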

EXAMPLE 9.12.1-From a population expected to segregate 1:1, four samples with
the following ratios were drawn: 47:33, 40:26, 30:42, 24:34. Note the discrepancies
among the sample ratios. Although the pooled χ² does not indicate any unusual departure
from the theoretical ratio, you will find a large heterogeneity χ², equal to 9.01, P = 0.03, for
which some explanation should be sought.
EXAMPLE 9.12.2-Fisher (25) applied χ² tests to the experiments conducted by
Mendel in 1863 to test different aspects of his theory, as follows:

Experiment             χ²      d.f.

Trifactorial           8.94     17
Bifactorial            2.81      8
Gametic ratios         3.67     15
Repeated 2:1 test      0.13      1

Show that in random sampling the probability of obtaining a total χ² lower than that ob-
served is less than 0.005 (use the χ² table). More accurately, the probability is less than 1 in
2000. Thus, the agreement of the results with Mendel's laws looks too good to be true.
Fisher gives an interesting discussion of possible reasons.

9.13-The R × C table. If each member of a sample is classified by
one characteristic into R classes, and by a second characteristic into C
classes, the data may be presented in a table with R rows and C columns.
The entry in any of the RC cells is the number of members of the sample
falling into that cell. Strand and Jessen (26) classified a random sample
of farms in Audubon County, Iowa, into three classes (Owned, Rented,
Mixed), according to the tenure status, and into three classes (I, II, III),
according to the level of the soil fertility (table 9.13.1).

TABLE 9.13.1
NUMBERS OF FARMS ON THREE SOIL FERTILITY GROUPS IN AUDUBON COUNTY, IOWA,
CLASSIFIED ACCORDING TO TENURE

Soil               Owned     Rented     Mixed     Total

I       f           36        67         49        152
        F           36.75     62.92      52.33
        f − F       −0.75     +4.08      −3.33

II      f           31        60         49        140
        F           33.85     57.95      48.20
        f − F       −2.85     +2.05      +0.80

III     f           58        87         80        225
        F           54.40     93.13      77.47
        f − F       +3.60     −6.13      +2.53

Total              125       214        178        517

χ² = Σ(f − F)²/F = (−0.75)²/36.75 + ... + (2.53)²/77.47 = 1.54,
d.f. = (R − 1)(C − 1) = 4

Before drawing conclusions about the border totals for tenure status,
this question is asked: Are the relative numbers of Owned, Rented, and
Mixed farms in this county the same at the three levels of soil fertility?
This question might alternatively be phrased: Is the distribution of the
soil fertility levels the same for Owned, Rented, and Mixed farms? (If a
little reflection does not make it clear that these two questions are equiva-
lent, see example 9.13.1.) Sometimes the question is put more succinctly
as: Is tenure status independent of fertility level?

The χ² test for the 2 × C table extends naturally to this situation.
As before,

χ² = Σ(f − F)²/F,

where f is the observed frequency in any cell and F the frequency expected
if the null hypothesis of independence holds.

As before, the expected frequency for any cell is computed from the
border totals in the corresponding row and column:

F = (row total)(column total)/n = {(row total)/n}(column total)
Examples: For the first row,

(row total)/n = 152/517 = 0.29400

F₁ = (0.29400)(125) = 36.75
F₂ = (0.29400)(214) = 62.92
F₃ = (0.29400)(178) = 52.33

This procedure makes the computation easy with a calculating machine.
For verification, notice that (i) the sum of the F in any row or column is
equal to the observed total, and consequently (ii) the sum of the deviations
in each row and in each column is zero.

The facts just stated dictate the number of degrees of freedom. One
is free to put R − 1 expected frequencies in a column, but the remain-
ing cell is then fixed as the column total minus the sum of the R − 1 values
of F. Similarly, when we have inserted expected frequencies in this way
in (C − 1) columns, the expected frequencies in the last column are fixed.
Therefore, d.f. = (R − 1)(C − 1).

The calculation of χ² is given in the table. Since P > 0.8, the null
hypothesis is not rejected. If you do not need to examine the contributions
of the individual cells to χ², up to half the time in computation can be
saved by a shortcut devised by P. H. Leslie (27). This is especially useful
if many tables are to be calculated.
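
Where a computer is at hand, a library routine gives the same result. The sketch below assumes the scipy package, which is not part of the text's methods:

    # Checking table 9.13.1 with scipy.
    from scipy.stats import chi2_contingency

    farms = [[36, 67, 49],
             [31, 60, 49],
             [58, 87, 80]]
    chi2, p, df, expected = chi2_contingency(farms)
    print(round(chi2, 2), df, round(p, 2))      # 1.54  4  P about 0.82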
When χ² is significant, the next step is to study the nature of the de-
parture from independence in more detail. Examination of the cells in
which the contribution to χ² is greatest, taking note of the signs of the
deviations (f − F), furnishes clues, but these are hard to interpret because
the deviations in different cells are correlated. Computation of the per-
centage distribution of the row classification within each column, fol-
lowed by a scrutiny of the changes from column to column, may be more
informative. Further χ² tests may help. For instance, if the percentage
distribution of the row classification appears the same in two columns, a
χ² test for these two columns may confirm this. The two columns can
then be combined for comparison with other columns. Examples 9.13.2,
3, 4, 5 illustrate this approach.
EXAMPLE 9.13.1-Show that if the expected distribution of the column classification
is the same in every row, then the expected distribution of the row classification is the same
in every column. For the ith row, let Fᵢ₁, Fᵢ₂, ... F_iC be the expected numbers in the respec-
tive columns. Let Fᵢ₂ = a₂Fᵢ₁, Fᵢ₃ = a₃Fᵢ₁, ... F_iC = a_C Fᵢ₁. Then the numbers a₂, a₃, ... a_C
must be the same in every row, since the expected distribution of the column classification
is the same in every row. Now the expected row distribution in the first column is F₁₁,
F₂₁, ... F_R1. In the second column it is F₁₂ = a₂F₁₁, F₂₂ = a₂F₂₁, ... F_R2 = a₂F_R1. Since
a₂ is a constant multiplier, this is the same distribution as in the first column, and similarly
for any other column.
EXAMPLE 9.13.2-In a study of the relation between blood type and disease, large
samples of patients with peptic ulcer, patients with gastric cancer, and control persons free
from these diseases were classified as to blood type (O, A, B, AB). In this example, the
relatively small numbers of AB patients were omitted for simplicity. The observed numbers
are as follows:

Blood Type    Peptic Ulcer    Gastric Cancer    Controls    Totals

O                  983              383           2892       4258
A                  679              416           2625       3720
B                  134               84            570        788

Totals            1796              883           6087       8766

Compute χ² to test the null hypothesis that the distribution of blood types is the same for
the three samples. Ans. χ² = 40.54, 4 d.f., P very small.
EXAMPLE 9.13.3-To examine this question further, compute the percentage dis-
tribution of blood types for each sample, as shown below.

Blood Type    Peptic Ulcer    Gastric Cancer    Controls

O                 54.7             43.4            47.5
A                 37.8             47.1            43.1
B                  7.5              9.5             9.4

Totals           100.0            100.0           100.0

This suggests (i) there is little difference between the blood type distributions for gastric
cancer patients and controls, (ii) peptic ulcer patients differ principally in having an excess of
patients of type O. Going back to the frequencies in example 9.13.2, test the hypothesis
that the blood type distribution is the same for gastric cancer patients and controls. Ans.
χ² = 5.64 (2 d.f.), P about 0.06.

EXAMPLE 9.13.4-Combine the gastric cancer and control samples. Test (i) whether
the distribution of A and B types is the same in this combined sample as in the peptic ulcer
sample (omit the O types). Ans. χ² = 0.68 (1 d.f.), P > 0.1. (ii) Test whether the proportion
of O types versus A + B types is the same for the combined sample as for the peptic ulcer
sample. Ans. χ² = 34.29 (1 d.f.), P very small. To sum up, the high value of the original
4 d.f. χ² is due primarily to an excess of O types among the peptic ulcer patients.
EXAMPLE 9.13.5-The preceding χ² tests may be summarized as follows:

Comparison                                                d.f.      χ²

O, A, B types in gastric cancer (g) and controls (c)        2      5.64
A, B types in peptic ulcer and combined (g, c)              1      0.68
O, A and B types in peptic ulcer and combined (g, c)        1     34.29

Total                                                       4     40.61

The total χ², 40.61, is close to the original χ², 40.54, because we have broken down the original
4 d.f. into a series of independent operations that account for all 4 d.f. The difference be-
tween 40.61 and 40.54, however, is not just a rounding error; the two quantities differ a little
algebraically.

9.14-Sets of 2 × 2 tables. Sometimes the task is to combine the
evidence from a number of 2 × 2 tables. The same two treatments or
types of subject may have been compared in different studies, and it is
desired to summarize the combined data. Alternatively, the results of a
single investigation are often subclassified by the levels of a factor or
variable that is thought to influence the results. The data in table 9.14.1,
made available by Dr. Martha Rogers (in 9), are of this type.

The data form part of a study of the possible relationship between
complications of pregnancy of mothers and behavior problems in children.
The comparison is between mothers of children in Baltimore schools who
had been referred by their teachers as behavior problems and mothers of
control children not so referred. For each mother it was recorded whether
TABLE 9.14.1
A SET OF THREE 2 × 2 TABLES: NUMBERS OF MOTHERS WITH PREVIOUS INFANT LOSSES

                               No. of Mothers with:
Birth    Type of
Order    Children     Losses    No Losses    Total          % Loss        χ² (1 d.f.)

2        Problems       20          82       102 = n₁₁     19.6 = p₁₁
         Controls       10          54        64 = n₁₂     15.6 = p₁₂
         Total          30         136       166 = n₁      18.1 = p̄₁        0.42

3-4      Problems       26          41        67 = n₂₁     38.8 = p₂₁
         Controls       16          30        46 = n₂₂     34.8 = p₂₂
         Total          42          71       113 = n₂      37.2 = p̄₂        0.19

5+       Problems       27          22        49 = n₃₁     55.1 = p₃₁
         Controls       14          23        37 = n₃₂     37.8 = p₃₂
         Total          41          45        86 = n₃      47.7 = p̄₃        2.52
she had suffered any infant losses (e.g., stillbirths) prior to the birth of the
child. Since these loss rates increase with the birth order of the child, as
table 9.14.1 shows, and since the two samples might not be comparable
in the distributions of birth orders, the data were examined separately for
three birth-order classes. This is a common type of precaution.

Each of the three 2 × 2 tables is first inspected separately. None of
the χ² values in a single table, shown at the right, approaches the 5% sig-
nificance level. Note, however, that in all three tables the percentage of
mothers with previous losses is higher in the problem children than in the
controls. We seek a test sensitive in detecting a population difference
that is consistently in one direction, although it may not show up clearly
in the individual tables.

A simple method is to compute χ (the square root of χ²) in each table.
Give any χᵢ the same sign as the difference dᵢ = pᵢ₁ − pᵢ₂, and add the
χᵢ values. From table 9.14.1,

χ₁ + χ₂ + χ₃ = +0.650 + 0.436 + 1.587 = +2.673,

each χᵢ being + because all the differences are +.

Under H₀, any χᵢ is a standard normal deviate; hence, the sum of the
three χ's is a normal deviate with S.D. = √3. The test criterion is Σχᵢ/√g,
where g is the number of tables. In this case we have 2.673/√3 = 1.54.
In the normal table, the two-tailed P value is just above 0.12. For this
test the χ's should not be corrected for continuity.
This test is satisfactory if (i) the nᵢ do not vary from table to table by
more than a ratio of 2 to 1, and (ii) the p̄ᵢ are in the range 20% to 75%.
If the nᵢ vary greatly, this test gives too much weight to the small tables,
which have relatively poor power to reveal a falsity in the N.H. If the
p's in some tables are close to zero or 100%, while others are around 50%,
the population differences δᵢ are likely to be related to the level of the pᵢⱼ.
Suppose that we are comparing the proportions of cases in which body
injury is suffered in auto accidents by seat-belt wearers and non-wearers.
The accidents have been classified by severity of impact into mild, mod-
erate, severe, extreme, giving four 2 × 2 tables. Under the mild impacts,
both pᵢ₁ and pᵢ₂ may be small and δᵢ also small, since injury rarely occurs
with mild impact. Under extreme impact, pᵢ₁ and pᵢ₂ may both be close
to 100%, making δᵢ also small. The large δ's may occur in the two
middle tables where the p's are nearer 50%.
In applications of this type, two mathematical models have been
used to describe how δᵢ may be expected to change as pᵢ₂ changes. One
model supposes that the difference between the two populations is con-
stant on a logit scale. The logit of a proportion p is log_e(p/q). A constant
difference on the logit scale means that log_e(pᵢ₁/qᵢ₁) − log_e(pᵢ₂/qᵢ₂) is
constant as pᵢ₂ varies. The second model postulates that the difference is
constant on a normal deviate (Z) scale. The value of Z corresponding to
any proportion p is such that the area of a standard normal curve to the
left of Z is p. For instance, Z = 0 for p = 0.5, Z = 1.282 for p = 0.9,
Z = −1.282 for p = 0.1.

To illustrate the meaning of a constant difference on these trans-
formed scales, table 9.14.2 shows the size of difference on the original
percentage scale that corresponds to a constant difference on (a) the logit
scale, (b) the normal deviate scale. The size of the difference was chosen
to equal 20% at p₂ = 50%. Note that (i) the differences diminish towards
both ends of the p scale as in the seat belts example, (ii) the two transforma-
tions do not differ greatly.
TABLE 9.14.2
SIZE OF DIFFERENCE δ = p₁ − p₂ FOR A RANGE OF VALUES OF p₂

p₂%               1     5     10     30     50     70     90     95     99

Constant logit   1.3   6.0   10.6   20.0   20.0   14.5    5.5    2.8    0.6
Constant Z       2.6   8.1   12.4   20.0   20.0   15.3    6.4    3.5    0.8

A test that gives appropriate weight to tables with large nᵢ and is
sensitive if differences are constant on a logit or a Z scale was developed
by Cochran (9). If p̄ᵢ is the combined proportion in the ith table, and

wᵢ = nᵢ₁nᵢ₂/nᵢ,   dᵢ = pᵢ₁ − pᵢ₂,

we compute the normal deviate

Z = Σwᵢdᵢ/√(Σwᵢp̄ᵢq̄ᵢ)

and refer to the normal table. For the data in table 9.14.1 the computa-
tions are as follows (with the dᵢ in proportions to keep the numbers
smaller).

Birth
Order      wᵢ        dᵢ        wᵢdᵢ       p̄ᵢ       p̄ᵢq̄ᵢ      wᵢp̄ᵢq̄ᵢ

2         39.3     +0.040     +1.57     0.181     0.1482      5.824
3-4       27.3     +0.040     +1.09     0.372     0.2336      6.377
5+        21.1     +0.173     +3.65     0.477     0.2494      5.262

Sum                           +6.31                          17.463

The test criterion is 6.31/√(17.463) = 1.51. This agrees closely with
the value 1.54 found by the Σχ test, for which these tables are quite suitable.

There is another way of computing this test. In the ith table, let Oᵢ
be the observed number of Problems losses and Eᵢ the expected number
under H₀. For birth order 2 (table 9.14.1), O₁ = 20, E₁ = (30)(102)/166
TABLE 9.14.3
THE MANTEL-HAENSZEL TEST FOR THE INFANT LOSS DATA IN TABLE 9.14.1

Birth Order      Oᵢ        Eᵢ       nᵢ₁nᵢ₂cᵢ₁cᵢ₂/nᵢ²(nᵢ − 1)

2                20       18.43             5.858
3-4              26       24.90             6.426
5+               27       23.36             5.321

Sum              73       66.69            17.605

Z = (73 − 66.69 − ½)/√(17.605) = 1.38

= 18.43. Then (O₁ − E₁) = +1.57, which is the same as w₁d₁. This re-
sult may be shown by algebra to hold in any 2 × 2 table. The criterion can
therefore be written

Σ(Oᵢ − Eᵢ)/√(Σwᵢp̄ᵢq̄ᵢ)

This form of the test has been presented by Mantel and Haenszel
(28, 29), with two refinements that are worthwhile when the n's are small.
First, the variance of wᵢdᵢ or (Oᵢ − Eᵢ) on H₀ is not wᵢp̄ᵢq̄ᵢ, but the slightly
larger quantity nᵢ₁nᵢ₂p̄ᵢq̄ᵢ/(nᵢ₁ + nᵢ₂ − 1). If the margins of the 2 × 2 table
are nᵢ₁, nᵢ₂, cᵢ₁, and cᵢ₂, this variance can be computed as

nᵢ₁nᵢ₂cᵢ₁cᵢ₂/nᵢ²(nᵢ − 1),   (nᵢ = nᵢ₁ + nᵢ₂),

a form that is convenient in small tables.

Secondly, a correction for continuity can be applied by subtracting
1/2 from the absolute value of Σ(Oᵢ − Eᵢ). This version of the test is
shown in table 9.14.3. The correction for continuity makes a noticeable
difference even with samples of this size.
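
A Python sketch of the Mantel-Haenszel computation of table 9.14.3, with the continuity correction:

    # Each table is (losses, no losses) for Problems, then Controls, mothers.
    from math import sqrt

    tables = [((20, 82), (10, 54)),      # birth order 2
              ((26, 41), (16, 30)),      # birth order 3-4
              ((27, 22), (14, 23))]      # birth order 5+

    num, var = 0.0, 0.0
    for (a, b), (c, d) in tables:
        n1, n2 = a + b, c + d            # row margins
        c1, c2 = a + c, b + d            # column margins
        n = n1 + n2
        num += a - n1 * c1 / n                        # O - E
        var += n1 * n2 * c1 * c2 / (n * n * (n - 1))  # exact variance

    z = (abs(num) - 0.5) / sqrt(var)     # continuity-corrected deviate
    print(round(z, 2))                   # 1.38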
The analysis of proportions is discussed further in sections 16.8-16.12.

REFERENCES
1. E. W. LINDSTROM. Cornell Agric. Exp. Sta. Memoir 13 (1918).
2. A. W. F. EDWARDS. Ann. Hum. Gen., 24:309 (1960).
3. J. H. EDWARDS. Ann. Hum. Gen., 25:89 (1961).
4. C. W. LEGGATT. Comptes rendus de l'association internationale d'essais de semences,
5:27 (1935).
5. D. J. CAFFREY and C. E. SMITH. Bureau of Entomology and Plant Quarantine, USDA
(Baton Rouge) (1934).
6. W. G. COCHRAN. Ann. Math. Statist., 23:315 (1952).
7. I. M. CHAKRAVARTI and C. R. RAO. Sankhya, 21:315 (1959).
8. W. G. COCHRAN. J. R. Statist. Soc. Suppl., 3:49 (1936).
9. W. G. COCHRAN. Biometrics, 10:417 (1954).
10. G. W. SNEDECOR and M. R. IRWIN. Iowa State Coll. J. Sci., 8:75 (1933).
11. R. C. LEWONTIN and J. FELSENSTEIN. Biometrics, 21:19 (1965).
12. J. B. S. HALDANE. Biometrika, 33:234 (1943-46).
13. J. O. IRWIN and E. A. CHEESEMAN. J. R. Statist. Soc. Suppl., 6:174 (1939).
14. L. C. BURNETT. M.S. Thesis, Iowa State College (1906).
15. G. C. DECKER and F. ANDRE. Iowa State J. Sci., 10:403 (1936).
16. A. W. KIMBALL. Biometrics, 10:452 (1954).
17. D. J. BARTHOLOMEW. Biometrika, 46:328 (1959).
18. F. YATES. Biometrika, 35:176 (1948).
19. P. ARMITAGE. Biometrics, 11:375 (1955).
20. K. PEARSON. Biometrika, 5:105 (1905-06).
21. I. D. J. BROSS. Biometrics, 14:18 (1958).
22. J. IPSEN. Biometrics, 11:465 (1955).
23. R. A. FISHER. Statistical Methods for Research Workers. Oliver and Boyd, Edin-
burgh (1941).
24. R. P. ABELSON and J. W. TUKEY. Proc. Soc. Statist. Sect., Amer. Statist. Ass. (1959).
25. R. A. FISHER. Ann. Sci., 1:117 (1936).
26. N. V. STRAND and R. J. JESSEN. Iowa Agr. Exp. Stat. Res. Bul. 315 (1943).
27. P. H. LESLIE. Biometrics, 7:283 (1951).
28. N. MANTEL and W. HAENSZEL. J. Nat. Cancer Inst., 22:719 (1959).
29. N. MANTEL. J. Amer. Statist. Ass., 58:690 (1963).
CHAPTER TEN

One-way classifications.
Analysis of variance
IO.I-Extension from two samples to many. Statistical methods for


two independent samples were presented in chapter 4, but the needs of the
investigator, are seldom confined to the comparison of two samples only.
For attribute data, the extension to more than two samples was made in
the preceding chapter. We are now ready to do the same for measure-
ment data,
First, recall the analysis used in the comparison of two samples. In
the numerical example (section 4.9, p. 102), the comb weights of two
samples of 11 chicks were compared, one sample having received sex
hormone A, the other sex hormone C. Briefly, the principal steps in the
analysis were as follows: (i) the mean comb weights X̄₁, X̄₂ were computed,
(ii) the within-sample sum of squares of deviations Σx², with 10 d.f.,
was found for each sample, (iii) a pooled estimate s² of the within-sample
variance was obtained by adding the two values of Σx² and dividing by
the sum of the d.f., 20, (iv) the standard error of the mean difference,
X̄₁ − X̄₂, was calculated as √(2s²/n), where n = 11 is the size of each
sample, (v) finally, a test of the null hypothesis μ₁ = μ₂ and confidence
limits for μ₁ − μ₂ were given by the result that the quantity

{(X̄₁ − X̄₂) − (μ₁ − μ₂)}/√(2s²/n)

follows the t-distribution with 20 d.f.
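
For readers who compute by machine, these five steps can be written out directly. The following Python sketch is illustrative only (plain standard-library code, not part of the original computation):

    import math

    def two_sample_t(x1, x2):
        # steps (i)-(v) for two independent samples of equal size n
        n = len(x1)
        m1, m2 = sum(x1) / n, sum(x2) / n           # (i) the two means
        ss1 = sum((x - m1) ** 2 for x in x1)        # (ii) within-sample
        ss2 = sum((x - m2) ** 2 for x in x2)        #      sums of squares
        s2 = (ss1 + ss2) / (2 * (n - 1))            # (iii) pooled s^2
        se = math.sqrt(2 * s2 / n)                  # (iv) s.e. of difference
        return (m1 - m2) / se                       # (v) refer to t, 2(n-1) d.f.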
In the next section we apply this method to an experiment with four
treatments, i.e., four independent samples.
10.2-An experiment with four samples. During cooking, doughnuts
absorb fat in various amounts. Lowe (1) wished to learn if the amount
absorbed depends on the type of fat used. For each of four fats, six
batches of doughnuts were prepared, a batch consisting of 24 doughnuts.
The data in table 10.2.1 are the grams of fat absorbed per batch, coded by
deducting 100 grams to give simpler figures. Data of this kind are called
a single or one-way classification, each fat representing one class.
Before beginning the analysis, note that the totals for the four fats
differ substantially, from 372 for fat 4 to 510 for fat 2. Indeed, there is a
TABLE 10.2.1
GRAMS OF FAT ABSORBED PER BATCH (MINUS 100 GRAMS)

Fat             1         2         3         4       Total

               64        78        75        55
               72        91        93        66
               68        97        78        49
               77        82        71        64
               56        85        63        70
               95        77        76        68

ΣX            432       510       456       372      1,770 = G
X̄              72        85        76        62        295
ΣX²        31,994    43,652    35,144    23,402    134,192
(ΣX)²/n    31,104    43,350    34,656    23,064    132,174
Σx²           890       302       488       338      2,018
d.f.            5         5         5         5         20

Pooled s² = 2,018/20 = 100.9

s(X̄ᵢ − X̄ⱼ) = √(2s²/n) = √{(2)(100.9)/6} = 5.80

clear separation between the individual results for fats 4 and 2, the highest
value given by fat 4 being 70, while the lowest for fat 2 is 77. Every other
pair of samples, however, shows some overlap.
Proceeding as in the case of two samples, we calculate for each sample
the mean X̄ and the sum of squares of deviations Σx², as shown under
table 10.2.1. We then form a pooled estimate s² of the within-sample
variance. Since each sample provides 5 d.f. for Σx², the pooled s² = 100.9
has 20 d.f. This pooling involves, of course, the assumption that the vari-
ance between batches is the same for each fat. The standard error of the
mean for any fat is √(s²/6) = 4.10 grams.
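
These quantities are easily verified by machine. A Python sketch (illustrative only) reproduces the figures under table 10.2.1:

    fats = {1: [64, 72, 68, 77, 56, 95], 2: [78, 91, 97, 82, 85, 77],
            3: [75, 93, 78, 71, 63, 76], 4: [55, 66, 49, 64, 70, 68]}

    def ss_dev(xs):                 # sum of squares of deviations
        m = sum(xs) / len(xs)
        return sum((x - m) ** 2 for x in xs)

    within = [ss_dev(v) for v in fats.values()]     # 890, 302, 488, 338
    df = sum(len(v) - 1 for v in fats.values())     # 20
    s2 = sum(within) / df                           # pooled s^2 = 100.9
    se_mean = (s2 / 6) ** 0.5                       # 4.10 grams
    se_diff = (2 * s2 / 6) ** 0.5                   # 5.80 grams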
Thus far, the only new problem is that there are four means to com-
pare instead of two. The comparisons that are of interest are not neces-
sarily confined to the differences X̄ᵢ − X̄ⱼ between pairs of means: their
exact nature will depend on the questions that the experiment is intended
to answer. For instance, if fats 1 and 2 were animal fats and fats 3 and 4
vegetable fats, we might be particularly interested in the difference
(X̄₁ + X̄₂)/2 − (X̄₃ + X̄₄)/2. A rule for making planned comparisons of
this nature is outlined in section 10.7, with further discussion in sections
10.8, 10.9.
Before considering the comparison of means, we present an alterna-
tive method of doing the preliminary calculations in this section. This
method, of great utility and flexibility, is known as the analysis of variance
and was developed by Fisher in the 1920's. The analysis of variance per-
forms two functions:
1. It is an elegant and slightly quicker way of computing the pooled
s². In a single classification this advantage in speed is minor, but in the

more complex classifications studied later, the analysis of variance is
the only simple and reliable method of determining the appropriate
pooled error variance s².
2. It provides a new test, the F-test. This is a single test of the null
hypothesis that the population means μ₁, μ₂, μ₃, μ₄ for the four fats are
identical. This test is often useful in a preliminary inspection of the results
and has many subsequent applications.
EXAMPLE 10.2.1-Here are some data selected for easy computation. Calculate the
pooled s² and state how many d.f. it has.

              Sample number
        1        2        3        4

       11       13       21       10
        4        9       18        4
        6       14       15       19

Ans. s² = 21.5, with 8 d.f.

10.3-The analysis of variance. In the doughnut example, suppose
for a moment that there are no differences between the average amounts
absorbed for the four fats. In this situation, all 24 observations are dis-
tributed about a common mean μ with variance σ².
The analysis of variance develops from the fact that we can make
three different estimates of σ² from the data in table 10.2.1. Since we are
assuming that all 24 observations come from the same population, we
can compute the total sum of squares of deviations for the 24 observations
as

64² + 72² + 68² + ... + 70² + 68² − (1770)²/24
= 134,192 − 130,538 = 3,654     (10.3.1)

This sum of squares has 23 d.f. The mean square, 3,654/23 = 158.9, is
the first estimate of σ².
The second estimate is the pooled s² already obtained. Within each
fat, we computed the sum of squares between batches (890, 302, etc.),
each with 5 d.f. These sums of squares were added to give

890 + 302 + 488 + 338 = 2,018     (10.3.2)

This quantity is called the sum of squares between batches within fats, or
more concisely the sum of squares within fats. The sum of squares is
divided by its d.f., 20, to give the second estimate, s² = 2,018/20 = 100.9.
For the third estimate, consider the means for the four fats, 72, 85,
76, and 62. These are also estimates of μ, but have variances σ²/6, since
they are means of samples of 6. Their sum of squares of deviations is

72² + 85² + 76² + 62² − (295)²/4 = 272.75

with 3 d.f. The mean square, 272.75/3, is an estimate of σ²/6. Conse-
quently, if we multiply by 6, we have the third estimate of σ². We shall
accomplish this by multiplying the sum of squares by 6, giving

6{72² + 85² + 76² + 62² − (295)²/4} = 1,636     (10.3.3)

the mean square being 1,636/3 = 545.3.


Since the total for any fat is six times the fat means, this sum of squares
can be computed from the fat totals as

432' + 510' + 456' + 372' (l77W


6 24
= 132,174 - 130,538 = 1636 (10.3.4)

To verify this alternative form of calculation, note that 432'/6 = (6 x 72)'/6


= 6(72)', while (1770)'/24 = (6 x 295)'/24 = 6(295)'/4. This sum of
squares is called the sum of squares beMeen fats.
Now list the d.f. and the sums of squares in (10.3.3), (10.3.2), and
(10.3.1) as follows:

Source of Variation              Degrees of Freedom    Sum of Squares

Between fats                              3                 1,636
Between batches within fats              20                 2,018

Total                                    23                 3,654

Notice a new and important result: the d.f. and the sums of squares for
the two components (between fats and within fats) add to the correspond-
ing total figures. These results hold in any single classification. The
result for the d.f. is not hard to verify. With a classes and n observations
per class, the d.f. are (a − 1) for Between fats, a(n − 1) for Within fats,
and (an − 1) for the total. But

(a − 1) + a(n − 1) = a − 1 + an − a = an − 1

The result for the sums of squares follows from an algebraic identity
(example 10.3.5). Because of this relation, the standard practice in the
analysis of variance is to compute only the total sum of squares and the
sum of squares Between fats. The sum of squares Within fats, leading to
the pooled s², is obtained by subtraction.
Table 10.3.1 shows the usual analysis of variance table for the dough-
nut data, with general computing instructions for a classes (fats) with n
observations per class. The symbol T denotes a typical class total, while
G = ΣT = ΣΣX (summed over both rows and columns) is the grand total.
The first step is to calculate the correction for the mean,
C = G²/an = (1770)²/24 = 130,538

This is done because C occurs both in formula (10.3.1) for the total sum
of squares and in formula (10.3.4) for the sum of squares between fats.
The remaining steps should be clear from table 10.3.1.
TABLE 10.3.1
FORMULAS FOR CALCULATING THE ANALYSIS OF VARIANCE TABLE
(ILLUSTRATED BY THE DOUGHNUT DATA)

Source of Variation       Degrees of Freedom    Sum of Squares       Mean Square

Between classes (fats)    a − 1 = 3             ΣT²/n − C = 1,636       545.3
Within classes (fats)     a(n − 1) = 20         Subtract  = 2,018       100.9

Total                     an − 1 = 23           ΣΣX² − C  = 3,654

Since the analysis of variance table is unfamiliar at first, the beginner
should work a number of examples. The role of the mean square between
fats, which is needed for the F-test, is explained in the next section.
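
As one such exercise, the computing instructions of table 10.3.1 can be written out in full. The Python sketch below (illustrative only) reproduces the doughnut analysis; C is carried to 130,537.5 here, whereas the text rounds it to 130,538, so the sums of squares come out as 3,654.5 and 1,636.5 rather than 3,654 and 1,636:

    data = [[64, 72, 68, 77, 56, 95], [78, 91, 97, 82, 85, 77],
            [75, 93, 78, 71, 63, 76], [55, 66, 49, 64, 70, 68]]
    a, n = len(data), len(data[0])
    G = sum(sum(row) for row in data)                        # 1,770
    C = G ** 2 / (a * n)                                     # 130,537.5
    total_ss = sum(x * x for row in data for x in row) - C   # total S.S.
    between_ss = sum(sum(row) ** 2 for row in data) / n - C  # between fats
    within_ss = total_ss - between_ss                        # by subtraction
    between_ms = between_ss / (a - 1)                        # 545.5 (545.3 in text)
    within_ms = within_ss / (a * (n - 1))                    # 100.9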
EXAMPLE 10.3.1-From the formulas in table 10.3.1, compute the analysis of variance
for the simple data in example 10.2.1. Verify that you obtain 21.5 for the pooled s², as
found by the method of example 10.2.1.

Source of Variation    d.f.    Sum of Squares    Mean Square

Between samples          3          186              62.0
Within samples           8          172              21.5

Total                   11          358              32.5

EXAMPLE 10.3.2-As part of a larger experiment (2), three levels of vitamin B₁₂ were
compared, each level being fed to three different pigs. The average daily gains in weight of
the pigs (up to 75 lbs. live weight) were as follows:

    Level of B₁₂ (mg./lb. ration)

      5        10        20

    1.52      1.63      1.44
    1.56      1.57      1.52
    1.54      1.54      1.63

Analyze the variance as follows:

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square

Between levels                 2                 0.0042           0.0021
Within levels                  6                 0.0232           0.0039

Total                          8                 0.0274           0.0034

Hint: If you subtract 1.00 from each gain (or 1.44 if you prefer it) you will save time. Sub-
traction of a common figure from every observation does not alter any of the results in the
analysis of variance table.
EXAMPLE 10.3.3-In table 9.4.1 there were recorded the numbers of loopers (insect
larvae) on 50 cabbage plants per plot after the application of five treatments to each of four
plots. The numbers were:

            Treatment
      1     2     3     4     5

     11     6     8    14     7
      4     4     6    27     4
      4     3     4     8     9
      5     6    11    18    14

With counts like these, there is some question whether the assumptions required for the
analysis of variance are valid. But for illustration, analyze the variance as follows:

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square

Between treatments             4                 359.30           89.82
Within treatments             15                 311.25           20.75

Total                         19                 670.55

EXAMPLE 10.3.4-The percentage of clean wool in seven bags was estimated by
taking three batches at random from each bag. The percentages of clean wool in the batches
were as follows:

                        Bag Number
      1       2       3       4       5       6       7

    41.8    33.0    38.5    43.7    34.2    32.6    36.2
    38.9    37.5    35.9    38.9    38.6    38.4    33.4
    36.1    33.1    33.9    36.3    40.2    34.8    37.9

Calculate the mean squares for bags (11.11) and batches within bags (8.22).

EXAMPLE 10.3.5-To prove the result that the sums of squares within and between
classes add to the total sum of squares, we use a notation that has become common for this
type of data. Let Xᵢⱼ be the observation for the jth member of the ith class, Xᵢ. the total
of the ith class, and X.. the grand total.
The sum of squares within the ith class is

ΣⱼXᵢⱼ² − Xᵢ.²/n,     (j = 1 ... n)

On adding this quantity over all classes to get the numerator of the pooled s², we obtain,
for the sum of squares within classes,

ΣᵢΣⱼXᵢⱼ² − ΣᵢXᵢ.²/n,     (i = 1 ... a)     (1)

The sum of squares between classes is computed as

ΣᵢXᵢ.²/n − X..²/an     (2)

The sum of (1) and (2) gives

ΣᵢΣⱼXᵢⱼ² − X..²/an

But this is the total sum of squares of deviations from the overall mean.

10.4-Effect of differences between the population means. If the
population means for the four fats are identical, we have seen that the
mean square between fats, 545.3, and the mean square within fats, 100.9,
are both estimates of the population variance σ². What happens when the
population means are different? In order to illustrate from a simple
example in which you can easily verify the calculations, we drew (using
a table of random normal deviates) six observations normally distributed
with population mean μ = 5 and σ = 1. These were arranged in three
sets of two observations, to simulate an experiment with a = 3 treatments
and n = 2 observations per treatment.

TABLE 10.4.1
A SIMULATED EXPERIMENT WITH THREE TREATMENTS AND
TWO OBSERVATIONS PER TREATMENT

        Data                            Analysis of Variance

Case I. μ₁ = μ₂ = μ₃ = 5                            d.f.    S.S.     M.S.

     Treatment                  Treatments            2     1.66     0.83
   1     2     3                Error                 3     3.37     1.12
  4.6   3.3   6.3               Total                 5     5.03
  5.2   4.7   4.2
  9.8   8.0  10.5

Case II. μ₁ = 4, μ₂ = 5, μ₃ = 7                     d.f.    S.S.     M.S.

     Treatment                  Treatments            2    14.53     7.26
   1     2     3                Error                 3     3.37     1.12
  3.6   3.3   8.3               Total                 5    17.90
  4.2   4.7   6.2
  7.8   8.0  14.5

Case III. μ₁ = 3, μ₂ = 5, μ₃ = 9                    d.f.    S.S.     M.S.

     Treatment                  Treatments            2    46.06    23.03
   1     2     3                Error                 3     3.37     1.12
  2.6   3.3  10.3               Total                 5    49.43
  3.2   4.7   8.2
  5.8   8.0  18.5

The data and the analysis of variance appear as Case I at the top of
table 10.4.1. In the analysis of variance table, the Between classes sum of
squares is labeled Treatments, and the Within classes sum of squares is
labeled Error. This terminology is common in planned experiments.
The mean squares, 0.83 for Treatments and 1.12 for Error, are both
estimates of σ² = 1.
In Case II we subtracted 1 from each observation for treatment 1
and added 2 to each observation for treatment 3. This simulates an ex-
periment with real differences in the effects of the treatments, the popula-
tion means being μ₁ = 4, μ₂ = 5, μ₃ = 7. In the analysis of variance,
notice that the Error sum of squares and mean square are unchanged.
This should not be surprising, because the Error S.S. is the pooled Σx²
within treatments, and subtracting any constant from all the observations
in a treatment has no effect on Σx². The Treatments mean square has,
however, increased from 0.83 in Case I to 7.26 in Case II.
Case III represents an experiment with larger differences between
treatments. Each original observation for treatment 1 was reduced by 2,
and each observation for treatment 3 was increased by 4. The means are
now μ₁ = 3, μ₂ = 5, μ₃ = 9. As before, the Error mean square is un-
changed. The Treatments mean square has increased to 23.03. Note
that the samples for the three treatments have now moved apart, so that
there is no overlap.
When the means μᵢ differ, it can be proved that the Treatments mean
square is an unbiased estimate of

σ² + nΣ(μᵢ − μ̄)²/(a − 1)     (10.4.1)

the sum running over i = 1 ... a.
In Case II, with μᵢ = 4, 5, 7, Σ(μᵢ − μ̄)² is 4.67, while n and (a − 1) are
both 2 and σ² = 1, so that (10.4.1) becomes 1 + 4.67 = 5.67. Thus the
Treatments mean square, 7.26, is an unbiased estimate of 5.67. If we drew
a large number of samples and calculated the Treatments mean square for
Case II for each sample, their average should be close to 5.67.
In Case III, Σ(μᵢ − μ̄)² is 18.67, so that the Treatments mean square,
23.03, is an estimate of the population value 19.67.
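
Result (10.4.1) can be illustrated by simulation. The Python sketch below (illustrative; it redraws Case II many times, whereas table 10.4.1 shows a single drawing) averages the Treatments mean square over repeated samples:

    import random

    mus, n, reps = (4.0, 5.0, 7.0), 2, 100000     # Case II, with sigma = 1
    total = 0.0
    for _ in range(reps):
        means = [sum(random.gauss(m, 1.0) for _ in range(n)) / n for m in mus]
        grand = sum(means) / len(mus)
        total += n * sum((x - grand) ** 2 for x in means) / (len(mus) - 1)
    print(total / reps)        # close to 1 + 4.67 = 5.67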
10.5-The variance ratio, F. These results suggest that the quantity

F = Treatments mean square/Error mean square
  = Mean square between classes/Mean square within classes

should be a good criterion for testing the null hypothesis that the popula-
tion means are the same in all classes. The value of F should be around
1 when the null hypothesis holds, and should become large when the μᵢ
differ substantially. The distribution was first tabulated by Fisher in the
form z = logₑ√F. In honor of Fisher, the criterion was named F by
Snedecor (3). Fisher and Yates (4) designate F as the variance ratio.
In Case I, F is 0.83/1.12 = 0.74. In Case II, F increases to 7.26/1.12
= 6.48, and in Case III to 23.03/1.12 = 20.56. When you have learned
how to read the F-table, you will find that in Case II, F, which has 2 and
3 degrees of freedom, is significant at the 10% level but not at the 5% level.
In Case III, F is significant at the 5% level.
To give some idea of the distribution of F when the null hypothesis
holds, a sampling experiment was conducted. Sets of 100 observations
were drawn at random from the table of pig gains (table 3.2.1, p. 67),
which simulates a normal population with μ = 30, σ = 10. Each set was
divided into a = 10 classes, each with n = 10 observations. The F ratio
therefore has 9 d.f. in the numerator and 90 d.f. in the denominator.

TABLE 10.5.1
DISTRIBUTION OF F IN 100 SAMPLES FROM TABLE 3.2.1
(Degrees of freedom 9 and 90)

Class Interval    Frequency        Class Interval    Frequency

0.00-0.24             7            1.50-1.74             5
0.25-0.49            16            1.75-1.99             2
0.50-0.74            16            2.00-2.24             4
0.75-0.99            26            2.25-2.49             2
1.00-1.24            11            2.50-2.74             2
1.25-1.49             8            2.75-2.99             1

Table 10.5.1 displays the sampling distribution of 100 values of F.
One notices first the skewness: a concentration of small values and a long
tail of larger values. Next, observe that 65 of the F are less than 1. If
you remember that both terms of the ratio are estimates of σ², you may
be surprised that 1 is not the median. The mean, calculated as with
grouped data, is 0.96; the theoretical mean is slightly greater than 1.
Finally, 5% of the values lie beyond 2.25 and 1% beyond 2.75, so that these
points are estimates of the 5% and 1% levels of the theoretical distribution.
Table A 14, Part I, contains the theoretical 5% and 1% points of F for
convenient combinations of degrees of freedom. Across the top of the
table is found f₁, the degrees of freedom corresponding to the number of treat-
ments (classes): f₁ = a − 1. At the left is f₂, the degrees of freedom for
individuals, a(n − 1). Since the F-table is extensively used, table A 14,
Part II, gives the 25%, 10%, 2.5%, and 0.5% levels.
To find the 5% and 1% points for the sampling experiment, look in the
column headed by f₁ = 9 and down to the rows f₂ = 80 and 100. The re-
quired points are 1.98 and 2.62, halfway between those in the table. To be
compared with these are the points experimentally obtained in table 10.5.1,
2.25 and 2.75; not bad estimates from a sample of 100 experiments. In
order to check the sampling distribution more exactly, we went back to
the original calculations and found 8% of the sample F's beyond the 5%
point and 2% beyond the 1%. This gives some idea of the variation to be
encountered in sampling.
For the doughnut experiment, the hypothesis set up, that the batches
are random samples from populations with the same μ, may be judged by
means of table A 14. From the analysis of variance in table 10.3.1,

F = 545.3/100.9 = 5.40

For f₁ = 3 and f₂ = 20, the 1% point in the new table is 4.94. Thus from
the distribution specified in the hypothesis there is less than one chance
in 100 of drawing a sample having a larger value of F. Evidently the
samples come from populations with different μ's. The conclusion is that
the fats have different capabilities for being absorbed by doughnuts.
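
Where a computer is at hand, the tail probability of F can be found directly instead of from table A 14. A Python sketch (assuming the scipy library is available):

    from scipy import stats

    F = 545.3 / 100.9                  # 5.40, with f1 = 3 and f2 = 20
    print(stats.f.isf(0.01, 3, 20))    # 4.94, the tabular 1% point
    print(stats.f.sf(F, 3, 20))        # about 0.007: less than 1 chance in 100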
EXAMPLE 10.5.1-Four tropical feedstuffs were each fed to a lot of 5 baby chicks (9).
The gains in weight were:

Lot 1      55     49     42     21     52
    2      61    112     30     89     63
    3      42     97     81     95     92
    4     169    137    169     85    154

Analyze the variance and test the equality of the μ. Ans. Mean squares: (i) lots, 8,745;
(ii) chicks within lots, 722. F = 12.1. Since the sample F is far beyond the tabular 1% point,
there is little doubt that the feedstuff populations have different μ's.
EXAMPLE 10.5.2-In the wool data of example 10.3.4, test the hypothesis that the
bags are all from populations with a common mean. Ans. F = 1.35, F₀.₀₅ = 2.85. There
is not strong evidence against the hypothesis; the bags may all have the same percentage of
clean wool.
EXAMPLE 10.5.3-In the vitamin B₁₂ experiment of example 10.3.2, the mean gains
for the three levels differ less than is to be expected from the mean square within levels.
Although there is no reason for computing it, the value of F is 0.54. There is, of course, no
evidence of differences among the μᵢ.
EXAMPLE 10.5.4-In example 10.3.3, test the hypothesis that the treatments have no
effect on the number of loopers. Ans. F = 4.33. What do you conclude?

10.6-Analysis of variance with only two classes. When there are only
two classes, the F-test is equivalent to the t-test which we used in chapter
4 to compare the two means. With two classes, the relation F = t² holds.
We shall verify this by computing the analysis of variance for the numeri-
cal example in table 4.9.1, p. 103. The pooled s² = 16,220/20 = 811
has already been computed in table 4.9.1. To complete the analysis of
variance, compute the Between samples sum of squares. Since the sample
totals were 1067 and 616, with n = 11, the sum of squares is

(1067)²/11 + (616)²/11 − (1683)²/22 = 9,245.5     (10.6.1)

With only two samples, this sum of squares is obtained more quickly as

(ΣX₁ − ΣX₂)²/2n = (1067 − 616)²/(2)(11) = 9,245.5     (10.6.2)
TABLE 10.6.1
ANALYSIS OF VARIANCE OF CHICK EXPERIMENT, TABLE 4.9.1

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square

Between samples                1                 9,245.5         9,245.5
Within samples                20                16,220.0           811.0

F = 9,245.5/811.0 = 11.40          √F = 3.38 = t
Table 10.6.1 shows the analysis of variance and the value of F, 11.40.
Note that √F = 3.38, the value of t found in table 4.9.1. Further,
in the F-table with f₁ = 1, the significance levels are the squares of those
in the t-table for the same f₂. While it is a matter of choice which one is
used, the fact that we are nearly always interested in the size and direction
of the difference (X̄₁ − X̄₂) favors the t-test.
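
A short Python sketch (illustrative; it uses the totals and pooled s² quoted from table 4.9.1 rather than the raw gains) verifies the identity numerically:

    import math

    T1, T2, n, s2 = 1067, 616, 11, 811.0
    between_ss = (T1 - T2) ** 2 / (2 * n)             # 9,245.5, by (10.6.2)
    F = between_ss / s2                               # 11.40
    t = (T1 / n - T2 / n) / math.sqrt(2 * s2 / n)     # 3.38
    assert abs(F - t ** 2) < 1e-9                     # F = t^2 exactly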
EXAMPLE 10.6.1-Hansberry and Richardson (5) gave the percentages of wormy
apples on two groups of 12 trees each. Group A, sprayed with lead arsenate, had 19, 26,
22, 13, 26, 25, 38, 40, 36, 12, 16, and 8% of apples wormy. Those of group B, sprayed with
calcium arsenate and buffer materials, had 36, 42, 20, 43, 47, 49, 59, 37, 28, 49, 31, and 39%
wormy. Compute the mean square Within samples, 111.41, with 22 d.f.; and that Between
samples, 1,650.04, with 1 d.f. Then,

F = 1,650.04/111.41 = 14.8

Next, test the significance of the difference between the sample means as in section 4.9. The
value of t is 3.85 = √14.8.
EXAMPLE 10.6.2-For f₁ = 1, f₂ = 20, verify that the 5% and 1% significance levels
of F are the squares of those of t with 20 d.f.
EXAMPLE 10.6.3-Prove that the methods used in equations (10.6.1) and (10.6.2) in
the text for finding the Between samples sum of squares, 9,245.5, are equivalent.
EXAMPLE 10.6.4-From equation (10.6.2) it follows that F = t². For F
= (ΣX₁ − ΣX₂)²/2ns², while t = (X̄₁ − X̄₂)/√(2s²/n). Since X̄₁ = ΣX₁/n, X̄₂ = ΣX₂/n, we
have t = (ΣX₁ − ΣX₂)/√(2ns²) = √F.

10.7-Comparisons among class means. The analysis of variance is
only the first step in studying the results. The next step is to examine the
class means and the sizes of differences among them.
Often, particularly in controlled experiments, the investigator plans
the experiment in order to estimate a limited number of specific quantities.
For instance, in part of an experiment on sugar beet, the three treatments
(classes) were: (i) mineral fertilizers (PK) applied in April one week before
sowing, (ii) PK applied in December before winter ploughing, (iii) no
minerals. The mean yields of sugar in cwt. per acre were as follows:

PK in April, X̄₁ = 68.8;  PK in December, X̄₂ = 66.8;  No PK, X̄₃ = 62.4

The objective is to estimate two quantities:

Average effect of PK: (1/2)(X̄₁ + X̄₂) − X̄₃ = 67.8 − 62.4 = 5.4 cwt.
April minus December application: X̄₁ − X̄₂ = 2.0 cwt.
A rule for finding standard errors and confidence limits of estimates
of this type will now be given. Both estimates are linear combinations of
the means, each mean being multiplied by a number. In the first estimate,
the numbers are 1/2, 1/2, −1. In the second, they are 1, −1, 0, where we
put 0 because X̄₃ does not appear. Further, in each estimate, the sum
of the numbers is zero. Thus,

(1/2) + (1/2) + (−1) = 0 ;   (1) + (−1) + (0) = 0

Definition. Any linear combination,

L = λ₁X̄₁ + λ₂X̄₂ + ... + λₖX̄ₖ,

where the λ's are fixed numbers, is called a comparison of the treatment
means if Σλᵢ = 0. The comparison may include all a treatment means
(k = a), or only some of the means (k < a).
Rule 10.7.1. The standard error of L is √(Σλᵢ²)(σ/√n), and the esti-
mated standard error is √(Σλᵢ²)(s/√n), with degrees of freedom equal to
those in s, where n is the number of observations in each mean X̄ᵢ.
In the example the value of s/√n was 1.37 with 24 d.f. Hence, for
the average effect of PK, with λ₁ = 1/2, λ₂ = 1/2, λ₃ = −1, the estimated
standard error is

√{(1/2)² + (1/2)² + (−1)²}(1.37) = √1.5 (1.37) = 1.68,

with 24 d.f. The value of t for testing the average effect of PK is
t = 5.4/1.68 = 3.2, significant at the 1% level. Confidence limits (95%)
are 5.4 ± (2.06)(1.68), or 1.9 and 8.9 cwt. per acre.
For the difference between the April and December applications,
with λ₁ = 1, λ₂ = −1, the estimated standard error is √2 (1.37) = 1.94.
The difference is not significant at the 5% level, the confidence limits
being 2.0 ± (2.06)(1.94), or −2.0 and +6.0.
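
Rule 10.7.1 is easily programmed. The following Python sketch (illustrative only) reproduces the two comparisons of this section:

    import math

    means = [68.8, 66.8, 62.4]      # PK in April, PK in December, no PK
    se_mean = 1.37                  # s/sqrt(n), with 24 d.f.
    t_05 = 2.06                     # 5% two-tailed t for 24 d.f.

    def comparison(lams, xbars):
        L = sum(l * x for l, x in zip(lams, xbars))
        se_L = math.sqrt(sum(l * l for l in lams)) * se_mean
        return L, se_L

    L1, se1 = comparison([0.5, 0.5, -1.0], means)   # 5.4 and 1.68
    L2, se2 = comparison([1.0, -1.0, 0.0], means)   # 2.0 and 1.94
    limits1 = (L1 - t_05 * se1, L1 + t_05 * se1)    # (1.9, 8.9) cwt. per acre
    limits2 = (L2 - t_05 * se2, L2 + t_05 * se2)    # (-2.0, +6.0)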
In view of the importance of Rule 10.7.1, we shall sketch the proof of
this result. Since the λᵢ are fixed numbers, the population mean of L is

μ_L = λ₁μ₁ + λ₂μ₂ + ... + λₖμₖ,

where μᵢ is the population mean of X̄ᵢ. Hence,

L − μ_L = λ₁(X̄₁ − μ₁) + λ₂(X̄₂ − μ₂) + ... + λₖ(X̄ₖ − μₖ)

By definition, the variance of L is the average value of (L − μ_L)² taken over
the population. Now

(L − μ_L)² = Σλᵢ²(X̄ᵢ − μᵢ)² + 2ΣᵢΣⱼ>ᵢλᵢλⱼ(X̄ᵢ − μᵢ)(X̄ⱼ − μⱼ)

The average value of (X̄ᵢ − μᵢ)² over the population is of course the
variance of X̄ᵢ. The average value of (X̄ᵢ − μᵢ)(X̄ⱼ − μⱼ) is the quantity
which we called the covariance of X̄ᵢ and X̄ⱼ (section 7.4, p. 181). This
gives the general formula,
V(L) = Σλᵢ²V(X̄ᵢ) + 2ΣᵢΣⱼ>ᵢλᵢλⱼ Cov (X̄ᵢ, X̄ⱼ)     (10.7.1)

the sums running over i = 1 ... k. When the X̄ᵢ are the means of
independent samples of size n, V(X̄ᵢ) = σ²/n and Cov (X̄ᵢ, X̄ⱼ) = 0, giving

V(L) = (Σλᵢ²)σ²/n,

in agreement with Rule 10.7.1.
When reporting the results of a series of comparisons, it is important
to give the sizes of the differences, with accompanying standard errors or
confidence limits. For any comparison of broad interest, it is likely that
several experiments will be done, often by workers in different places.
The best information on this comparison is a combined summary of the
results of these experiments. In order to make this, an investigator needs
to know the sizes of the individual results and their standard errors. If
he is told merely that "the difference was not significant" or "the differ-
ence was significant at the 1% level," he cannot begin to summarize effec-
tively.
For the example, a report might read as follows. "Application of
mineral fertilizers produced a significant average increase in sugar of 5.4
cwt. per acre (±1.68). The yield of the April application exceeded that
of the December application by 2.0 cwt. (±1.94), but this difference was
not significant."
Comments: (i) Unless this is already clear, the report should state the
amounts of P and K that were applied; (ii) there is much to be said for
presenting, in addition, a table of the treatment (class) means, with their
standard error, ±1.37. This allows the reader to judge whether the gen-
eral level of yield was unusual in any way, and to make other comparisons
that interest him.
Further examples of planned comparisons appear in the next two
chapters. Common cases are the comparison of a "no minerals" treat-
ment with minerals applied in four different ways (section 11.3), the com-
parison of different levels of the same ingredient, usually at equal intervals,
where the purpose is to fit a curve that describes the relation between yield
and the amount of the ingredient (section 11.8), and factorial experimenta-
tion, which forms the subject of chapter 12.
Incidentally, when several different comparisons are being made, one
or two of the comparisons may show significant effects even if the initial
F-test shows non-significance.
The rule that a comparison L is declared significant at the 5% level
if L/s_L exceeds t₀.₀₅ is recommended for any comparisons that the experi-
ment was designed to make. Sometimes, in examining the treatment
means, we notice a combination which we did not intend to test but which
seems unexpectedly large. If we construct the corresponding L, use of the
t-test for testing L is invalid, since we selected L for testing solely be-
cause it looked large.
Scheffé (11) has given a general method that provides a conservative
test in this situation. Declare L/s_L significant only if it exceeds
√{(a − 1)F₀.₀₅}, where F₀.₀₅ is the 5% level of F for degrees of freedom
f₁ = (a − 1), f₂ = a(n − 1). In more complex experiments, f₂ is the num-
ber of error d.f. provided by the experiment. Scheffé's test agrees with
the t-test when a = 2, and requires a substantially higher value of L/s_L
for significance when a > 2. It allows us to test any number of compari-
sons, picked out by inspection, with the protection that the probability
of finding any erroneous significant result is at most 0.05.
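
A Python sketch of Scheffé's criterion (illustrative; scipy is assumed available for the F table):

    import math
    from scipy import stats

    def scheffe_critical(a, f2, alpha=0.05):
        # |L/s_L| must exceed sqrt((a - 1) F) with f1 = a - 1 and f2 d.f.
        return math.sqrt((a - 1) * stats.f.isf(alpha, a - 1, f2))

    print(scheffe_critical(4, 20))    # about 3.05, against t = 2.09 for a
                                      # comparison planned in advance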
EXAMPLE 10.7.1-In an experiment in which mangolds were grown on acid soil (6),
part of the treatments were: (i) chalk, (ii) lime, both applied at the rate of 21 cwt. calcium
oxide (CaO) per acre, and (iii) no liming. For good reasons, there were twice as many "no
lime" plots as plots with chalk or with lime. Consequently, the comparisons of interest may
be expressed algebraically as

Effect of CaO: (1/2)(X̄₁ + X̄₂) − (1/2)(X̄₃ + X̄₄),

where X̄₃, X̄₄ represent the two "no lime" classes.

Chalk minus lime: X̄₁ − X̄₂

The mean yields were (tons per acre): chalk, 14.82; lime, 13.42; no lime, 9.74. The
s.e. of any X̄ᵢ was ±2.06 tons, with 25 d.f. Calculate the two comparisons and their standard
errors, and write a report on the results. Ans. Effect of CaO, 4.38 ± 2.06 tons. Chalk
minus lime, 1.40 ± 2.91 tons.
EXAMPLE 10.7.2-An experiment on sugar beet (7) compared times and methods of
applying mixed artificial fertilizers (NPK). The mean yields of sugar (cwt. per acre) were as
follows:

No Artificials    Jan. (Ploughed)    Jan. (Broadcast)    Apr. (Broadcast)

     38.7               48.7               48.8                45.0
      X̄₁                 X̄₂                 X̄₃                  X̄₄

Their s.e. was ±1.22, with 14 d.f. Calculate 95% confidence limits for the following com-
parisons:

Average effect of artificials: (1/3)(X̄₂ + X̄₃ + X̄₄) − X̄₁
January minus April application: (1/2)(X̄₂ + X̄₃) − X̄₄
Broadcast minus Ploughed in Jan.: X̄₃ − X̄₂

Ans.: (i) (5.8, 11.8); (ii) (0.6, 7.0); (iii) (−3.6, +3.8) cwt. per acre.
EXAMPLE 10.7.3-One can encounter linear combinations of the means that are not
comparisons as we have defined them, but this seems to be rare. For instance, in early
experiments on vitamin B₁₂, rats were fed on a B₁₂-deficient diet until they ceased to gain in
weight. If we then compared a single and a double supplement of B₁₂, measuring the subse-
quent gains in weight produced, it might be reasonable to calculate (X̄₂ − 2X̄₁), which should
be zero if the gain in weight is proportional to the amount of B₁₂. Here λ₁ + λ₂ ≠ 0. The
formula for the standard error still holds. The s.e. is √5(σ/√n) in this example.

10.8-Inspection of all differences between pairs of means. Often,
the investigator has no specific comparisons, chosen in advance, that
he proposes to make. Instead, he looks at all the means to see which
differences among them appear to be real. The most frequent example
is when the treatments are qualitatively similar, as in tests on working
gloves made by different manufacturers.
Taking the doughnut data from table 10.2.1 as an illustration, the
means for the four fats (arranged in increasing order) are as follows:

TABLE 10.8.1

Fat                          4      1      3      2      LSD       D
Mean grams absorbed         62     72     76     85     12.1     16.2

The standard error of the difference between two means, √(2s²/n), is
±5.80, with 20 d.f. (table 10.2.1). The 5% value of t with 20 d.f. is 2.086.
Hence, the difference between a specific pair of means is significant at the
5% level if it exceeds (2.086)(5.80) = 12.1.
The highest mean, 85 for fat 2, is significantly greater than the means
72 for fat 1 and 62 for fat 4. The mean 76 for fat 3 is significantly greater
than the mean 62 for fat 4. None of the other three differences between
pairs reaches 12.1. The quantity 12.1 which serves as a criterion is called
the Least Significant Difference (LSD). Similarly, 95% confidence limits
for the population difference between any pair of means are given by
adding ±12.1 to the observed difference.
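
In programmed form (Python, with scipy supplying the t table; illustrative only), the screening of table 10.8.1 is:

    import math
    from scipy import stats

    means = {4: 62, 1: 72, 3: 76, 2: 85}
    s2, n, df = 100.9, 6, 20
    se_diff = math.sqrt(2 * s2 / n)             # 5.80
    lsd = stats.t.isf(0.025, df) * se_diff      # (2.086)(5.80) = 12.1
    pairs = [(i, j) for i in means for j in means if i < j]
    print([p for p in pairs if abs(means[p[0]] - means[p[1]]) > lsd])
    # fat 2 exceeds fats 1 and 4, and fat 3 exceeds fat 4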
Objections to indiscriminate use of the LSD in significance tests
have been raised for many years. Suppose that all the population means
μᵢ are equal, so that there are no real differences. With five types of gloves,
for instance, there are ten possible comparisons between pairs of means.
The probability that at least one of the ten exceeds the LSD is bound to
be greater than 0.05: it can be shown to be about 0.29. With ten means
(45 comparisons among pairs) the probability of finding at least one sig-
nificant difference is about 0.63, and with 15 means it is around 0.83.
When the μᵢ are all equal, the LSD method still has the basic property
of a test of significance, namely that about 5% of the tested differences
will erroneously be declared significant. The trouble is that when many
differences are tested, some that appear significant are almost certain to be
found. If these are the ones that are reported and attract attention, the
test procedure loses its valuable property of protecting the investigator
against making erroneous claims.
Commenting on this issue, Fisher (8) wrote: "When the z test (i.e.,
the F-test) does not demonstrate significance, much caution should be
used before claiming significance for special comparisons." In line with
this remark, investigators are sometimes advised to use the LSD method
only if F is significant.
Among other proposed methods, perhaps the best known is one
which replaces the LSD by a criterion based on the tables of the Student-
ized Range, Q = (X̄_max − X̄_min)/s_X̄. Table A 15 gives the upper 5% levels
of Q, i.e., the value exceeded in 5% of experiments. This value depends
on the number of means, a, and the number f of d.f. in s_X̄. Having read
Q₀.₀₅ from table A 15, we compute the difference D between two means
that is required for 5% significance as Q₀.₀₅s_X̄.
For the doughnuts, a = 4, f = 20, we find Q₀.₀₅ = 3.96. Hence
D = Q₀.₀₅s_X̄ = (3.96)(4.10) = 16.2. Looking back at table 10.8.1, only
the difference between fats 2 and 4 is significant with this criterion. When
there are only two means, the Q method becomes identical with the LSD
method. Otherwise Q requires a larger difference for significance than
the LSD.
The Q method has the property that if we test some or all of the
differences between pairs of means, the probability that no erroneous
claim of significance will be made is ≥0.95. Similarly, the probability that
all the confidence intervals (X̄ᵢ − X̄ⱼ) ± D will correctly include the differ-
ence μᵢ − μⱼ is 0.95. The price paid for this increased protection is, of
course, that fewer differences μᵢ − μⱼ that are real will be detected and
that confidence intervals are wider.
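
A Python sketch of the Q criterion (illustrative; the 5% point Q = 3.96 for a = 4 and f = 20 is copied from table A 15, since the studentized range is not tabled in the Python standard library):

    s2, n = 100.9, 6
    se_mean = (s2 / n) ** 0.5          # 4.10
    Q_05 = 3.96                        # table A 15, a = 4 means, f = 20 d.f.
    D = Q_05 * se_mean                 # 16.2
    means = {4: 62, 1: 72, 3: 76, 2: 85}
    pairs = [(i, j) for i in means for j in means if i < j]
    print([p for p in pairs if abs(means[p[0]] - means[p[1]]) > D])
    # only fats 2 and 4 differ by more than D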
EXAMPLE 10.8.1-In Case III of the constructed example in table 10.4.1, with μ₁ = 3,
μ₂ = 5, μ₃ = 9, the observed means are X̄₁ = 2.9, X̄₂ = 4.0, X̄₃ = 9.25, with s_X̄ = √(s²/n)
= 0.75 (3 d.f.). Test the three differences by (i) the LSD test, (ii) the Q test. Construct a
confidence interval for each difference by each method. (iii) Do all the confidence intervals
include (μᵢ − μⱼ)? Ans. (i) LSD = 3.37, X̄₃ significantly greater than X̄₂ and X̄₁. (ii) Re-
quired difference = 4.43. Same significant differences. (iii) Yes.
EXAMPLE 10.8.2-In example 10.5.1, the mean gains in weight of baby chicks under
four feeding treatments were X̄₁ = 43.8, X̄₂ = 71.0, X̄₃ = 81.4, X̄₄ = 142.8, while √(s²/n)
= 12.0 with 16 d.f. Compare the means by the LSD and the Q methods. Ans. Both methods
show that X̄₄ differs significantly from any other mean. The LSD method gives X̄₃ sig-
nificantly greater than X̄₁.

Hartley (30) showed that a sequential variant of the Q method,
originally due to Newman (10) and Keuls (31), gives the same type of
protection and is more powerful; that is, the variant will detect real dif-
ferences more frequently than the original Q method.
Arrange the means in ascending order. For the doughnut fats, these
means are as follows:

Fat        4      1      3      2

          62     72     76     85     s_X̄ = ±4.10 (20 d.f.)

As before, first test the extreme difference, fat 2 − fat 4 = 23, against
D = 16.2. Since the difference exceeds D, proceed to test fat 2 − fat
1 = 13 and fat 3 − fat 4 = 14 against the D value for a = 3, because these
comparisons are differences between the highest and lowest of a group of
three means. For a = 3, f = 20, Q is 3.58, giving D = (3.58)(4.10) = 14.7.
Both the differences, 13 and 14, fall short of D. Consequently we stop;
the difference between fats 2 and 4 is the only significant difference in the
experiment. If fat 3 − fat 4 had been, say, 17, we would have declared
this difference significant and next tested fat 3 − fat 1 and fat 1 − fat 4
against the D value for a = 2.
Whenever the highest and lowest of a group of means are found not
significantly different in this method, we declare that none of the members
of this group is distinguishable. This rule avoids logical contradictions in
the conclusions. The method is called sequential because the testing fol-
lows a prescribed order or sequence.
Since protection against false claims of significance is obtained by
decreasing the ability to detect real differences, a realistic choice among
these methods requires a judgment about the relative seriousness of the
two kinds of mistake. Duncan (32) has examined the type of policy that
emerges if the investigator assigns relative costs to (i) declaring a signifi-
cant result when the true difference is zero, (ii) declaring non-significance
when there is a true difference, (iii) declaring a significant result in the
wrong direction. His policy is designed to minimize the average cost of
mistakes in such verdicts of significance or non-significance. These costs
are not necessarily monetary but might be in terms of utility or equity.
His optimum policy resembles an LSD rule with two notable differences.
In its simplest form, which applies when the number of treatments exceeds
15 and the d.f. in s exceed 30, a difference between two means is declared
significant if it exceeds √2 s_X̄ t√{F/(F − 1)}. The quantity t (not Student's t)
depends on the relative costs assigned to wrong verdicts of significance or
non-significance. If F is large, indicating that there are substantial differ-
ences among the population means of the treatments, √{F/(F − 1)} is
nearly 1. The rule then resembles a simple LSD rule, but with the size
of the LSD determined by the relative costs. As F approaches 1, suggest-
ing that differences among treatment means are in general small, the
difference required for significance becomes steadily larger, leading to
greater caution in declaring differences significant. The F-value given by
the experiment enters into the rule because F provides information as to
whether real differences among treatment means are likely to be large or
small. In Duncan's method, the investigator may also build into the rule
his a priori judgment on this point.
In a large sampling experiment with four treatments, Balaam (33)
compared (i) the LSD method, (ii) the revised LSD method in which no
significant differences are declared unless F is significant, (iii) the Newman-
Keuls method (as well as other methods). Various sets of values were
assigned to the population means μᵢ, including a set in which all μᵢ were
equal. For each pair of means, a test procedure received a score of +1
if it ranked them correctly, a score 0 if it declared a significant difference
when μᵢ = μⱼ or found no difference when μᵢ ≠ μⱼ, and a score −1 if it
ranked the means in the wrong order. These scores were added over the
six pairs of means.
When all μᵢ were equal, the average scores were: LSD, 5.76; Re-
vised LSD, 5.91; NK, 5.94. With three means equal, so that three of the
six differences between pairs were zero and three were not, average scores
were: LSD, 3.80; Revised LSD, 3.57; NK, 3.51. With more than three
inequalities between pairs, average scores were: LSD, 1.92; Revised
LSD, 1.73; NK, 1.63. To sum up for this section, no method is uniformly
best. In critical situations, try to judge the relative costs of the two kinds
of mistakes and be guided by these costs. For routine purposes, thought-
ful use of either the LSD or the Newman-Keuls method should be satis-
factory. Remember also Scheffé's test (p. 271) for a comparison that is
picked out just because it looks large.
10.9-Shortcut computation using ranges. An easy method of testing
all comparisons among means is based on the ranges of the samples (13).
In the doughnut experiment, table 10.2.1, the four ranges are 39, 20, 30, 21;
the sum is 110. This sum of ranges is multiplied by a factor taken from
table 10.9.1. In the column for a = 4 and the row for n = 6, take the fac-
tor 0.95. Then

D′ = (Factor)(Sum of Ranges)/n = (0.95)(110)/6 = 17.4

D′ is used like the D in the Q-test of the foregoing section. Comparing it
with the six differences among treatments, we conclude, as before, that
only the largest difference, 23, is significant.
TABLE 10.9.1
CRITICAL FACTORS FOR ALLOWANCES, 5% RISK*

Sample                       Number of Samples, a
Size
n         2      3      4      5      6      7      8      9     10

2       3.43   2.35   1.74   1.39   1.15   0.99   0.87   0.77   0.70
3       1.90   1.44   1.14    .94    .80    .70    .62    .56    .51
4       1.62   1.25   1.01    .84    .72    .63    .57    .51    .47
5       1.53   1.19    .96    .81    .70    .61    .55    .50    .45
6       1.50   1.17    .95    .80    .69    .61    .55    .49    .45
7       1.49   1.17    .95    .80    .69    .61    .55    .50    .45
8       1.49   1.18    .96    .81    .70    .62    .55    .50    .46
9       1.50   1.19    .97    .82    .71    .62    .56    .51    .47
10      1.52   1.20    .98    .83    .72    .63    .57    .52    .47

* Extracted from a more extensive table by Kurtz, Link, Tukey, and Wallace (13).
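
A Python sketch of the shortcut (illustrative only):

    fats = {1: [64, 72, 68, 77, 56, 95], 2: [78, 91, 97, 82, 85, 77],
            3: [75, 93, 78, 71, 63, 76], 4: [55, 66, 49, 64, 70, 68]}
    ranges = [max(v) - min(v) for v in fats.values()]    # 39, 20, 30, 21
    factor = 0.95                   # table 10.9.1, a = 4 samples, n = 6
    D_prime = factor * sum(ranges) / 6                   # (0.95)(110)/6 = 17.4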

EXAMPLE 10.9.1-Using the shortcut method, examine all differences in the chick
experiment of example 10.5.1 (p. 267). Ans. D′ = 49. Same conclusions as for the Q
method in example 10.8.2.

10.10-Model I. Fixed treatment effects. It is time to make a more
formal statement about the assumptions underlying the analysis of vari-
ance for single classifications. A notation common in statistical papers
is to use the subscript i to denote the class, where i takes on the values
1, 2, ... a. The subscript j designates the members of a class, j going
from 1 to n.
Within class i, the observations Xᵢⱼ are assumed normally distributed
about a mean μᵢ with variance σ². The mean μᵢ may vary from class to
class, but σ² is assumed the same in all classes. We denote the mean of
the a values of μᵢ by μ, and write μᵢ = μ + αᵢ. It follows, of course, that
Σαᵢ = 0. Mathematically, the model may be written:

Xᵢⱼ = μ + αᵢ + εᵢⱼ;   i = 1 ... a, j = 1 ... n,   εᵢⱼ = N(0, σ)

In words:
Any observed value is the sum of three parts: (i) an overall mean, (ii)
a treatment or class deviation, and (iii) a random element from a normally
distributed population with mean zero and standard deviation σ.
The artificial data in table 10.4.1 were made up according to this
model. In Case II, with μᵢ = 4, 5, 7, we have μ = 16/3, α₁ = −4/3,
α₂ = −1/3, α₃ = +5/3. The εᵢⱼ were drawn from a table of normal de-
viates with σ = 1.
This model is often called model I, the fixed effects model. Its dis-
tinctive feature is that the effects of the treatments or classes, measured
by the parameters αᵢ, are regarded as fixed but unknown quantities to be
estimated.
10.11-Effects of errors in the assumptions. For the user of the analy-
sis of variance, two relevant questions are: (i) Are the assumptions satis-
fied in my data? (ii) Does it make any difference if they are not satisfied?
Real data are seldom, if ever, exactly normally distributed. Often
they exhibit some skewness; if symmetrical, they may have longer tails
than the normal distribution. Three situations in which one should be
on the lookout for non-normality are: (i) with small whole numbers,
whose distribution may approximate the Poisson rather than the normal,
(ii) with proportions or percentages that cover a range extending nearly
to zero or 100%, and (iii) cases in which the treatments (or classes) pro-
duce multiplicative effects. Model I assumes that the effect of the ith
class is to add αᵢ to any existing value. If, instead, the effect is to multiply
the existing value by, say, 60%, the observations are likely to approximate
a distribution called the lognormal. This is a skew distribution of values
X such that log X is normally distributed.
In a single classification with equal n, various mathematical studies
agree in showing that the F-test is little affected by moderate non-normal-
ity. However, with non-normal data, the variance σᵢ² within a class is
often related to the mean μᵢ of the class. For the Poisson distribution,
you may recall that σᵢ² = μᵢ. With a proportion, the variance may be-
have like μᵢ(1 − μᵢ), and with the lognormal distribution, σᵢ² tends to
vary as μᵢ². It follows that with non-normal data, the use of a pooled
estimate of error s² in comparing pairs or subgroups of means can be
seriously misleading. With two treatments A and B that produce small
means, σ² might be about 20, while with C and D, which give large means,
σ² is about 60. The pooled s² will be about 40. For comparing A with
B, the pooled s² gives a t-value that is too small by a factor √2 = 1.41,

while for comparing C with D, t is too large by a factor √(3/2). Heteroge-
neous variance also occurs occasionally because some treatments by their
nature produce erratic effects; sometimes they work well, sometimes not.
Here there may be no clear relation between σᵢ² and μᵢ.
When comparing two classes, a safe rule is to calculate s² from the
data for these two classes only. The disadvantage is that the number of
d.f. is reduced (see also section 4.14). With a single erratic treatment
(the ith), a pooled s² can be calculated and used for comparisons among
the remaining treatments, and a separate sᵢ² for the erratic one. The s.e.
of (X̄ᵢ − X̄ⱼ) is estimated as

√{(sᵢ² + s²)/n}

When the relation between σᵢ² and μᵢ is caused by non-normality,
a knowledge of the type of data, plus a look at the relation between X̄ᵢ
and Rᵢ (the range within the class), helps in deciding whether the data are of
the Poisson type (Rᵢ ∝ √X̄ᵢ), the quasi-binomial type (Rᵢ ∝ √{X̄ᵢ(1 − X̄ᵢ)}),
or the lognormal type (Rᵢ ∝ X̄ᵢ). For these three types, transformations will
be given later (sections 11.14-11.17) that bring the data closer to normality
and often permit the use of a pooled error variance for all comparisons.

10.12-Samples of unequal sizes. In planned experiments, the sam-
ples from the classes are usually made of equal sizes, but in non-experi-
mental studies the investigator may have little control over the sizes of
the samples. As before, Xᵢⱼ denotes the jth observation from the ith
class. The symbol Xᵢ. denotes the class total of the Xᵢⱼ, while X.. = ΣXᵢ.
is the grand total. The size of the sample in the ith class is nᵢ, and N = Σnᵢ
is the total size of all samples. The correction for the mean is

C = X..²/N

Algebraic instructions for the d.f. and sums of squares in the analysis
of variance appear in table 10.12.1.
TABLE 10.12.1
ANALYSIS OF VARIANCE WITH SAMPLES OF UNEQUAL SIZES

Source of Variation    Degrees of Freedom    Sum of Squares                     Mean Square

Between classes              a − 1           ΣXᵢ.²/nᵢ − C                          s₁²
Within classes               N − a           Subtract = ΣΣXᵢⱼ² − ΣXᵢ.²/nᵢ          s²

Total                        N − 1           ΣΣXᵢⱼ² − C
The F ratio, s₁²/s², has (a − 1) and (N − a) d.f. The s.e. of the dif-
ference between the ith and the kth class means, with (N − a) d.f., is

√{s²(1/nᵢ + 1/nₖ)}

The s.e. of the comparison ΣλᵢX̄ᵢ is

√{s²Σ(λᵢ²/nᵢ)}

With unequal nᵢ, the F- and t-tests are more affected by non-normality
and heterogeneity of variances than with equal nᵢ (14). Bear this in mind
when starting to analyze the data.
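
The formulas of table 10.12.1 and the standard errors above can be sketched in Python as follows (illustrative; classes is a list of lists of observations):

    import math

    def oneway_unequal(classes):
        a = len(classes)
        N = sum(len(c) for c in classes)
        C = sum(sum(c) for c in classes) ** 2 / N        # correction for mean
        total_ss = sum(x * x for c in classes for x in c) - C
        between_ss = sum(sum(c) ** 2 / len(c) for c in classes) - C
        s2 = (total_ss - between_ss) / (N - a)           # pooled within s^2
        F = (between_ss / (a - 1)) / s2
        return F, s2

    def se_difference(s2, n_i, n_k):      # s.e. of the ith minus kth mean
        return math.sqrt(s2 * (1.0 / n_i + 1.0 / n_k))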
EXAMPLE 10.12.1-The numbers of days survived by mice inoculated with three
strains of typhoid organisms are summarized in the following frequency distributions. Thus,
with strain 9D, 6 mice survived for 2 days, etc. We have n₁ = 31, n₂ = 60, n₃ = 133, N = 224.
The purpose of the analysis is to estimate and compare the mean numbers of days to death
for the three strains.
Since the variance for strain 9D looks much smaller than that for the other strains, it
seems wise to calculate sᵢ² separately for each strain, rather than use a pooled s² from the
analysis of variance. The calculations are given under the table.

                 Numbers of Mice Inoculated With Indicated Strain

Days to Death        9D        11C       DSC1      Total

      2               6          1          3         10
      3               4          3          5         12
      4               9          3          5         17
      5               8          6          8         22
      6               3          6         19         28
      7               1         14         23         38
      8                         11         22         33
      9                          4         14         18
     10                          6         14         20
     11                          2          7          9
     12                          3          8         11
     13                          1          4          5
     14                                     1          1

Total                31         60        133        224

nᵢ                   31         60        133        224
Xᵢ. = ΣX            125        442      1,037      1,604
X̄ᵢ                 4.03       7.37       7.80
ΣXᵢⱼ²               561      3,602      8,961     13,124
Xᵢ.²/nᵢ             504      3,256      8,085
Σ(Xᵢⱼ − X̄ᵢ)²         57        346        876
sᵢ²                1.90       5.86       6.64
The difference in mean days to death for strains 11C and 9D is 3.34 days, with

s.e. = √(1.90/31 + 5.86/60) = √0.1590 = ±0.399.

For strains DSC1 and 11C the difference is 0.43 days ± 0.384.
EXAMPLE 10.12.2-As an exercise, calculate the analysis of variance for the preceding
data. Show that F = 179.5/5.79 = 31.0, f = 2 and 221. Show that if the pooled s² were
used, the s.e. of the mean difference between strains 11C and 9D would be estimated as
±0.532 instead of ±0.399.

10.13-Model II. Random effects. With some types of single classi-
fication data, the model used and the objectives of the analysis differ from
those under model I. Suppose that we wish to determine the average
content of some chemical in a large population or batch of leaves. We
select a random sample of a leaves from the population. For each
selected leaf, n independent determinations of the chemical content are
made, giving N = an observations in all. The leaves are the classes, and
the individual determinations are the members of a class.
In model II, the chemical content found for the jth determination
from the ith leaf is written as

Xᵢⱼ = μ + Aᵢ + εᵢⱼ,   i = 1 ... a, j = 1 ... n,     (10.13.1)

where

Aᵢ = N(0, σ_A),   εᵢⱼ = N(0, σ)
The symbol μ is the mean chemical content of the population of
leaves. This is the quantity to be estimated. The symbol Aᵢ represents
the difference between the chemical content of the ith leaf and the average
content over the population. By including this term, we take account of
the fact that the content varies from leaf to leaf. Every leaf in the popula-
tion has its value of Aᵢ, so that we may think of Aᵢ as a random variable
with a distribution over the population. This distribution has mean 0,
since the Aᵢ are defined as deviations from the population mean. In the
simplest version of model II, it is assumed in addition that the Aᵢ are
normally distributed with standard deviation σ_A. Hence, we have writ-
ten Aᵢ = N(0, σ_A).
What about the term εᵢⱼ? This term is needed because:
(i) the determination is subject to an error of measurement, and
(ii) if the determination is made on a small piece of the leaf, its con-
tent may differ from that of the leaf as a whole. The εᵢⱼ and the Aᵢ are
assumed independent. The further assumption εᵢⱼ = N(0, σ) is often
made.
There are some similarities and some differences between model II
and model I. In model I,

Xᵢⱼ = μ + αᵢ + εᵢⱼ;   αᵢ fixed,   εᵢⱼ = N(0, σ)
Note the following points:
(i) The αᵢ are fixed quantities to be estimated; the Aᵢ are random
variables. As will be seen, their variance σ_A² is often of interest.
(ii) The null hypothesis αᵢ = 0 is identical with the null hypothesis
σ_A = 0, since in this event all the Aᵢ must be zero. Thus, the F-test holds
also in model II, being now a test of the null hypothesis σ_A = 0.
(iii) We saw (in section 10.4) that when the null hypothesis is false,
the mean square between classes under model I is an unbiased estimate of

E(M.S. Between) = σ² + nΣαᵢ²/(a − 1)     (10.13.2)

There is an analogous result for model II, the mean square estimating

E(M.S. Between) = σ² + nσ_A²     (10.13.3)

Neither result requires the assumption of normality.
(iv) In drawing repeated samples under model I, we always draw
from the same set of classes with the same αᵢ. Under model II, we draw a
new random sample of a leaves. A consequence is that the general dis-
tributions of F (when the H₀ is false) differ. With model I, this distribu-
tion, the power function, is complicated: tables by Tang (15) and charts
by Pearson and Hartley (16) are available. With model II, the prob-
ability that the observed variance ratio exceeds any value F₀ is simply
the probability that the ordinary F exceeds F₀/(1 + nσ_A²/σ²).
To turn to an example of model II, the data for calcium in table
10.13.1 come from a large experiment (17) on the precision of estimation
of the chemical content of turnip greens. To keep the example small,
we have used only the data for n = 4 determinations on each of a = 4
leaves. In the analysis of variance (shown below table 10.13.1), the mean
square between leaves, s_L², is an unbiased estimate of σ² + nσ_A² = σ²
+ 4σ_A². Consequently, an unbiased estimate of σ_A² is

s_A² = (s_L² − s²)/4 = (0.2961 − 0.0066)/4 = 0.0724

The quantity σ_A² is called the component of variance for leaves. The
value of F = 0.2961/0.0066 = 44.9 (highly significant with 3 and 12 d.f.)
is an estimate of (σ² + 4σ_A²)/σ².
We now consider the questions: (i) How precisely has the mean
calcium content been estimated? (ii) Can we estimate it more economical-
ly? With n determinations from each of a leaves, the sample mean X̄..
is, from equation 10.13.1 for model II,

X̄.. = μ + Ā. + ε̄..,

where Ā. is the mean of a independent values of Aᵢ (one for each leaf),
and ε̄.. is the mean of the an independent εᵢⱼ. Hence the variance of X̄.. as
an estimate of μ is

V(X̄..) = σ_A²/a + σ²/an = (nσ_A² + σ²)/an     (10.13.4)
TABLE 10.13.1
CALCIUM CONCENTRATION IN TURNIP GREENS
(per cent of dry weight)

Leaf        Per Cent of Calcium           Sum      Mean

1       3.28   3.09   3.03   3.03        12.43     3.11
2       3.52   3.48   3.38   3.38        13.76     3.44
3       2.88   2.80   2.81   2.76        11.25     2.81
4       3.34   3.38   3.23   3.26        13.21     3.30

Source of Variation    Degrees of Freedom    Mean Square        Parameters Estimated

Between leaves                 3             s_L² = 0.2961        σ² + 4σ_A²
Determinations                12             s²   = 0.0066        σ²

s_A² = (0.2961 − 0.0066)/4 = 0.0724 estimates σ_A²

In the analysis of variance, the mean square between leaves, 0.2961,
is an unbiased estimate of (σ² + 4σ_A²). Hence, V̂(X̄..) = (0.2961)/16
= 0.0185. This is an important result. The estimated variance of the
sample mean is the Between classes mean square, divided by the total
number of observations.
Suppose that the experiment is to be redesigned, changing n and a to
n′ and a′. As in equation 10.13.4, the variance of X̄.. becomes

V(X̄..) = σ_A²/a′ + σ²/a′n′ ≈ 0.0724/a′ + 0.0066/a′n′,

where the ≈ sign means "is estimated by." Since the larger numerator
is 0.0724, it seems clear that a′ should be increased and n′ decreased if this
is possible without increasing the total cost of the experiment. If a de-
termination costs ten times as much as a leaf, the choice of n′ = 1 and
a′ = 15 will cost about the same as the original data. For the new
design our estimate of the variance of X̄.. is

V̂(X̄..) = 0.0724/15 + 0.0066/15 = 0.0053

The change reduces the variance of the mean from 0.0185 to 0.0053, i.e.,
to less than one-third. This is because the costly determinations with
small variability have been utilized to sample more leaves whose variation
is large. A formula for determining the best values of a′ and n′ in a given
cost situation will be found in sections 17.11 and 17.12.
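
The arithmetic of this comparison can be sketched as follows (Python; illustrative only):

    between_ms, within_ms, a, n = 0.2961, 0.0066, 4, 4
    sA2 = (between_ms - within_ms) / n        # 0.0724, component for leaves
    v_mean = between_ms / (a * n)             # 0.0185, variance of the mean
    # redesign: a' = 15 leaves, n' = 1 determination per leaf
    v_new = sA2 / 15 + within_ms / (15 * 1)   # 0.0053, under one-third of v_mean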
With model II, the difference (Xᵢⱼ − μ) between a single observation
and the population mean is the sum of the two terms Aᵢ and εᵢⱼ. Hence,
the variance of Xᵢⱼ is (σ_A² + σ²). The two parts are called the components
of variance. The previous example illustrates how these components are
used in problems of measurement, the objective being to estimate μ as
economically as possible. In plant breeding, n replications of each of a
inbred lines may be grown in an experiment. The component σ_A²
represents differences in yield that are due to differences in the genotypes
(genetic characteristics) of the inbreds, while σ² measures the effect of
non-genetic influences on yield. The ratio σ_A²/(σ_A² + σ²) of genetic to
total variance gives a guide to the possibility of improving yield by selec-
tion of particular inbreds. The same concepts are important in human
family studies, both in genetics and the social sciences, where the ratio
σ_A²/(σ_A² + σ²) now measures the proportion of the total variance that
is associated with the family. The interpretation is more complex, how-
ever, since human families differ not only in genetic traits but also in
environmental factors that affect the variables under study.
EXAMPLE 10.13.1—The following data were abstracted from records of performance of Poland China swine in a single inbred line at the Iowa Agricultural Experiment Station. Two boars were taken from each of four litters with common sire and fed a standard ration from weaning to about 225 pounds. Here are the average daily gains:

Litter     1       2       3       4
Gains    1.18    1.36    1.37    1.07
         1.11    1.65    1.40    0.90

Assuming that the litter component is normally distributed, show that σ_A² differs significantly from zero (F = 7.41) and that 0.0414 estimates it.
EXAMPLE 10.13.2—There is evidence that persons estimating the crop yields of fields by eye tend to underestimate high yields and overestimate low yields. If so, and if two estimators make separate estimates of the yields of each of a number of fields, what will be the effect on: (i) the model II assumptions, (ii) the estimate s_A² of the variance σ_A² between fields, (iii) the estimate s² of σ²?
EXAMPLE 10.13.3—To prove the result (10.13.3) for the expected value of the mean square between classes, show that under model II,

(X̄_i. − X̄..) = (A_i − Ā) + (ε̄_i. − ε̄..)

$$\frac{\sum (\bar X_{i.}-\bar X_{..})^2}{a-1} = \frac{\sum (A_i-\bar A)^2}{a-1} + \frac{\sum (\bar\varepsilon_{i.}-\bar\varepsilon_{..})^2}{a-1} + \frac{2\sum (A_i-\bar A)(\bar\varepsilon_{i.}-\bar\varepsilon_{..})}{a-1},$$

where X̄_i. is the mean of the n determinations in class i, and X̄.. is the overall sample mean. If a random sample of leaves has been drawn, the first term on the right is an unbiased estimate of σ_A², and the second of σ²/n, since ε̄_i. is the mean of n independent determinations. The third term vanishes, on the average in repeated sampling, if the A_i and ε_ij are independent. Multiplying by n to obtain the mean square between classes, the result follows. See if you can obtain the corresponding result (10.13.2) for model I.

10.14—Structure of model II illustrated by sampling. It is easy to construct a model II experiment by sampling from known populations. One population can be chosen to represent the individuals with variance σ², and another to represent the variable class effects with variance σ_A²; then samples can be drawn from each and combined in any desired proportion. In table 10.14.1 is such a drawing. The sample consists of two pigs from each of ten litters, the litters simulating random class effects. Individual pig gains were taken from table 3.2.1 with σ² = 100, two of these per litter. The litter components were drawn from a population with σ_A² = 25 (table 3.10.1 in the fifth edition of this book).
TABLE 10.14.1
GAINS IN WEIGHT OF 20 PIGS IN TEN LITTERS OF TWO PIGS EACH
(Each gain is the sum of three components. The component for litters is a sample with σ_A² = 25; that for individuals is from table 3.2.1 with σ² = 100)

Litter     Litter         Pig           Pig Gains               Sample of
Number     Component A_i  Component ε_ij  X_ij = μ + A_i + ε_ij   Litter Gains
(1)        (2)            (3)           (4) = 30 + (2) + (3)    (5)

 1          −1              7               36
                            9               38                    74
 2           2             −4               28
                          −23                9                    37
 3          −1              0               29
                           19               48                    77
 4           0              2               32
                            2               32                    64
 5          −4              3               29
                           12               38                    67
 6         −10              9               29
                            3               23                    52
 7          10              5               45
                           −4               36                    81
 8           2            −19               13
                          −10               22                    35
 9           4             −4               30
                           18               52                    82
10          −2             15               43
                           −6               22                    65

Source of Variation    Degrees of Freedom    Mean Square
Litters                        9                144.6
Individuals                   10                 96.5

s² = 96.5 estimates 100;  s_A² = (144.6 − 96.5)/2 = 24.0 estimates 25
The usual analysis of variance is computed from table 10.14.1, then the components of variance are separated. From the 20 observations we obtained estimates s² = 96.5 of σ² = 100 and s_A² = 24.0 of σ_A² = 25, the two components that were put into the data.
This example was chosen because of its accurate estimates. An idea of ordinary variation can be got from examination of the records of 25 similar samples in table 10.14.2. One is struck immediately by the great variability in the estimates of σ_A², some of them being negative! These latter merely indicate that the mean square for litters is less than that for individuals; the litters vary less than random samples ordinarily do if drawn from a single, normal population. Clearly, one cannot hope for accurate estimates of σ² and σ_A² from such small samples.

TABLE 10.14.2
ESTIMATES OF σ_A² = 25 AND σ² = 100 MADE FROM 25 SAMPLES DRAWN LIKE THAT OF TABLE 10.14.1

Sample    Estimate of    Estimate of      Sample    Estimate of    Estimate of
Number    σ_A² = 25      σ² = 100         Number    σ_A² = 25      σ² = 100

  1           60            127             14           56            112
  2           56            104             15          −11            159
  3           28             97             16           67             54
  4            6             91             17          −18             90
  5           18             60             18           11             65
  6           −5             91             19          −21            127
  7            7             53             20          −48            126
  8           −1             87             21            4             43
  9            0             66             22            1            145
 10          −78            210             23           49            142
 11           14            148             24           75             21
 12            7            162             25           77            106
 13           68             76
Mean          17.0          102.6

EXAMPLE 10.14.1—In table 10.14.2, how many negative estimates of σ_A² would be expected? Ans. A negative estimate occurs whenever the observed F < 1. From section 10.13, the probability that the observed F < 1 is the probability that the ordinary F < 1/(1 + 2σ_A²/σ²), or in this example, < 1/1.5 = 2/3, where F has 9 and 10 d.f. A property of the F distribution is that this probability is the probability that F, with 10 and 9 d.f., exceeds 3/2, or 1.5. From table A 14, with f_1 = 10, f_2 = 9, we see that F exceeds 1.59 with P = 0.25. Thus about (0.25)(25) = 6.2 negative estimates are expected, as against 7 found in table 10.14.2.
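The whole sampling experiment is easy to repeat by machine. The sketch below (ours, not the book's; the seed is arbitrary and the counts vary from run to run) draws 25 samples exactly as table 10.14.1 was drawn and counts the negative estimates of σ_A².

```python
# Monte Carlo version of tables 10.14.1-10.14.2: a = 10 litters of n = 2 pigs,
# litter component with variance 25, individual component with variance 100.
import random

def components(a=10, n=2, sd_A=5.0, sd=10.0):
    """One model II sample: returns (s_A^2, s^2)."""
    classes = []
    for _ in range(a):
        A = random.gauss(0, sd_A)        # litter component, shared by the pair
        classes.append([A + random.gauss(0, sd) for _ in range(n)])
    means = [sum(c) / n for c in classes]
    grand = sum(means) / a
    ms_b = n * sum((m - grand) ** 2 for m in means) / (a - 1)
    ms_w = sum((x - m) ** 2
               for c, m in zip(classes, means) for x in c) / (a * (n - 1))
    return (ms_b - ms_w) / n, ms_w

random.seed(1)                           # arbitrary seed; results vary with it
estimates = [components() for _ in range(25)]
negatives = sum(e[0] < 0 for e in estimates)
print(negatives, "negative estimates of sigma_A^2 in 25 samples")  # about 6 expected
```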
10.15—Confidence limits for σ_A². Assuming normality, approximate confidence limits for σ_A² have been given by Moriguti (18). We shall illustrate from the turnip greens example (table 10.13.1), for which n = 4, f_1 = 3, f_2 = 12, s_A² = 0.0724, and s² = 0.0066. It is necessary to look up four entries in the F-table. If the table of 5% significance levels is used, these determine a two-tailed 90% confidence interval, with 5% on each tail. The 5% values of F needed are as follows:
F_1 = F_{f1,f2} = F_{3,12} = 3.49
F_2 = F_{f1,∞} = F_{3,∞} = 2.60
F_3 = F_{f2,f1} = F_{12,3} = 8.74
F_4 = F_{∞,f1} = F_{∞,3} = 8.53
F = observed value of F = 44.9

The limits are given as multipliers of the quantity s²/n = (0.0066)/4 = 0.00165. The lower limit for σ_A² is

$$\hat\sigma_{AL}^2 = \frac{(F-F_1)(F+F_1-F_2)}{FF_2}\,\frac{s^2}{n} = \frac{(44.9-3.49)(44.9+3.49-2.60)}{(44.9)(2.60)}\,(0.00165) = \frac{(41.41)(45.79)}{(44.9)(2.60)}\,(0.00165) = 0.027$$

As would be expected, the lower limit becomes zero if F = F_1; that is, if F is just significant at the 5% level.
The upper limit is

$$\hat\sigma_{AU}^2 = \left\{FF_4 - 1 + \frac{F_4(F_3-F_4)}{FF_3}\right\}\frac{s^2}{n} = \left\{(44.9)(8.53) - 1 + \frac{(8.53)(0.21)}{(44.9)(8.74)}\right\}(0.00165) = 0.63$$

Frequently, as in this example, the rather unwieldy second term inside the curly bracket is negligible and need not be computed.
To summarize, the estimate is s_A² = 0.0724, with 90% confidence limits 0.027 and 0.63. Earlier, Bross (19) gave approximate fiducial limits, using the same five values of F. His limits agree closely with the above limits whenever F is significant.
If the distributions of the A_i and ε_ij are non-normal, having positive kurtosis, the variance of s_A² is increased, and the above confidence intervals are too narrow.
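The four F entries need not come from a printed table. The sketch below (ours) uses scipy for the percentage points, exploiting the fact that an F with an infinite denominator is a chi-square divided by its degrees of freedom; the upper-limit formula follows the expression given above.

```python
# Moriguti-style 90% limits for sigma_A^2, turnip greens example.
from scipy.stats import f as fdist, chi2

f1, f2, n = 3, 12, 4
s_L2, s2 = 0.2961, 0.0066
F = s_L2 / s2                                  # about 44.9
F1 = fdist.ppf(0.95, f1, f2)                   # 3.49
F2 = chi2.ppf(0.95, f1) / f1                   # F(f1, inf) = 2.60
F3 = fdist.ppf(0.95, f2, f1)                   # 8.74
F4 = f1 / chi2.ppf(0.05, f1)                   # F(inf, f1) = 8.53
base = s2 / n                                  # 0.00165
lower = (F - F1) * (F + F1 - F2) / (F * F2) * base
upper = (F * F4 - 1 + F4 * (F3 - F4) / (F * F3)) * base
print(round(lower, 3), round(upper, 2))        # 0.027 and 0.63
```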
EXAMPLE 10.15.1—In estimating the amount of plankton in an area of sea, seven runs (called hauls) were made, with six nets on each run (20). Estimate the component of variance between hauls and its 90% confidence limits.

                     Degrees of Freedom    Mean Square
Between hauls                 6               0.1011
Within hauls                 35               0.0208

Ans. s_A² = 0.0134, with limits (0.0044, 0.053).

10.16—Samples within samples. Nested classifications. Each sample may be composed of sub-samples, and these in turn may be sub-sampled, etc. The repeated sampling and sub-sampling gives rise to nested or hierarchical classifications, as they are sometimes called.
In table 10.16.1 is an example. This is a part of the turnip greens experiment cited earlier (17). The four plants were taken at random, then three leaves were randomly selected from each plant. From each leaf were taken two samples of 100 mg in which calcium was determined by microchemical methods. The immediate objective is to separate the sums of squares due to the sources of variation: plants, leaves of the same plant, and determinations on the leaves.
The calculations are given under table 10.16.1. The total sums of squares for determinations, leaves, and plants are first obtained by the usual formulas. The sum of squares between leaves of the same plant is found by subtracting the sum of squares between plants from that between leaves, as shown.
TABLE 10.16.1
CALCIUM CONCENTRATION (PER CENT, DRY BASIS) IN b = 3 LEAVES FROM EACH OF a = 4 TURNIP PLANTS, n = 2 DETERMINATIONS PER LEAF. ANALYSIS OF VARIANCE

Plant, i    Leaf, ij    Determinations, X_ijk    X_ij.    X_i..    X...
  1            1           3.28   3.09            6.37
               2           3.52   3.48            7.00
               3           2.88   2.80            5.68    19.05
  2            1           2.46   2.44            4.90
               2           1.87   1.92            3.79
               3           2.19   2.19            4.38    13.07
  3            1           2.77   2.66            5.43
               2           3.74   3.44            7.18
               3           2.55   2.55            5.10    17.71
  4            1           3.78   3.87            7.65
               2           4.07   4.12            8.19
               3           3.31   3.31            6.62    22.46    72.29

Total size = abn = (4)(3)(2) = 24 determinations

C = (X...)²/abn = (72.29)²/24 = 217.7435
Determinations: ΣX_ijk² − C = 3.28² + ⋯ + 3.31² − C = 10.2704
Leaves: ΣX_ij.²/n − C = (6.37² + ⋯ + 6.62²)/2 − C = 10.1905
Plants: ΣX_i..²/bn − C = (19.05² + ⋯ + 22.46²)/6 − C = 7.5603
Leaves of the same plant = Leaves − Plants = 10.1905 − 7.5603 = 2.6302
Determinations on same leaf = Determinations − Leaves = 10.2704 − 10.1905 = 0.0799

Source of Variation          Degrees of Freedom    Sum of Squares    Mean Square
Plants                               3                 7.5603          2.5201
Leaves in plants                     8                 2.6302          0.3288
Determinations in leaves            12                 0.0799          0.0067
Total                               23                10.2704
Similarly, the sum of squares between determinations on the same leaf is obtained by deducting the total sum of squares between leaves from that between determinations. This process can be repeated with successive sub-sampling.
The model being used is

X_ijk = μ + A_i + B_ij + ε_ijk,  i = 1 … a, j = 1 … b, k = 1 … n,
A_i ~ 𝒩(0, σ_A),  B_ij ~ 𝒩(0, σ_B),  ε_ijk ~ 𝒩(0, σ),   (10.16.1)

where A refers to plants and B to leaves. The variables A_i, B_ij, and ε_ijk are all assumed independent. Roman letters are used to denote plants and leaves because they are random variables, not constants.
TABLE 10.16.2
COMPLETED ANALYSIS OF VARIANCE OF TURNIP GREENS DATA

Source of Variation          Degrees of Freedom    Mean Square    Parameters Estimated
Plants                               3                2.5201        σ² + nσ_B² + bnσ_A²
Leaves in plants                     8                0.3288        σ² + nσ_B²
Determinations in leaves            12                0.0067        σ²

n = 2, b = 3, a = 4.  s² = 0.0067 estimates σ²;  s_B² = (0.3288 − 0.0067)/2 = 0.1610 estimates σ_B²;  s_A² = (2.5201 − 0.3288)/6 = 0.3652 estimates σ_A²

In the completed analysis of variance, table 10.16.2, the components of variance are shown. Each component in a sub-sample is included among those in the sample above it. The estimates are calculated as indicated. Null hypotheses which may be tested are:

1. σ_A² = 0:  F = 2.5201/0.3288 = 7.66 estimates (σ² + nσ_B² + nbσ_A²)/(σ² + nσ_B²),  f = 3, 8.
2. σ_B² = 0:  F = 0.3288/0.0067 = 49 estimates (σ² + nσ_B²)/σ²,  f = 8, 12.

For the first, with degrees of freedom f_1 = 3 and f_2 = 8, F is almost at its 1% point, 7.59; for the second, with degrees of freedom 8 and 12, F is far beyond its 1% point, 4.50. Evidently, in the sampled population the per cent calcium varies both from leaf to leaf and from plant to plant.
As with a single sub-classification (plants and leaves in section 10.13), it may be shown that the estimated variance of the sample mean per determination is given by the mean square between plants, divided by the number of determinations. This estimated variance can be expressed in terms of the estimated components of variance from table 10.16.2, as follows:

$$s_{\bar X}^2 = \frac{2.5201}{24} = 0.105 = \frac{0.0067 + n(0.1610) + bn(0.3652)}{nab} = \frac{0.0067}{nab} + \frac{0.1610}{ab} + \frac{0.3652}{a}$$
This suggests that more information per dollar may be got by decreasing n, the number of expensive determinations per leaf, which have a small component, then increasing b or a, the numbers of leaves or plants. Plants presumably cost more than leaves, but the component is also larger. How to balance these elements is the topic of section 17.12.
Confidence limits for σ_A² and σ_B² are calculated by the method described in section 10.15.
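As a check on these rules, here is a small Python sketch (ours, not from the text) that reproduces the nested sums of squares of table 10.16.1 from the leaf totals; the determinations sum of squares, 10.2704, is taken directly from the table.

```python
# Nested (hierarchical) sums of squares and components for table 10.16.1.
a, b, n = 4, 3, 2                        # plants, leaves per plant, determinations
leaf_sums = [[6.37, 7.00, 5.68],         # X_ij. from the table
             [4.90, 3.79, 4.38],
             [5.43, 7.18, 5.10],
             [7.65, 8.19, 6.62]]
det_ss = 10.2704                         # sum of X_ijk^2 minus C, from the table
grand = sum(map(sum, leaf_sums))         # 72.29
C = grand ** 2 / (a * b * n)
leaves = sum(x ** 2 for row in leaf_sums for x in row) / n - C
plants = sum(sum(row) ** 2 for row in leaf_sums) / (b * n) - C
print(plants, leaves - plants, det_ss - leaves)   # 7.5603, 2.6302, 0.0799
ms_p, ms_l, ms_d = plants / 3, (leaves - plants) / 8, (det_ss - leaves) / 12
s_B2 = (ms_l - ms_d) / n                 # ~0.1610, leaves within plants
s_A2 = (ms_p - ms_l) / (b * n)           # ~0.3652, plants
print(round(s_B2, 4), round(s_A2, 4))
```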
EXAMPLE 10.16.1—Verify that the sum of squares for Determinations in leaves, as found by subtraction in table 10.16.1, is the sum of squares of deviations of the determinations from their respective leaf means. Ans. Since the C term cancels, Determinations − Leaves is equal to

$$\sum_i\sum_j\sum_k X_{ijk}^2 - \sum_i\sum_j X_{ij.}^2/n = \sum_i\sum_j\sum_k (X_{ijk} - \bar X_{ij.})^2$$

by the usual shortcut rule for finding a sum of squares of deviations, where X̄_ij. is the mean of the n determinations on the jth leaf of the ith plant.
EXAMPLE 10.16.2—From equation 10.16.1 for the model, show that the variance of the sample mean is (σ² + nσ_B² + bnσ_A²)/abn, and that an unbiased estimate of it is given by the mean square between plants, divided by abn, i.e., by 2.5201/24 = 0.105, as stated in section 10.16.
EXAMPLE 10.16.3—If one determination were made on each of two leaves from each of ten plants, what is your estimate of the variance of the sample mean? Ans. 0.045.
EXAMPLE 10.16.4—With one determination on one leaf from each plant, how many plants must be taken in order to reduce s_X̄ to 0.2? Ans. About 14. (This estimate is very rough, since the mean square between plants has only 3 d.f.)
10.17—Samples within samples. Mixed model. In some applications of sub-sampling, the major classes have fixed effects that are to be estimated. An instance is an evaluation of the breeding value of a set of five sires in pig-raising. Each sire is mated to a random group of dams, each mating producing a litter of pigs whose characteristics are the criterion. The model is:

X_ijk = μ + α_i + B_ij + ε_ijk    (10.17.1)

The α_i are constants (Σα_i = 0) associated with the sires, but the B_ij and the ε_ijk are random variables corresponding to dams and offspring. Hence the model is called mixed.
Table 10.17.1 is an example with b = 2 dams for each sire and n = 2 pigs chosen from each litter for easy analysis (from records of the Iowa Agricultural Experiment Station). The calculations proceed exactly as in the preceding section. The only change is that in the mean square for sires, the term nbκ², where κ² = Σα_i²/(a − 1), replaces nbσ_A².
In a mixed model of this type, two points must be noted. From equation 10.17.1, the observed class mean may be written

X̄_i.. = μ + α_i + B̄_i. + ε̄_i.. ,

where B̄_i. is the average of b values of the B_ij and ε̄_i.. is the average of nb values of the ε_ijk. Thus the variance of X̄_i.., considered as an estimate of μ + α_i, is
$$V(\bar X_{i..}) = \frac{\sigma_B^2}{b} + \frac{\sigma^2}{nb} = \frac{1}{nb}\,(\sigma^2 + n\sigma_B^2)$$

TABLE 10.17.1
AVERAGE DAILY GAIN OF TWO PIGS OF EACH LITTER

Sire    Dam      Pig Gains       Sums
 1       1      2.77   2.38      5.15
         2      2.58   2.94      5.52    10.67
 2       1      2.28   2.22      4.50
         2      3.01   2.61      5.62    10.12
 3       1      2.36   2.71      5.07
         2      2.72   2.74      5.46    10.53
 4       1      2.87   2.46      5.33
         2      2.31   2.24      4.55     9.88
 5       1      2.74   2.56      5.30
         2      2.50   2.48      4.98    10.28    51.48

Source of Variation    Degrees of Freedom    Mean Square    Parameters Estimated
Sires                          4                0.0249        σ² + nσ_B² + nbκ²
Dams, same sire                5                0.1127        σ² + nσ_B²
Pairs, same dam               10                0.0387        σ²

n = 2, b = 2.  s² = 0.0387 estimates σ²;  s_B² = (0.1127 − 0.0387)/2 = 0.0370 estimates σ_B²;  0 estimates κ².
To test σ_B² = 0, F = 0.1127/0.0387 = 2.91.  F_0.05 = 3.33.

The analysis of variance shows that the mean square between dams oj
the same sire is the relevant mean square. being an unbiased estimate of
(a' + neT.'). The standard error of a sire mean is ~(0.1127/4) = 0.168,
with 5 df. Secondly. the F ratio for testing the null hypothesis that all
a;are zero is the ratio 0.0249/0.1127. Since this ratio is substantially less
than I, there is no indication of differences between sires in these data.
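A two-line sketch (ours) makes the point of this section concrete: with sires fixed, the error term for a sire mean is the dams-within-sires mean square, not the within-litter mean square.

```python
# Standard error of a sire mean and the F ratio for sires, table 10.17.1.
b, n = 2, 2
ms_sires, ms_dams = 0.0249, 0.1127       # from the analysis of variance
se_sire_mean = (ms_dams / (n * b)) ** 0.5
print(round(se_sire_mean, 3))            # 0.168, with 5 d.f.
print(round(ms_sires / ms_dams, 2))      # 0.22, far below 1: no sire differences
```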
10.18—Samples of unequal sizes. Random effects. This case occurs commonly in family studies in human and animal genetics and in the social sciences. The model being used is a form of model II:

X_ij = μ + A_i + ε_ij,  i = 1, …, a;  j = 1, …, n_i;  A_i ~ 𝒩(0, σ_A),  ε_ij ~ 𝒩(0, σ)

The new feature is that n_i, the size of sample of the ith class, varies from class to class. The total sample size is N = Σn_i. All A_i and ε_ij are assumed independent.
The computations for the analysis of variance and the F-test of the null hypothesis σ_A = 0 are the same as for fixed effects, as given in section
10.12. With equal n_i (= n), the mean square between classes was found to be an unbiased estimate of σ² + nσ_A² (section 10.13). With unequal n_i, the corresponding expression is σ² + n_0σ_A², where

$$n_0 = \frac{1}{a-1}\left(N - \frac{\sum n_i^2}{N}\right) = \bar n - \frac{\sum (n_i - \bar n)^2}{(a-1)N}$$

The first equation is the form used for computing n_0. The second equation shows that n_0 is always less than the arithmetic mean n̄ of the n_i, although usually only slightly less.
Consequently, if s_b² and s² are the mean squares between and within classes, respectively, unbiased estimates of the two components of variance σ² and σ_A² are given by

σ̂² = s²,   σ̂_A² = (s_b² − s²)/n_0


With unequal n_i, some mathematical complexities arise that have not yet been overcome in a form suitable for practical use. The estimate σ̂_A², while unbiased whether the A_i and ε_ij are normally distributed or not, is not fully efficient unless σ_A² is small. The method given for finding confidence limits for σ_A² with equal n (section 10.15) does not apply. An ingenious method of finding confidence limits for the ratio σ_A²/σ² was, however, given by Wald (21). Whenever feasible, it pays to keep the sample sizes equal.
EXAMPLE 10.18.1—In research on artificial insemination of cows, a series of semen samples from a bull are sent out and tested for their ability to produce conceptions. The following data, from a larger set kindly supplied by Dr. G. W. Salisbury, show the percentages of conceptions obtained from the samples for six bulls. In the analysis of variance, the total sum of squares, uncorrected, was 111,076. Verify the analysis of variance, the value of n_0, and the estimates of the two variance components. (Since the data are percentages based on slightly differing numbers of tests, the assumption that σ² is constant in these data is not quite correct.)

Bull (i)    Percentages of Conceptions to Services for Successive Samples     n_i    X_i.
1           46, 31, 37, 62, 30                                                 5     206
2           70, 59                                                             2     129
3           52, 44, 57, 40, 67, 64, 70                                         7     394
4           47, 21, 70, 46, 14                                                 5     198
5           42, 64, 50, 69, 77, 81, 87                                         7     470
6           35, 68, 59, 38, 57, 76, 57, 29, 60                                 9     479
Total                                                                         35    1876

Source            d.f.     S.S.     M.S.     E(M.S.)
Between bulls       5      3,772     754     σ² + 5.67σ_A²
Within bulls       29      6,750     233     σ²

s² = 233 estimates σ²;  (754 − 233)/5.67 = 92 estimates σ_A²
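The value of n_0 is quickly checked by machine. A minimal sketch (ours), using the class sizes above and the printed mean squares:

```python
# n0 and the between-classes component for example 10.18.1.
ns = [5, 2, 7, 5, 7, 9]                  # class (bull) sizes n_i
a, N = len(ns), sum(ns)
n0 = (N - sum(n * n for n in ns) / N) / (a - 1)
ms_between, ms_within = 754, 233         # from the example's analysis
s_A2 = (ms_between - ms_within) / n0
print(round(n0, 2), round(s_A2))         # 5.67 and about 92
```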


EXAMPLE 10.18.2—The preceding example is one in which we might consider either fixed or random effects of bulls, depending on the objectives. If these six bulls were available for an artificial insemination program, we would be interested in comparing the percentages of success of these specific bulls in a fixed effects analysis.
10.19—Samples within samples. Unequal sizes. Both samples and sub-samples may be of unequal sizes. Computational methods for any number of levels (samples, sub-samples, sub-sub-samples, etc.) have been developed by Gower (22) and by Gates and Shine (23), following earlier work by Ganguli (24). The analysis of variance is straightforward although tedious. A general procedure for finding unbiased estimates of the components of variance at each level will be given.
Our example is from a small survey of wheat yields in six districts in England (25). One or more farms were selected in each district, and from one to three wheat fields from each selected farm. Strictly, this is a mixed model, since the districts are fixed; further, the farms within districts were not randomly selected. The data serve, however, to illustrate the computations.
The computations are most easily followed if the data are set out as in table 10.19.1. The lowest level (fields) is denoted by 0. The yields, X_0k, and the number of observations in each yield are written down. In this example, as in most applications, the N_0k are all 1, each observation being the yield of one field.
The X_0k and the N_0k are added to give the totals, X_1k and N_1k, at the next lowest level, farms. Similarly, the X_1k and the N_1k are added to give the district totals, X_2k and N_2k. Finally, the district totals are added to give X_3 and N_3, the grand total and the total number of recorded observations, respectively.
To obtain the sums of squares in the analysis of variance, first calculate for each level the quantity

S_i = Σ_k X_ik²/N_ik

S_3, for instance, is (1063)²/36 = 31,388.0, the usual correction for the mean. At level 2 (Districts) we have

S_2 = 110²/4 + 91²/3 + ⋯ + 432²/13 = 31,849.3

To obtain the d.f., count the number of classes C_i at each level. These are C_0 = 36, C_1 = 25, C_2 = 6, C_3 = 1, as shown at the foot of table 10.19.1. The C_i and the S_i provide the d.f. and the sums of squares in the analysis of variance, as shown in table 10.19.2.
The rule for calculating the d.f. and the sums of squares is a straightforward extension of the rule for two levels given in table 10.12.1.
We now express the expected values of the three mean squares in terms of the components of variance for districts (σ_2²), farms (σ_1²), and fields (σ_0²). For this we use two sets of auxiliary quantities, y_ij and k_ij.
TABLE 10.19.1
WHEAT YIELDS (GMS. PER 0.0000904 ACRE) TO ILLUSTRATE ESTIMATION OF COMPONENTS OF VARIANCE IN NESTED CLASSIFICATIONS WITH UNEQUAL NUMBERS

Level 0 (Fields)    Level 1 (Farms)    Level 2 (Districts)    Level 3 (Grand Total)
X_0k   N_0k         X_1k   N_1k        X_2k   N_2k            X_3    N_3

 23     1
 19     1            42     2
 31     1
 37     1            68     2           110     4
 33     1
 29     1            62     2
 29     1            29     1            91     3
 36     1
 29     1
 33     1            98     3            98     3
 11     1
 21     1            32     2
 23     1
 18     1            41     2
 33     1            33     1
 23     1            23     1
 26     1            26     1
 39     1            39     1
 20     1            20     1
 24     1            24     1
 36     1            36     1           274    11
 25     1
 33     1            58     2            58     2
 28     1
 31     1            59     2
 25     1
 42     1            67     2
 32     1
 36     1            68     2
 41     1            41     1
 35     1            35     1
 16     1            16     1
 30     1            30     1
 40     1            40     1
 32     1            32     1
 44     1            44     1           432    13               1063   36

C_i     36                 25                  6                        1

For the y_ij, i and j take the values 0, 1, 2, 3, with i ≥ j. In the diagonal, y_ii always equals the total number of observations, in this case 36. Further, when all N_0k are 1, y_i0 = C_i, the number of classes at level i. Thus, we write 1, 6, 25, and 36 in the column y_i0 in table 10.19.3. For the remaining y_ij, the rule is (using table 10.19.1):
Sum the squares of the N_jk, each square divided by the entry N_ik at level i within which it falls. It sounds puzzling but should be clear from the examples.
TABLE 10.19.2
ANALYSIS OF VARIANCE OF WHEAT YIELDS

Source of Variation                 Degrees of Freedom    Sum of Squares       Mean Square
Districts (level 2)                  C_2 − C_3 = 5        S_2 − S_3 = 461.3       92.3
Farms within districts (level 1)     C_1 − C_2 = 19       S_1 − S_2 = 1,349.5     71.0
Fields within farms (level 0)        C_0 − C_1 = 11       S_0 − S_1 = 310.2       28.2

TABLE 10.19.3
VALUES OF AUXILIARY QUANTITIES y_ij AND k_ij

         y_ij, j                             k_ij, j
i      0      1       2      3       i      0       1       2
3      1    1.67    9.11    36       2      5     9.82   26.89
2      6   11.49      36             1     19    24.51
1     25      36                     0     11
0     36

y_32 = (4² + 3² + 3² + 11² + 2² + 13²)/36 = 9.11


For the k_ij, i and j take the values 0, 1, 2, with i ≥ j, and

k_ij = y_ij − y_(i+1)j

That is, to find any k_ij, start with y_ij and subtract the number immediately above it. Thus, k_22 = 36 − 9.11 = 26.89.
The quantity k_ij is the coefficient of σ_j² in the expected value of the sum of squares at level i in the analysis of variance. To find the expected values of the corresponding mean squares, divide by the number of d.f. at level i. These mean squares (from table 10.19.2) and their expected values appear in table 10.19.4. For example, the coefficient 1.290 of σ_1² in the farms mean square is k_11/19 = 24.51/19, and so on.
TABLE 10.19.4
EXPECTED VALUES OF THE MEAN SQUARES

Level                Degrees of Freedom    Mean Square    Expected Value
Districts (i = 2)            5                92.3        σ_0² + 1.964σ_1² + 5.378σ_2²
Farms (i = 1)               19                71.0        σ_0² + 1.290σ_1²
Fields (i = 0)              11                28.2        σ_0²
A new feature is that the coefficient of σ_1² is no longer the same in the Districts and Farms mean squares. Thus, the ratio 92.3/71.0 cannot be used as an F-test of the null hypothesis σ_2² = 0. However, unbiased estimates of the three components are obtained from table 10.19.4 as follows:

s_0² = 28.2;   s_1² = (71.0 − 28.2)/1.290 = 33.2
s_2² = [92.3 − 28.2 − (1.964)(33.2)]/5.378 = −0.2

The data give no evidence of real differences in yield between districts.
This method of calculation holds for any number of levels. For large bodies of data the computations may be programmed for an electronic computer.
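Such a program is short. The sketch below (ours, not from Gower or Gates and Shine, though it follows the S_i, C_i, y_ij, k_ij scheme of this section) carries out the whole computation for the wheat survey, with the data entered as nested lists of field yields.

```python
# Components of variance for a three-level nested classification with
# unequal sizes: districts -> farms -> fields (table 10.19.1).
districts = [
    [[23, 19], [31, 37]],
    [[33, 29], [29]],
    [[36, 29, 33]],
    [[11, 21], [23, 18], [33], [23], [26], [39], [20], [24], [36]],
    [[25, 33]],
    [[28, 31], [25, 42], [32, 36], [41], [35], [16], [30], [40], [32], [44]],
]
farms = [f for d in districts for f in d]
fields = [x for f in farms for x in f]
N = len(fields)

# S_i: sum over the classes at level i of (class total)^2 / (class size).
S0 = sum(x * x for x in fields)
S1 = sum(sum(f) ** 2 / len(f) for f in farms)
S2 = sum(sum(sum(f) for f in d) ** 2 / sum(len(f) for f in d) for d in districts)
S3 = sum(fields) ** 2 / N
C = [N, len(farms), len(districts), 1]          # classes per level
df = [C[0] - C[1], C[1] - C[2], C[2] - C[3]]    # 11, 19, 5
ss = [S0 - S1, S1 - S2, S2 - S3]                # 310.2, 1349.5, 461.3
ms = [s / d for s, d in zip(ss, df)]            # 28.2, 71.0, 92.3

# y_ij: squared level-j class sizes, each divided by the size of the
# level-i class containing it; k_ij = y_ij - y_(i+1)j.
y21 = sum(len(f) ** 2 / sum(len(g) for g in d) for d in districts for f in d)
y31 = sum(len(f) ** 2 for f in farms) / N                       # 1.67
y32 = sum(sum(len(f) for f in d) ** 2 for d in districts) / N   # 9.11
k11, k21, k22 = N - y21, y21 - y31, N - y32     # 24.51, 9.82, 26.89

s0 = ms[0]                                      # fields component
s1 = (ms[1] - s0) / (k11 / df[1])               # farms: about 33.2
s2 = (ms[2] - s0 - (k21 / df[2]) * s1) / (k22 / df[2])   # districts: about -0.2
print([round(m, 1) for m in ms], round(s1, 1), round(s2, 2))
```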
10.20—Intraclass correlation. We revert to a single classification with n members per class. When the component σ_A² > 0, we have seen that members of the same class tend to act alike. An alternative to model II for describing this situation is to suppose that the observations X_ij are all distributed about the same mean μ with the same variance σ², but that any two members of the same class (i = constant) have a common correlation coefficient ρ_I, called the intraclass correlation coefficient. Actually, this model antedates the analysis of variance.
With this model it can be shown by algebra that the expected values of the mean squares in the analysis of variance are as follows:

Source of Variation    Mean Square Expected Value
Between classes        σ²{1 + (n − 1)ρ_I}
Within classes         σ²(1 − ρ_I)

This model is useful in applications in which it is natural to think of members of the same class as correlated. It is frequently employed in studies of twins (n = 2). The model is more general than the components of variance model. If ρ_I is negative, note that the Between classes mean square has a smaller expected value than the Within classes mean square. With model II, this cannot happen. But if, for instance, four young animals in a pen compete for an insufficient supply of food, the stronger animals may drive away the weaker and may regularly get most of the food. For this reason the variance in weight within pens may be larger than that between pens, this being a real phenomenon and not an accident of sampling. We say that there is a negative correlation ρ_I between the weights within a pen. One restriction on negative values of ρ_I is that ρ_I cannot be less than −1/(n − 1). This is so because the expected value of the Between classes mean square must be greater than or equal to zero.
From the analysis of variance it is clear that (s_b² − s_w²) estimates nρ_Iσ², while {s_b² + (n − 1)s_w²} estimates nσ². This suggests that as an estimate of ρ_I we take

r_I = (s_b² − s_w²)/{s_b² + (n − 1)s_w²}    (10.20.1)

As will be seen presently, a slightly different estimate of ρ_I is obtained when we approach the problem from the viewpoint of correlation.
The data on identical twins in table 10.20.1 illustrate a high positive
TABLE 10.20.1
NUMBER OF FINGER RIDGES ON BOTH HANDS OF INDIVIDUALS IN 12 PAIRS OF FEMALE IDENTICAL TWINS
[Data from Newman, Freeman, and Holzinger (34)]

Pair    Finger Ridges      Pair    Finger Ridges      Pair    Finger Ridges
        of Individuals             of Individuals             of Individuals
1         71,  71           5        76,  70           9       114, 113
2         79,  82           6        83,  82          10        94,  91
3        105,  99           7       114, 113          11        75,  83
4        115, 114           8        57,  44          12        76,  72

Analysis of Variance

Source of Variation    Degrees of Freedom    Mean Square
Twin pairs                    11                817.31
Individuals                   12                 14.29

s² = 14.29,  s_A² = 401.51,  r_I = 0.966

correlation. The numbers of finger ridges are nearly the same for the two members of each pair but differ markedly among pairs. From the analysis of variance, the estimate of ρ_I is (n = 2)

r_I = (817.31 − 14.29)/(817.31 + 14.29) = 0.966
In chapter 7, the ordinary correlation coefficient between X and Y was estimated as

r = Σ(X − X̄)(Y − Ȳ)/√{Σ(X − X̄)² Σ(Y − Ȳ)²}

With twin data, which member of a pair shall we call X and which Y? The solution is to count each point twice, once with the first member of a pair as X, and once with the first member as Y. Thus, pair 2 is entered as (79, 82) and also as (82, 79), while pair 1, where the order makes no difference, is entered as (71, 71) twice. With this method the X and Y samples both have the same mean and the same variance. If (X, X′) denote the observations for a typical pair, you may verify that the correlation coefficient becomes

r_I′ = 2Σ(X − X̄)(X′ − X̄)/{Σ(X − X̄)² + Σ(X′ − X̄)²},

where the sums are over the a pairs and X̄ is the mean of all observations. For the finger ridges, r_I′ = 0.962.
With pairs (n = 2), intraclass correlations may be averaged and may have confidence limits set by using the transformation from r to z in section 7.7. The only changes are: (i) the variance of z_I is 1/(a − 3/2), where a is the number of pairs, as against 1/(a − 3) with an ordinary z; (ii) the correction for the bias in z_I is to add 1/(2a − 1).
With triplets (n = 3), each trio X, X′, X″ specifies six points: (X, X′), (X′, X), (X, X″), (X″, X), (X′, X″), (X″, X′). The number of points rises
rapidly as n rises, and this method of calculating r_I′ becomes discouraging. In 1913, however, Harris (26) discovered a shortened process similar to the analysis of variance, by showing in effect that

$$r_I' = \frac{(a-1)s_b^2 - a\,s_w^2}{(a-1)s_b^2 + a(n-1)s_w^2}$$

Comparison with equation 10.20.1 shows that r_I′ differs slightly from r_I, the difference being trivial unless a (the number of classes) is small. Since it is slightly simpler, equation 10.20.1 is more commonly used now as the sample estimate of ρ_I.
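A short sketch (ours) computes both estimates for the twin data, r_I from the analysis of variance and the double-entry correlation of the preceding paragraphs.

```python
# Intraclass correlation for the finger-ridge data of table 10.20.1.
pairs = [(71, 71), (79, 82), (105, 99), (115, 114), (76, 70), (83, 82),
         (114, 113), (57, 44), (114, 113), (94, 91), (75, 83), (76, 72)]
a, n = len(pairs), 2
m = sum(x + y for x, y in pairs) / (a * n)          # overall mean
pair_means = [(x + y) / 2 for x, y in pairs]
sb2 = n * sum((pm - m) ** 2 for pm in pair_means) / (a - 1)   # ~817.31
sw2 = sum((x - y) ** 2 / 2 for x, y in pairs) / (a * (n - 1)) # ~14.29
r_I = (sb2 - sw2) / (sb2 + (n - 1) * sw2)           # equation 10.20.1
# Double-entry correlation: each pair counted in both orders.
num = 2 * sum((x - m) * (y - m) for x, y in pairs)
den = sum((v - m) ** 2 for x, y in pairs for v in (x, y))
print(round(r_I, 3), round(num / den, 3))           # 0.966 and 0.962
```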
10.21—Tests of homogeneity of variance. From time to time we have raised the question as to whether two or more mean squares differ significantly. For two mean squares an answer, using the two-tailed F-test, was given in section 4.15. With more than two independent estimates of variance, Bartlett (27) provided a test.
If there are a estimates s_i², each with the same number of degrees of freedom f, the test criterion is

M = 2.3026 f (a log s̄² − Σ log s_i²),   (s̄² = Σs_i²/a)

The factor 2.3026 is a constant (log_e 10). On the null hypothesis that each s_i² is an estimate of the same σ², the quantity M/C is distributed approximately as χ² with (a − 1) d.f., where

C = 1 + (a + 1)/(3af)

Since C is always slightly greater than 1, it need be used only if M lies close to one of the critical values of χ².
In table 10.21.1 this test is applied to the variances of grams of fat absorbed in the four types of fat in the doughnut example of table 10.2.1. Here a = 4 and f = 5. The value of M is 1.88, clearly not significant with 3 d.f. To illustrate the method, χ² = M/C = 1.74 has also been computed.

TABLE 10.21.1
COMPUTATION OF BARTLETT'S TEST OF HOMOGENEITY OF VARIANCE.
ALL ESTIMATES HAVING f = 5 DEGREES OF FREEDOM

Fat      s_i²     log s_i²
1        178      2.2504
2         60      1.7781
3         97      1.9862
4         69      1.8375
Total    404      7.8522

s̄² = 100.9,  log s̄² = 2.0038
M = (2.3026)(5)[(4)(2.0038) − 7.8522] = 1.88  (d.f. = 3)
χ² = 1.88/1.083 = 1.74  (d.f. = 3),  P > 0.5

When the degrees of freedom differ, as with samples of unequal sizes, the computation of χ² is more tedious though it follows the same pattern. The formulas are:

M = (2.3026)[(Σf_i) log s̄² − Σf_i log s_i²],   (s̄² = Σf_i s_i²/Σf_i)

$$C = 1 + \frac{1}{3(a-1)}\left[\sum \frac{1}{f_i} - \frac{1}{\sum f_i}\right]$$

χ² = M/C with (a − 1) degrees of freedom
In table 10.21.2 this test is applied to the variances of the birth weights of five litters of pigs. Since s̄² is the pooled variance (weighting by degrees of freedom), we need a column of the sums of squares. A column of the reciprocals 1/f_i of the degrees of freedom is also useful in finding C. The computations give χ² = 16.99 with 4 d.f., showing that the intralitter variances differ from litter to litter in these data.
When some or all of the s_i² are less than 1, as in these data, it is worth noting that χ² is unchanged if all s_i² and s̄² are multiplied by the same number (say 10 or 100). This enables you to avoid logs that are negative.

TABLE 10.21.2
COMPUTATION OF BARTLETT'S TEST OF HOMOGENEITY OF VARIANCE.
SAMPLES DIFFERING IN SIZE

Litter      Sum of       Degrees of    Mean
(Sample)    Squares      Freedom       Square
            f_i s_i²     f_i           s_i²      log s_i²     f_i log s_i²    1/f_i
1            8.18          9           0.909     −0.0414       −0.3726       0.1111
2            3.48          7           0.497     −0.3036       −2.1252       0.1429
3            0.68          9           0.076     −1.1192      −10.0728       0.1111
4            0.72          7           0.103     −0.9872       −6.9104       0.1429
5            0.73          5           0.146     −0.8357       −4.1785       0.2000
a = 5       13.79         37                                  −23.6595       0.7080

s̄² = Σf_i s_i²/Σf_i = 13.79/37 = 0.3727
(Σf_i) log s̄² = (37)(−0.4286) = −15.858
M = (2.3026)[(Σf_i) log s̄² − Σf_i log s_i²] = (2.3026)[−15.858 − (−23.6595)] = 17.96

C = 1 + (1/12)[0.7080 − 1/37] = 1.057

χ² = M/C = 17.96/1.057 = 16.99  (d.f. = 4),  P < 0.01
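For completeness, here is a sketch of the same test in Python (ours). Natural logs are used, which is equivalent to the text's factor 2.3026 times common logs; the output differs from the table in the second decimal because the hand computation above rounds the logs to four places.

```python
# Bartlett's test with unequal degrees of freedom (data of table 10.21.2).
from math import log

f = [9, 7, 9, 7, 5]                      # degrees of freedom f_i
ss = [8.18, 3.48, 0.68, 0.72, 0.73]      # sums of squares f_i * s_i^2
a, F = len(f), sum(f)
s2 = [s / d for s, d in zip(ss, f)]      # litter variances s_i^2
pooled = sum(ss) / F                     # 0.3727
M = F * log(pooled) - sum(d * log(v) for d, v in zip(f, s2))
C = 1 + (sum(1 / d for d in f) - 1 / F) / (3 * (a - 1))
print(round(M, 2), round(M / C, 2))      # ~18.0 and ~17.1, against 17.96/16.99
```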



The χ² approximation becomes less satisfactory if most of the f_i are less than 5. Special tables for this case are given in (28). This reference also gives a table of the significance levels of s²_max/s²_min, the ratio of the largest to the smallest of the a variances. This ratio provides a quick test of homogeneity of variance which, though less sensitive than Bartlett's test, will often settle the issue.
Unfortunately, both Bartlett's test and this test are sensitive to non-normality in the data, particularly to kurtosis (29). With long-tailed distributions (positive kurtosis) the test gives too many erroneous verdicts of heterogeneity.
REFERENCES
1. B. LOWE. Data from the Iowa Agricultural Experiment Station (1935).
2. R. RICHARDSON, et al. J. Nutrition, 44:371 (1951).
3. G. W. SNEDECOR. Analysis of Variance and Covariance. Collegiate Press, Inc., Ames, Iowa (1934).
4. R. A. FISHER and F. YATES. Statistical Tables. Oliver and Boyd, Edinburgh (1938).
5. T. R. HANSBERRY and C. H. RICHARDSON. Iowa State Coll. J. Sci., 10:27 (1935).
6. ROTHAMSTED EXPERIMENTAL STATION REPORT: p. 289 (1936).
7. ROTHAMSTED EXPERIMENTAL STATION REPORT: p. 212 (1937).
8. R. A. FISHER. The Design of Experiments. Oliver and Boyd, Edinburgh (1935).
9. Query in Biometrics, 5:250 (1949).
10. D. NEWMAN. Biometrika, 31:20 (1939).
11. H. SCHEFFÉ. The Analysis of Variance. Wiley, New York (1959).
12. D. B. DUNCAN. Ann. Math. Statist., 32:1013 (1961).
13. T. E. KURTZ, B. F. LINK, J. W. TUKEY, and D. L. WALLACE. Technometrics, 7:95 (1965).
14. G. E. P. BOX. Ann. Math. Statist., 25:290 (1954).
15. P. C. TANG. Statist. Res. Memoirs, 2:126 (1938).
16. E. S. PEARSON and H. O. HARTLEY. Biometrika, 38:112 (1951).
17. "Studies of Sampling Techniques and Chemical Analyses of Vegetables." Southern Coop. Ser. Bull. 10 (1951).
18. S. MORIGUTI. Reports of Statistical Application in Research, Japanese Union of Scientists and Engineers, Vol. 3, No. 2:29 (1954).
19. I. D. J. BROSS. Biometrics, 6:136 (1950).
20. C. P. WINSOR and G. L. CLARKE. J. Marine Res., 3:1 (1940).
21. A. WALD. Ann. Math. Statist., 11:96 (1940).
22. J. C. GOWER. Biometrics, 18:537 (1962).
23. C. E. GATES and C. SHINE. Biometrics, 18:529 (1962).
24. M. GANGULI. Sankhya, 5:449 (1941).
25. W. G. COCHRAN. Jour. Amer. Statist. Ass., 34:492 (1939).
26. J. A. HARRIS. Biometrika, 9:446 (1913).
27. M. S. BARTLETT. Jour. Royal Statist. Soc. Suppl., 4:137 (1937).
28. E. S. PEARSON and H. O. HARTLEY. Biometrika Tables for Statisticians, Vol. I, Tables 31 and 32. Cambridge University Press (1954).
29. G. E. P. BOX. Biometrika, 40:318 (1953).
30. H. O. HARTLEY. Communications on Pure and Appl. Math., 8:47 (1955).
31. M. KEULS. Euphytica, 1:112 (1952).
32. D. B. DUNCAN. Technometrics, 7:171 (1965).
33. L. N. BALAAM. Australian J. Statist., 5:62 (1963).
34. H. H. NEWMAN, F. N. FREEMAN, and K. J. HOLZINGER. Twins. University of Chicago Press (1937).
* CHAPTER ELEVEN

Two-way classifications

11.1—Introduction. The experimenter often acquires the ability to predict roughly the behavior of his experimental material. He knows that in identical environments young male rats gain weight faster than young female rats. In a machine which subjects five different pieces of cloth to simulated wearing, he learns from experience that the cloths placed in positions 4 and 5 will receive less abrasion than those in the other positions. Such knowledge can be used to increase the accuracy of an experiment. If there are a treatments to be compared, he first arranges the experimental units in groups of a, often called replications. The rule is that units assigned to the same replication should be as similar in responsiveness as possible. Each treatment is then allocated by randomization to one unit in each replication. This produces a two-way classification, since any observation is classified by the treatment which it received and the replication to which it belonged.
Two-way classifications are frequent in surveys also. We already encountered an example in chapter 9 in which farms were classified by soil type and owner-tenant status. In a survey of family expenditures on food, classification of the results by size of family and income level is obviously relevant.
We first present an example to familiarize you with the standard computations needed to perform the analysis of variance and make any desired comparisons. Later, the mathematical assumptions will be discussed.
11.2—An experiment with two criteria of classification. In agricultural experiments the agronomist tries to classify the plots into replications in such a way that soil fertility and growing conditions are as uniform as possible within any replication. In this process he utilizes any knowledge that he has about fertility gradients, drainage, liability to attack by pests, etc. One guiding principle is that, in general, plots that are close together tend to give similar yields. Replications are therefore usually compact areas of land. Within each replication one plot is assigned to each treatment at random. This experimental plan is called randomized blocks, the replication being a block of land. The two criteria of classification are treatments and replications.
Table 11.2.1 comes from an experiment (1) in which four seed treatments were compared with no treatment (Check) on soybean seeds. The data are the number of plants which failed to emerge out of 100 planted in each plot.

TABLE 11.2.1
ANALYSIS OF VARIANCE OF A 2-WAY CLASSIFICATION
(Number of failures out of 100 planted soybean seeds)

                          Replication
Treatment         1     2     3     4     5    Total    Mean
Check             8    10    12    13    11     54      10.8
Arasan            2     6     7    11     5     31       6.2
Spergon           4    10     9     8    10     41       8.2
Semesan, Jr.      3     5     9    10     6     33       6.6
Fermate           9     7     5     5     3     29       5.8
Total            26    38    42    47    35    188

Correction: C = (188)²/25 = 1,413.76
Total S.S.: 8² + 2² + ⋯ + 6² + 3² − C = 220.24
Treatments S.S.: (54² + 31² + ⋯ + 29²)/5 − C = 83.84
Replications S.S.: (26² + 38² + ⋯ + 35²)/5 − C = 49.84

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
Replications                   4                  49.84           12.46
Treatments                     4                  83.84           20.96
Residuals (Error)             16                  86.56            5.41
Total                         24                 220.24

The first steps are to find the treatment totals, the replication totals, the grand total, and the usual correction for the mean. The total sum of squares and the sum of squares for Treatments are computed just as in a one-way classification. The new feature is that the sum of squares for Replications is also calculated. The rule for finding this sum of squares is the same as for Treatments. The sum of squares of the replication totals is divided by the number of observations in each replication (5) and the correction factor is subtracted. Finally, in the analysis of variance, we compute the line

Residuals = Total − Replications − Treatments

As will be shown later, the Residuals mean square, 5.41, with 16 d.f., is an unbiased estimate of the error variance per observation.
The F ratio for treatments is 20.96/5.41 = 3.87, with 4 and 16 d.f., significant at the 5% level. Actually, since this experiment has certain designed comparisons, discussed in the next section, 11.3, the overall F-test is not of great importance. Note that the Replications mean square is more than twice the Residuals mean square. This is an indication of real differences between replication means, suggesting that the classification into replications was successful in improving accuracy. A method of estimating the amount of gain in accuracy will be presented in section 11.7.
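The whole computation is a few lines in Python. This sketch (ours) reproduces the analysis of table 11.2.1 from the raw counts.

```python
# Randomized-blocks (two-way) analysis of variance for table 11.2.1.
data = [  # rows = treatments (Check first), columns = replications
    [8, 10, 12, 13, 11],
    [2, 6, 7, 11, 5],
    [4, 10, 9, 8, 10],
    [3, 5, 9, 10, 6],
    [9, 7, 5, 5, 3],
]
a, b = len(data), len(data[0])
total = sum(map(sum, data))
C = total ** 2 / (a * b)
total_ss = sum(x * x for row in data for x in row) - C
treat_ss = sum(sum(row) ** 2 for row in data) / b - C
rep_ss = sum(sum(col) ** 2 for col in zip(*data)) / a - C
resid_ss = total_ss - treat_ss - rep_ss
print(treat_ss, rep_ss, resid_ss)        # 83.84, 49.84, 86.56
F = (treat_ss / (a - 1)) / (resid_ss / ((a - 1) * (b - 1)))
print(round(F, 2))                       # 3.87 with 4 and 16 d.f.
```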
EXAMPLE 11.2.1—In three species of citrus trees the ratio of leaf area to dry weight was determined for three conditions of shading (2).

Shading       Shamouti Orange    Marsh Grapefruit    Clementine Mandarin
Sun                112                  90                  123
Half shade          86                  73                   89
Shade               80                  62                   81

Compute the analysis of variance. Ans. Mean squares for shading and error, 942.1 and 21.8. F = 43.2, with 2 and 4 d.f. The shading was effective in decreasing the relative leaf area. See example 11.5.4 for further discussion.
EXAMPLE 11.2.2—When there are only two treatments, the data reduce to two paired samples, previously analyzed by the t-test in chapter 4. This t-test is equivalent to the F-test of treatments as given in this section. Verify this result by performing the analysis of variance of the mosaic virus example in section 4.3, as follows:

                        Degrees of Freedom    Sum of Squares    Mean Square
Replications (Pairs)            7                  575              82.2
Treatments                      1                   64              64.0
Error                           7                   65               9.29

F = 6.89, d.f. = 1, 7.  √F = 2.63 = t, as given in section 4.3.

11.3—Comparisons among means. The discussion of different types of comparisons in sections 10.7 and 10.8 applies also to two-way classifications. To illustrate a planned comparison, we compare the mean number of failures for the Check with the corresponding average for the four Chemicals. From table 11.2.1 the means are:

Check    Arasan    Spergon    Semesan, Jr.    Fermate
10.8       6.2       8.2          6.6            5.8

The comparison is, therefore,

10.8 − (6.2 + 8.2 + 6.6 + 5.8)/4 = 10.8 − 6.7 = 4.1
The experiment has five replications, with s = √5.41 = 2.326 (16 d.f.). Hence, by Rule 10.7.1, the estimated standard error of the above difference is

$$\frac{s}{\sqrt 5}\sqrt{1^2 + \frac{1}{4^2} + \frac{1}{4^2} + \frac{1}{4^2} + \frac{1}{4^2}} = \frac{2.326}{\sqrt 5}\sqrt{\frac{5}{4}} = \frac{2.326}{2} = 1.163,$$

with 16 d.f. Thus 95% confidence limits for the average reduction in failure rate due to the Chemicals are

4.1 ± (2.120)(1.163) = 4.1 ± 2.5,  i.e., 1.6 and 6.6

The next step is to compare the means for the four Chemicals. For this, the discussion in section 10.8 is relevant. The LSD is

t_0.05 s√(2/n) = (2.120)(2.326)√(2/5) = 3.12

Since the largest difference between any two means is 8.2 − 5.8 = 2.4, there are no significant differences among the Chemicals. You may verify that the Studentized Range Q-test requires a difference of 4.21 for significance at the 5% level, giving, of course, the same verdict as the LSD test.
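These computations, too, are easily mechanized. The sketch below (ours; the means and error mean square are copied from table 11.2.1) evaluates the planned comparison, its standard error by Rule 10.7.1, the confidence limits, and the LSD.

```python
# Planned comparison (Check vs. Chemicals) and LSD for table 11.2.1.
means = {"Check": 10.8, "Arasan": 6.2, "Spergon": 8.2,
         "Semesan Jr.": 6.6, "Fermate": 5.8}
s2, b, t05 = 5.41, 5, 2.120              # error MS, replications, t at 16 d.f.
chems = [v for k, v in means.items() if k != "Check"]
L = means["Check"] - sum(chems) / 4                    # 4.1
lam = [1] + [-0.25] * 4                  # comparison coefficients
se = (s2 * sum(c * c for c in lam) / b) ** 0.5         # 1.163
print(L, round(se, 3),
      (round(L - t05 * se, 1), round(L + t05 * se, 1)))  # (1.6, 6.6)
lsd = t05 * (2 * s2 / b) ** 0.5
print(round(lsd, 2))                                   # 3.12
```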
11.4—Algebraic notation. For the results of a two-way classification, table 11.4.1 gives an algebraic notation that has become standard in mathematical statistics. X_ij represents the measurement obtained for the unit that is in the ith row (treatment) and jth column (replication). Row totals and means are denoted by X_i. and X̄_i., respectively, while X_.j and X̄_.j denote column totals and means. The overall mean is X̄... General instructions for computing the analysis of variance appear under the
TABLE 11.4.1
ALGEBRAIC REPRESENTATION OF A 2-WAY TABLE WITH a TREATMENTS AND b REPLICATIONS
(Computing instructions and analysis of variance)

Treatments          Replications, j = 1 … b
i = 1 … a       1      2      …      j      …      b       Sum     Mean
1              X_11   X_12    …    X_1j     …    X_1b      X_1.    X̄_1.
2              X_21   X_22    …    X_2j     …    X_2b      X_2.    X̄_2.
⋮
i              X_i1   X_i2    …    X_ij     …    X_ib      X_i.    X̄_i.
⋮
a              X_a1   X_a2    …    X_aj     …    X_ab      X_a.    X̄_a.
Sum            X_.1   X_.2    …    X_.j     …    X_.b      X..
Mean           X̄_.1   X̄_.2    …    X̄_.j     …    X̄_.b              X̄..
TABLE 11.4.1 (Continued)

Correction:     C = X..²/ab
Total:          ΣΣX_ij² − C
Treatments:     A = (X_1.² + ⋯ + X_a.²)/b − C
Replications:   B = (X_.1² + ⋯ + X_.b²)/a − C
Residuals:      D = Total − (Treatments + Replications)

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
Treatments                  a − 1                   A            A/(a − 1)
Replications                b − 1                   B            B/(b − 1)
Residuals              (a − 1)(b − 1)               D            D/(a − 1)(b − 1)
Total                      ab − 1               A + B + D

table. Note thatthe number of dj. for Residuals (Error) is (a - I)(b - I),
the product of the numbers of dJ. for rows and columns.
In this. book we have kept algebraic symbolism to a minimum, in
order to concentrate attention on the data. The symbols are useful, how-
ever, in studying the structure of the two-way classification in the next
section.

11.5—Mathematical model for a two-way classification. The model being used is

X_ij = μ + α_i + β_j + ε_ij,  i = 1 … a, j = 1 … b,

where μ represents the overall mean, the α_i stand for fixed row (treatment) effects and the β_j for fixed column (replication) effects. The convention

Σα_i = Σβ_j = 0

is usually adopted.
This model involves two basic assumptions:
1. The mathematical form (μ + α_i + β_j) implies that row and column effects are additive. Apart from experimental errors, the difference in effect between treatment 2 and treatment 1 in replication j is

(μ + α_2 + β_j) − (μ + α_1 + β_j) = α_2 − α_1

This difference is the same in all replications. When we analyze real data, there is no assurance that row and column effects are exactly additive. The additive model is used because of its simplicity and because it is often a good approximation to more complex types of relationships.
2. The ε_ij are independent random variables, normally distributed with mean 0 and variance σ². They represent the extent to which the data depart from the additive model because of experimental errors.
As an aid to understanding the model we shall construct a set of data by its use. Let

μ = 30
α_1 = 10, α_2 = 3, α_3 = 0, α_4 = −13;  Σα_i = 0
β_1 = 1, β_2 = −4, β_3 = 3;  Σβ_j = 0
The ε_ij are drawn at random from table 3.2.1, each decreased by 30. This makes the ε_ij approximately normal with mean 0 and variance 25.
In each cell of table 11.5.1, μ = 30 is entered first. Next is the treatment effect α_i, differing from row to row. Following this is the replication effect β_j, one in each column. In each cell, the sum of these three parts is

TABLE 11.5.1
EXPERIMENT CONSTRUCTED ACCORDING TO MODEL I.  μ = 30

                           Replication
Treatment        β_1 = 1      β_2 = −4      β_3 = 3      X_i.    X̄_i.
α_1 = 10           30            30            30
                   10            10            10
                    1            −4             3
                  −11            −7             3
              X_11 = 30     X_12 = 29     X_13 = 46      105      35
α_2 = 3            30            30            30
                    3             3             3
                    1            −4             3
                    1             5            −3
              X_21 = 35     X_22 = 34     X_23 = 33      102      34
α_3 = 0            30            30            30
                    0             0             0
                    1            −4             3
                    0             4            −1
              X_31 = 31     X_32 = 30     X_33 = 32       93      31
α_4 = −13          30            30            30
                  −13           −13           −13
                    1            −4             3
                   −2            −2             1
              X_41 = 16     X_42 = 11     X_43 = 21       48      16
X_.j              112           104           132        348
X̄_.j               28            26            33                  29

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
Replications                   2                   104              52
Treatments                     3                   702             234
Residuals                      6                   132              22
fixed by μ, the α_i, and the β_j. Sampling variation is introduced by the fourth entry, a deviation drawn at random from table 3.2.1. According to the model, X_ij is the sum of the four entries just described.
Some features of the model are now apparent:
(i) The effects of the treatments are not influenced by the β_j because the sum of the β_j in each row is zero. If there were no errors, check from table 11.5.1 that the sum for treatment 1 would be 41 + 36 + 43 = 120, the mean being 40 = μ + α_1. The observed mean, X̄_1. = 35, differs from 40 by the mean of the ε_1j, namely (−11 − 7 + 3)/3 = −5. This is an instance of the general result

X̄_i. = μ + α_i + (ε_i1 + ε_i2 + ⋯ + ε_ib)/b

This result shows that X̄_i. is an unbiased estimate of μ + α_i and that its variance is σ²/b, because the error of the estimate is the mean of b independent errors, each with variance σ².
(ii) In the same way, the replication means are unbiased estimates of μ + β_j, with variance σ²/a.
(iii) In the analysis of variance the Residuals mean square, 22, is an unbiased estimate of σ² = 25. More explanation on this point will be given presently.
(iv) The mean square for Replications is inflated by the β_j and that for Treatments by the α_i. The expected values of these mean squares are shown in table 11.5.2, which deserves careful study. Note that the expected value of the Treatments mean square is the same as in a one-way classification with b observations in each class (compare with equation 10.4.1).

TABLE 11.5.2
COMPONENT ANALYSIS OF THE CONSTRUCTED EXPERIMENT

Source of Variation    Degrees of Freedom    Mean Square    Expected Value (Parameters Estimated)
Replications                   2                  52           σ² + aκ_β²
Treatments                     3                 234           σ² + bκ_α²
Residuals                      6                  22           σ²

κ_β² = Σβ_j²/(b − 1) = [(1)² + (−4)² + (3)²]/2 = 13
κ_α² = Σα_i²/(a − 1) = [(10)² + (3)² + (0)² + (−13)²]/3 = 92.7
κ̂_β² = (52 − 22)/4 = 7.5 estimates 13;  κ̂_α² = (234 − 22)/3 = 70.7 estimates 92.7
Error Mean Square = 22 estimates 25
Replications Mean Square = 52 estimates 25 + 4(13) = 77
Treatments Mean Square = 234 estimates 25 + 3(92.7) = 303
We turn to the estimates of μ, α_i, and β_j. These estimates are

μ̂ = X̄..;   α̂_i = X̄_i. − X̄..;   β̂_j = X̄_.j − X̄..

If we estimate any individual observation X_ij from the fitted model, the estimate is

ĝ_ij = μ̂ + α̂_i + β̂_j = X̄.. + (X̄_i. − X̄..) + (X̄_.j − X̄..) = X̄_i. + X̄_.j − X̄..

Table 11.5.3 shows the original observations X_ij, the estimates ĝ_ij, and the deviations of the observations from the estimates, D_ij = X_ij − ĝ_ij. For treatment 1 in replication 2, for instance, we have from table 11.5.1,

X_12 = 29,  ĝ_12 = 35 + 26 − 29 = 32,  D_12 = −3

TABLE 11.5.3
LINEAR MODEL FITTED TO THE OBSERVATIONS IN TABLE 11.5.1

                          Replication
Treatment           1       2       3
1         X_1j     30      29      46
          ĝ_1j     34      32      39
          D_1j     −4      −3      +7
2         X_2j     35      34      33
          ĝ_2j     33      31      38
          D_2j     +2      +3      −5
3         X_3j     31      30      32
          ĝ_3j     30      28      35
          D_3j     +1      +2      −3
4         X_4j     16      11      21
          ĝ_4j     15      13      20
          D_4j     +1      −2      +1

The deviations D_ij have three important properties:
(i) Their sum is zero in any row or column.
(ii) Their sum of squares,

(−4)² + (+2)² + ⋯ + (−3)² + (+1)² = 132,

is equal to the Residuals sum of squares in the analysis of variance at the foot of table 11.5.1. Thus the Residuals sum of squares measures the extent to which the linear additive model fails to fit the data. This result is a consequence of a general algebraic identity:
$$\text{Residuals S.S.} = \sum_i\sum_j (X_{ij} - \bar X_{i.} - \bar X_{.j} + \bar X_{..})^2 = \sum_i\sum_j (X_{ij} - \bar X_{..})^2 - b\sum_i (\bar X_{i.} - \bar X_{..})^2 - a\sum_j (\bar X_{.j} - \bar X_{..})^2$$
$$= \text{Total S.S.} - \text{Treatments S.S.} - \text{Replications S.S.}$$

This equation shows that the analysis of variance is a quick method of finding the sum of squares of the deviations of the observations from the fitted model. When the analysis is programmed for an electronic computer, it is customary to compute and print the D_ij. This serves two purposes. It enables the investigator to glance over the D_ij for signs of gross errors or systematic departures from the linear model, and it provides a check on the Residuals sum of squares.
(iii) From the constructed model you may verify the remarkable result that

D_ij = ε_ij − ε̄_i. − ε̄_.j + ε̄..

For example, for treatment 1 in replication 2 you will find from table 11.5.1,

ε_12 = −7;  ε̄_1. = −5;  ε̄_.2 = 0;  ε̄.. = −1
ε_12 − ε̄_1. − ε̄_.2 + ε̄.. = (−7) − (−5) − (0) + (−1) = −3,

in agreement with D_12 = −3 in table 11.5.3. Thus, if the additive model holds, each D_ij is a linear combination of the random errors. It may be shown that any D_ij² is an unbiased estimate of (a − 1)(b − 1)σ²/ab. It follows that the Residuals sum of squares is an unbiased estimate of (a − 1)(b − 1)σ². This gives the basic result that the Residuals mean square, with (a − 1)(b − 1) d.f., is an unbiased estimate of σ².
To summarize the salient features: the additive model implies that the treatment effects α_i are the same in every replication, and vice versa. If additivity holds (apart from independent errors), the observed treatment means are unbiased estimates of the treatment effects. The F-test may be applied both to Treatments and Replications. The Residuals mean square measures the extent to which the additive model fails to fit the data and provides an unbiased estimate of σ².
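The fitting itself is mechanical. The sketch below (ours) fits the additive model to the constructed data of table 11.5.1 and recovers the residuals D_ij of table 11.5.3, checking that their sum of squares is 132.

```python
# Fitted values g_ij = row mean + column mean - grand mean, and residuals.
X = [[30, 29, 46], [35, 34, 33], [31, 30, 32], [16, 11, 21]]
a, b = len(X), len(X[0])
grand = sum(map(sum, X)) / (a * b)       # 29
row = [sum(r) / b for r in X]            # 35, 34, 31, 16
col = [sum(c) / a for c in zip(*X)]      # 28, 26, 33
G = [[row[i] + col[j] - grand for j in range(b)] for i in range(a)]
D = [[X[i][j] - G[i][j] for j in range(b)] for i in range(a)]
print(D)                                 # each row and column sums to zero
print(sum(d * d for r in D for d in r))  # 132, the Residuals S.S.
```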
EXAMPLE 11.5.1—Suppose that with a = b = 2, treatment and replication effects are multiplicative. Treatment 2 gives results 20% higher than treatment 1, and replication 2 gives results 10% higher than replication 1. With no random errors, the observations would be as shown on the left below.

            X_ij                              ĝ_ij
          Replication                       Replication
Treatment     1       2        Treatment       1         2
1           1.00    1.10       1             0.995     1.105
2           1.20    1.32       2             1.205     1.315

Verify that the ĝ_ij given by fitting the linear model are as shown on the right above. Any D_ij is only ±0.005. The linear model gives a good fit to a multiplicative model when treatment and replication effects are small or moderate. If, however, treatment 2 gives a 100% increase and replication 2 a 50% increase, you will find D_ij = ±0.125, not so good a fit.
EXAMPLE 11.5.2—In table 11.5.3, verify that ĝ_33 = 35, D_33 = −3.
EXAMPLE 11.5.3—Perform an analysis of variance of the ĝ_ij in table 11.5.3. Verify that the Treatments and Replications sums of squares are the same as for the X_ij, but that the Residuals sum of squares is zero. Can you explain these results?
EXAMPLE 11.5.4—Calculate the D_ij for the 3 × 3 citrus data in example 11.2.1 and verify that the Residuals mean square, computed from the D_ij, is 21.8. Carry one decimal place in the D_ij.
EXAMPLE 11.5.5—The result

D_ij = ε_ij − ε̄_i. − ε̄_.j + ε̄..

shows that D_ij is a linear combination of the form ΣΣλ_ij ε_ij. By Rule 10.7.1, its variance is σ²ΣΣλ_ij². For D_11, for example, the λ_ij work out as follows:

Observations           No. of Terms         λ_ij
ε_11                        1           (a − 1)(b − 1)/ab
Rest of row 1           (b − 1)         −(a − 1)/ab
Rest of column 1        (a − 1)         −(b − 1)/ab
Rest of the ε_ij    (a − 1)(b − 1)      +1/ab

It follows that ΣΣλ_ij² = (a − 1)(b − 1)/ab. Thus D_11², and similarly any D_ij², estimates (a − 1)(b − 1)σ²/ab, as stated in the text.

11.6—Partitioning the treatments sum of squares. When the treatments contain certain planned comparisons, it is often possible to partition the Treatments sum of squares in the analysis of variance in a way that is helpful. Some rules for doing this will now be given. In the analysis of variance, comparisons are usually calculated from the treatment totals T_i rather than the means, since this saves time and avoids rounding errors.
Rule 11.6.1—If L = λ_1T_1 + ⋯ + λ_aT_a (Σλ_i = 0) is a comparison among the treatment totals, then

L²/nΣλ_i²

is a part of the sum of squares for treatments, associated with a single degree of freedom, where n is the number of observations in any treatment total.
In the experiment on seed treatment of soybeans (table 11.2.1) the comparison Check vs. Chemicals may be represented as follows:

               Check    Arasan    Spergon    Semesan, Jr.    Fermate
Total (T_i)      54       31         41           33            29
λ_i               4       −1         −1           −1            −1

To avoid fractions the λ_i have been taken as 4, −1, −1, −1, −1 instead of as 1, −1/4, −1/4, −1/4, −1/4 as in section 11.3. This gives

L = 4(54) − 31 − 41 − 33 − 29 = 82

Since n = 5, the contribution to the Treatments sum of squares is

L²/nΣλ² = (82)²/(5)(20) = 67.24  (1 d.f.)

The Treatments sum of squares was 83.84 with 4 d.f. The remaining part is therefore 16.60 with 3 d.f. What does it represent? As might be guessed, it represents the sum of squares of deviations of the totals for the four Chemicals from their mean, namely,

(31² + 41² + 33² + 29²)/5 − 134²/20 = 16.60

Thus, the original analysis of variance in table 11.2.1 might be reported as follows:

Source of Variation     Degrees of Freedom    Sum of Squares    Mean Square
Check vs. Chemicals              1                 67.24           67.24
Among Chemicals                  3                 16.60            5.53
Residuals (Error)               16                 86.56            5.41

The F ratio 67.24/5.41 = 12.43 (P < 0.01) shows that the average failure rates are different for Check and Chemicals (though as usual it does not tell us the size and direction of the effect in terms of means). The F ratio 5.53/5.41 = 1.02 for Among Chemicals warns us that there are unlikely to be any significant differences among Chemicals, as was already verified.
As a second example, consider the data on the effect of shade on the ratio of leaf area to leaf weight in citrus trees (example 11.2.1). The "treatment" totals, n = 3, were as follows:

                                Sun    Half Shade    Shade    Comparison L    Divisor nΣλ²    S.S. = L²/nΣλ²
Totals T_i                      325       248         223
Effect of shade, λ_1i            +1         0          −1          102              6              1734
Half shade vs. rest, λ_2i        +1        −2          +1           52             18               150

We might measure the effect of shade by the extreme comparison L_1 = (Sun − Shade). We might also be interested in whether the results for Half Shade are the simple average of those for Sun and Shade. This gives the comparison L_2.
Rule 1l.6.2~Two comparisons:
Ll = A.lIT} + A12T2 + ... + A1QT,. = 1:1 1;7;.
L2'= )'HTI + ).22T2 + ... + A2.T. ~ LA1i 1j.
are orthogonal if.
A. 11 )'21 + A12).j2 + ... + ).10).211 = 0 .: i.e. l:).1jA,21 = 0
In applying this rule, if a total T_i does not enter into a comparison, its
coefficient is taken as zero.
The comparisons L_1 and L_2 are orthogonal, since
(+1)(+1) + (0)(−2) + (−1)(+1) = 0
Rule 11.6.3-If two comparisons are orthogonal, their contributions
L_1²/nΣλ_1i² and L_2²/nΣλ_2i² are independent parts of the sum of squares
for treatments, each with 1 d.f.
This means that the Treatments S.S. may be partitioned into the
contributions due to L_1 and L_2, plus any remainder (with (a − 3) d.f.).
A consequence of this rule is
Rule 11.6.4-Among a treatments, if (a − 1) comparisons are mutually orthogonal (i.e., every pair is orthogonal), then
L_1²/nΣλ_1i² + L_2²/nΣλ_2i² + ... + L_(a−1)²/nΣλ_(a−1)i² = Treatments S.S.

The citrus data, with n = 3, are an example. The sum of the squared contributions for L_1 and L_2 is 1734 + 150 = 1,884, which may be verified to be
the Treatments S.S. Thus, the relevant part of the analysis of variance
can be presented as follows:

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square      F
    Effect of shade                1                  1,734            1,734       79.5
    Half shade vs. Rest            1                    150              150        6.9
    Error                          4                     87             21.8

The F value for the effect of shade is highly significant. With 1 and 4 d.f.,
F = 6.9 for the comparison of half shade with the average of sun and
shade does not quite reach the 5% level. There is a suggestion, however,
that the results for half shade are closer to those for shade than to those
for sun. Both these comparisons can, of course, be examined by t-tests
on the treatment means.
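Rules 11.6.2-11.6.4 can also be checked mechanically for the citrus totals. The sketch below (ours; Python with numpy assumed) verifies orthogonality and recovers the Treatments S.S. to rounding:

    import numpy as np

    totals = np.array([325, 248, 223])          # Sun, Half Shade, Shade
    n = 3
    L1 = np.array([1, 0, -1])                   # effect of shade
    L2 = np.array([1, -2, 1])                   # half shade vs. rest

    print(L1 @ L2)                              # 0: the comparisons are orthogonal
    ss1 = (L1 @ totals)**2 / (n * (L1 @ L1))    # 1734
    ss2 = (L2 @ totals)**2 / (n * (L2 @ L2))    # 150
    ss_treat = (totals @ totals)/n - totals.sum()**2/9
    print(ss1, ss2, ss1 + ss2, ss_treat)        # 1734 + 150 = 1884 = Treatments S.S.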
EXAMPLE 11.6.1-In the following artificial example, two of the treatments were
variants of one type of process, while the other four were variants of a second type. The
treatment totals (4 replications) were:

    Process 1: 59, 68        Process 2: 70, 84, 76, 81

Partition the Treatments S.S. as follows:

    Source of Variation       Degrees of Freedom    Sum of Squares    Mean Square
    Between processes                 1                  67.69           67.69
    Variants of process 1             1                  10.12           10.12
    Variants of process 2             3                  28.19            9.40
11.7-Efficiency of blocking. When an experiment has been set out in
replications, using the randomized blocks design, it is sometimes of interest to know how effective the blocking was in increasing the precision
of the comparisons, particularly if there is doubt whether the criterion
used in constructing the replications is a good one, or if the use of these
replications is troublesome. From the analysis of variance of a randomized blocks experiment, we can estimate the error variance that would
have been obtained if a completely random arrangement of the same experimental units (plots) had been used instead of randomized blocks.
Call the two error variances s_CR² and s_RB². With randomized blocks
the variance of a treatment mean is s_RB²/b. To get the same variance of a
treatment mean with complete randomization, the number of replications
n must satisfy the relation
s_CR²/n = s_RB²/b,  or  n/b = s_CR²/s_RB²
For this reason the ratio s_CR²/s_RB² is used to measure the relative efficiency
of the blocking.
If M_B and M_E are the mean squares for blocks and error in the analysis
of variance of the randomized blocks experiment that has been performed, it
has been shown (3, 4) that
s_CR²/s_RB² = [(b − 1)M_B + b(a − 1)M_E]/(ab − 1)M_E

Using the soybeans experiment as an example (table 11.2.1),
M_B = 12.46, M_E = 5.41, a = b = 5,
s_CR²/s_RB² = [4(12.46) + 20(5.41)]/24(5.41) = 1.22

With complete randomization, about six replications instead of five would
have been necessary to obtain the same standard error of a treatment
mean.
This comparison is not quite fair to complete randomization, which
would provide 20 d.f. for error as against 16 with randomized blocks and
therefore require smaller values of t in calculating confidence intervals.
This is taken into account by a formula suggested by Fisher (5), which
replaces the ratio s_CR²/s_RB² by the following ratio:
Relative amount of information = [(f_RB + 1)(f_CR + 3)/(f_RB + 3)(f_CR + 1)](s_CR²/s_RB²)
= [(16 + 1)(20 + 3)/(16 + 3)(20 + 1)](1.22) = 1.20
The adjustment for d.f. has little effect here but makes more difference in
small experiments.
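Both formulas of this section are easily scripted; the sketch below (ours, plain Python, not part of the original computation) reproduces the soybean figures:

    # Relative efficiency of randomized blocks vs. complete randomization,
    # using the soybean mean squares from table 11.2.1.
    M_B, M_E = 12.46, 5.41          # Blocks and Error mean squares
    a, b = 5, 5                     # treatments, blocks

    ratio = ((b - 1)*M_B + b*(a - 1)*M_E) / ((a*b - 1)*M_E)
    f_RB = (a - 1)*(b - 1)          # error d.f. with randomized blocks: 16
    f_CR = a*b - a                  # error d.f. with complete randomization: 20
    fisher = (f_RB + 1)*(f_CR + 3)*ratio / ((f_RB + 3)*(f_CR + 1))
    print(round(ratio, 2), round(fisher, 2))    # 1.22 and 1.20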
EXAMPLE 11.7.1-In a randomized blocks experiment which compared four strains
of Gallipoli wheat (6) the mean yields (pounds per plot) and the analysis of variance were as
follows:

    Strain          A      B      C      D
    Mean yield    34.4   34.8   33.7   28.4

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Blocks                         4                  21.46            5.36
    Strains                        3                 134.45           44.82
    Error                         12                  26.26            2.19

(i) How many replications were there? (ii) Estimate s_CR²/s_RB². (iii) Estimate the relative
amount of information by Fisher's formula. Ans. (ii) 1.30, (iii) 1.26.
EXAMPLE 11.7.2-In example 11.7.1, verify that the LSD and the Q methods both
show D inferior to the other strains, but reveal no differences among the other strains.

11.8-Latin squares. In agricultural field experiments, there is frequently a gradient in fertility running parallel to one of the sides of the
field. Sometimes, gradients run parallel to both sides and sometimes, in a
new field, it is not known in which direction the predominant gradient
may run. A useful plan for such situations is the Latin square. With four
treatments, A, B, C, D, it may be like this:
A B C D
C A D B
D C B A
B D A C
The rows and columns of the square are parallel to the two sides of the
field. Each treatment appears once in every row and once in every column,
this being the basic property of a Latin square. Differences in fertility
between rows and differences between columns are both eliminated from
the comparison of the treatment means, with a resultant increase in the
precision of the experiment.
In numerous other situations the Latin square is also effective in
controlling two sources of variation of which the investigator has predic-
tive knowledge. In psychology and medicine, the human subject fre-
quently comprises a replication of the experiment, receiving all the treat-
ments in succession, with intervening intervals in which the effects of pre-
vious treatment will have died away. However, a systematic effect of the
order in which the treatments are given can often be detected. This is
controlled by making the columns of the square represent the order,
while rows represent subjects. In animal nutrition, the effects of both
litter and condition of the animal may be removed from the estimates of
treatment means by the use of a Latin square.
To construct a Latin square, write down a systematic arrangement
of the letters and rearrange rows and columns at random. Then assign
treatments at random to the letters. For refinements, see (7).
The model for a Latin square experiment (model I) is
X_ijk = μ + α_i + β_j + γ_k + ε_ijk;  i, j, and k = 1 ... a;  ε_ijk ~ N(0, σ)
where α, β, and γ indicate treatment, row, and column effects, with the
usual convention that their sums are zero. The assumption of additivity
is carried a step further than with a two-way classification, since we assume
the effects of all three factors to be additive.
It follows from the model that a treatment mean X̄_i.. is an unbiased
estimate of μ + α_i, the effects of rows and columns canceling out because
of the symmetry of the design. The standard error of X̄_i.. is σ/√a. The
estimate X̂_ijk of the observation X_ijk made from the fitted linear model is

X̂_ijk = X̄... + (X̄_i.. − X̄...) + (X̄_.j. − X̄...) + (X̄_..k − X̄...)

Hence, the deviation from the fitted model is
D_ijk = X_ijk − X̂_ijk = X_ijk − X̄_i.. − X̄_.j. − X̄_..k + 2X̄...
As in the two-way classification, the error sum of squares in the
analysis of variance is the sum of the D_ijk² and the Error mean square is
an unbiased estimate of σ².
Table 11.8.1 shows the field layout and yields of a 5 x 5 Latin square
experiment on the effects of spacing on yields of millet plants (8). In the
computations, the sums for rows and columns are supplemented by sums
TABLE 11.8.1
YIELDS (GRAMS) OF PLOTS OF MILLET ARRANGED IN A LATIN SQUARE
(Spacings: A, 2-inch; B, 4; C, 6; D, 8; E, 10)

                           Column
    Row      1        2        3        4        5       Sum
    1     B: 257   E: 230   A: 279   C: 287   D: 202    1,255
    2     D: 245   A: 283   E: 245   B: 280   C: 260    1,313
    3     E: 182   B: 252   C: 280   D: 246   A: 250    1,210
    4     A: 203   C: 204   D: 227   E: 193   B: 259    1,086
    5     C: 231   D: 271   B: 266   A: 334   E: 338    1,440

    Sum    1,118    1,240    1,297    1,340    1,309     6,304

Summary by Spacing

            A: 2"    B: 4"    C: 6"    D: 8"    E: 10"
    Sum     1,349    1,314    1,262    1,191    1,188     6,304
    Mean    269.8    262.8    252.4    238.2    237.6     252.2
314 Chap,.,. II: Two-Way CI_iIIcalion.
TABLE 11.8.1 (Continued)
Correction: (6,304)²/25 = 1,589,617
Total: (257)² + ... + (338)² − 1,589,617 = 36,571
Rows: [(1,255)² + ... + (1,440)²]/5 − 1,589,617 = 13,601
Columns: [(1,118)² + ... + (1,309)²]/5 − 1,589,617 = 6,146
Spacings: [(1,349)² + ... + (1,188)²]/5 − 1,589,617 = 4,156
Error: 12,668

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Total                         24                 36,571
    Rows                           4                 13,601            3,400
    Columns                        4                  6,146            1,536
    Spacings                       4                  4,156            1,039
    Error                         12                 12,668            1,056

and means for treatments (spacings). By the usual rules, sums of squares
for Rows, Columns, and Spacings are calculated. These are subtracted
from the Total S.S. to give the Error S.S. with (a − 1)(a − 2) = 12 d.f.
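The whole computation of table 11.8.1 can be checked with a short script (ours; Python with numpy assumed). The treatment letters A-E are coded 0-4:

    import numpy as np

    X = np.array([[257, 230, 279, 287, 202],
                  [245, 283, 245, 280, 260],
                  [182, 252, 280, 246, 250],
                  [203, 204, 227, 193, 259],
                  [231, 271, 266, 334, 338]], float)
    T = np.array([[1, 4, 0, 2, 3],      # B E A C D
                  [3, 0, 4, 1, 2],      # D A E B C
                  [4, 1, 2, 3, 0],      # E B C D A
                  [0, 2, 3, 4, 1],      # A C D E B
                  [2, 3, 1, 0, 4]])     # C D B A E
    a = 5
    corr = X.sum()**2 / a**2                                    # 1,589,617
    total = (X**2).sum() - corr                                 # 36,571
    rows  = (X.sum(axis=1)**2).sum()/a - corr                   # 13,601
    cols  = (X.sum(axis=0)**2).sum()/a - corr                   # 6,146
    spac  = sum(X[T == k].sum()**2 for k in range(a))/a - corr  # 4,156
    print(total, rows, cols, spac, total - rows - cols - spac)  # error 12,668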
Table 11.8.2 shows the expected values of the mean squares, with the
usual notation. For illustration we have presented the results that apply
if the β_j and γ_k in rows and columns represent random effects, with fixed
treatment effects α_i.
TABLE 11.8.2
COMPONENT ANALYSIS IN LATIN SQUARE

    Source of Variation    Degrees of Freedom    Mean Square Estimates of
    Rows, R                     a − 1             σ² + aσ_R²
    Columns, C                  a − 1             σ² + aσ_C²
    Treatments, A               a − 1             σ² + aκ_α²
    Error                    (a − 1)(a − 2)       σ²
This experiment is typical of many in which the treatments consist of
a series of levels of a variable, in this case width of spacing. The objective
is to determine the relation between the treatment mean yields, which we
will now denote by Y_i., and width of spacing X_i. Inspection of the mean
yields suggests that the relation may be linear, the yield decreasing steadily
as spacing increases. The X_i, x_i, and Y_i. are shown in table 11.8.3.
TABLE 11.8.3
DATA FOR CALCULATING THE REGRESSION OF YIELD ON SPACING

    Spacing, X_i       2"      4"      6"      8"      10"
    x_i = X_i − X̄      −4      −2       0       2        4
    Y_i. (gms.)      269.8   262.8   252.4   238.2    237.6
The regression coefficient of yield on spacing is
b = Σ(X_i − X̄)(Y_i. − Ȳ..)/Σ(X_i − X̄)² = Σx_iY_i./Σx_i² = −178.0/40 = −4.45,
the units being grams per inch increase in spacing. Notice that b is a
comparison among the treatment means, with λ_i = x_i/Σx_i². From Rule
10.7.1, the standard error of b is
s_b = √(s²Σλ_i²/a) = √(s²/aΣx_i²) = √{(1056)/(5)(40)} = 2.298.
With 12 d.f., 95% confidence limits for the population regression are
+0.6 and −9.5 grams per inch increase. The linear decrease in yield is
not quite significant, since the limits include 0.
In the analysis of variance, the Treatments S.S. can be partitioned
into a part representing the linear regression on width of spacing and a
part representing the deviations of the treatment means from the linear
regression. This partition provides new information. If the true regression of the means on width of spacing is linear, the Deviations mean square
should be an estimate of σ². If the true regression is curved, the Deviations mean square is inflated by the failure of the fitted straight line to
represent the curved relationship. Consequently, F = Deviations M.S./
Error M.S. tests whether the straight line is an adequate fit.
The sum of squares for Regression (1 d.f.) can be computed by the
methods on regression given in chapter 6. In section 6.15 (p. 162) this
sum of squares was presented as (Σxy)²/Σx² (table 6.15.3). In this example we have already found Σxy = Σx_iY_i. = −178.0, and Σx² = 40, giving
(Σxy)²/Σx² = (178.0)²/40 = 792.1. Since, however, each Y_i. is the mean
of five observations, we multiply by 5 when entering this term in the
analysis of variance, giving 3,960. The S.S. for Deviations from the regression is found by subtracting 3,960 from the total S.S. for Spacings,
4,156 (table 11.8.4).
TABLE 11.8.4
ANALYSIS OF REGRESSION OF SPACING MEANS ON WIDTH OF SPACING
(Millet experiment)

    Source of Variation        Degrees of Freedom    Sum of Squares    Mean Square      F
    Spacings (table 11.8.1)            4                  4,156
      Regression                       1                  3,960            3,960       3.75
      Deviations                       3                    196               65       0.06
    Error (table 11.8.1)              12                 12,668            1,056

The F-ratio for Deviations is very small, 0.06, giving no indication
that the regression is curved. The F for Regression, 3.75, is not quite
significant, this test being the same as the t-test for b.
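The regression partition can be verified directly from the spacing means with a sketch of ours (Python with numpy assumed; small rounding differences aside, it reproduces table 11.8.4):

    import numpy as np

    W = np.array([2, 4, 6, 8, 10], float)               # spacings, inches
    Y = np.array([269.8, 262.8, 252.4, 238.2, 237.6])   # treatment means
    a, s2 = 5, 1056.0                                   # obs. per mean, Error M.S.

    x = W - W.mean()
    b = (x @ Y) / (x @ x)                 # -178/40 = -4.45 grams per inch
    s_b = (s2 / (a * (x @ x)))**0.5       # 2.298
    ss_reg = a * (x @ Y)**2 / (x @ x)     # about 3,960 with 1 d.f.
    ss_dev = 4156 - ss_reg                # about 196 with 3 d.f.
    print(b, s_b, ss_reg/s2, (ss_dev/3)/s2)   # F = 3.75 and 0.06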
The results of this experiment are probably disappointing. In trying
to discover the best width of spacing, an investigator hopes to obtain a
curved regression, with reduced yields at the narrowest and widest spacings, so that his range of spacings straddles the optimum. As it is, assuming the linear regression real, the best spacing may lie below 2 in. Methods
of dealing with curved regressions in the analysis of variance are given in
chapter 12.
Since the number of replications in the Latin square is equal to the
number of treatments, the experimenter is ordinarily limited to eight or ten
treatments if he uses this design. For four or less treatments, the degrees
of freedom for error are fewer than desirable, (a − 1)(a − 2) = (3)(2) = 6
for the 4 × 4. This difficulty can be remedied by replicating the squares.
for the 4 x 4. This difficulty can he remedied by replicating the squares.
The relative efficiency of a Latin square experiment as compared to
complete randomization is
[M_R + M_C + (a − 1)M_E]/(a + 1)M_E
Substituting the millet data:
Relative Efficiency = s_CR²/s_LS² = [3400 + 1536 + (5 − 1)(1056)]/[(5 + 1)(1056)] = 145%,
a gain of 45% over complete randomization.
There may be some interest in knowing the relative efficiency as compared to a randomized blocks experiment in which either rows or columns
were lacking. In the millet experiment, since the column mean square was
small (this may have been an accident of sampling), it might have been
omitted and the rows retained as blocks. The relative efficiency of the
Latin square is
[M_C + (a − 1)M_E]/aM_E = [1536 + (5 − 1)(1056)]/[(5)(1056)] = 109%
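Both efficiency figures follow at once from the mean squares (our sketch, plain Python, not part of the original text):

    M_R, M_C, M_E, a = 3400.0, 1536.0, 1056.0, 5
    print((M_R + M_C + (a - 1)*M_E) / ((a + 1)*M_E))   # 1.45: vs. complete randomization
    print((M_C + (a - 1)*M_E) / (a*M_E))               # 1.09: vs. rows kept as blocks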

Kempthorne (4) reminds us that this may not be a realistic comparison. For the blocks experiment the shape of the plots would presumably have been changed, improving the efficiency of that experiment. In
this millet experiment, appropriately shaped plots in randomized blocks
might well have compensated for the column control.
EXAMPLE 11.8.1-Here is a Latin square for easy computation. Treatments are
indicated by A, B, and C.

              Columns
    Rows     1        2        3
    1      B: 23    C: 29    A: 17
    2      A: 16    B: 16    C: 25
    3      C: 24    A: 12    B: 18

The mean squares are: rows, 21; columns, 3; treatments, 93; remainder, 3.
EXAMPLE 11.8.2-Fit the linear model for Latin squares to the data of example
11.8.1. Verify the fitting by the relation ΣD_ijk² = 6.
EXAMPLE 11.8.3-In experiments affecting the milk yield of dairy cows, the great
variation among individuals requires large numbers of animals for evaluating moderate differences. Efforts to apply several treatments successively to the same cow are complicated
by the decreasing milk flow, by the shapes of the lactation curves, by carry-over effects, and
by presumed correlation among the errors, ε_ijk. The effort was made to control these difficulties by the use of several pairs of orthogonal Latin squares (9), the columns representing
cows, the rows successive periods during lactation, the treatments being A = roughage,
B = limited grain, C = full grain.
For this example, a single square is presented, no effort being made to deal with carry-over effects. The entries are pounds of milk for a 6-week period. Compute the analysis of
variance.

                     Cow
    Period      1         2          3
    I        A: 608    B: 885     C: 940
    II       B: 715    C: 1,087   A: 766
    III      C: 844    A: 711     B: 832

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Periods                        2                  5,900            2,950
    Cows                           2                 47,214           23,607
    Treatments                     2                103,436           51,718
    Error                          2                  4,843            2,422

11.9-Missing data. Accidents often result in the loss of data. Crops
may be destroyed, animals die, or errors are made in the application of the
treatments or in recording. Although the least squares procedure can be
applied to the data that are present, missing items destroy the symmetry
and simplicity of the analysis. The calculational methods that have been
presented cannot be used. Fortunately, the missing data can be estimated
by least squares and entered in the vacant cells of the table. Application
of the usual analysis of variance, with some modifications, then gives
results that are correct enough for practical purposes.
In these methods the missing items must not be due to failure of a
treatment. If a treatment has killed the plants, producing zero yield, this
should be entered as 0, not as a missing value.
In a one-way classification (complete randomization) the effect of
missing values is merely to reduce the sample sizes in the affected classes.
The analysis is handled correctly by the methods for one-way classifications with unequal numbers (section 10.11). No substitution of the missing data is required.
In randomized blocks, a single missing value is estimated by the
formula (26)
X = (aT + bB − S)/(a − 1)(b − 1)
where
a = number of treatments
b = number of blocks
T = sum of items with same treatment as missing item
B = sum of items in same block as missing item
S = sum of all observed items
As an example, table 11.9.1 shows the yields in an experiment on four
strains of Gallipoli wheat, in which we have supposed that the yield for
strain D in block 1 is missing. We have
T = 112.6, B = 96.4, S = 627.1, a = 4, b = 5.
X = [4(112.6) + (5)(96.4) − 627.1]/(3)(4) = 25.4 pounds

TABLE 11.9.1
YIELDS OF FOUR STRAINS OF WHEAT IN FIVE RANDOMIZED BLOCKS
(POUNDS PER PLOT) WITH ONE MISSING VALUE

                           Block
    Strain     1       2       3       4       5      Total
    A        32.3    34.0    34.3    35.0    36.5     172.1
    B        33.3    33.0    36.3    36.8    34.5     173.9
    C        30.8    34.3    35.3    32.3    35.8     168.5
    D          —     26.0    29.8    28.0    28.8     112.6

    Total    96.4   127.3   135.7   132.1   135.6     627.1

Analysis of Variance (With 25.4 Inserted)

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Squares
    Blocks                         4                  35.39
    Strains                        3                 171.36         57.12 (45.79)
    Error                         11                  17.33           1.58

    Total                         18                 224.08

This value is entered in the table as the yield of the missing plot. All
sums of squares in the analysis of variance are then computed as usual.
However, the degrees of freedom in the Total and Error S.S. are both
reduced by 1, since there are actually only 18 d.f. for the Total S.S. and
11 for Error.
This method gives the correct least squares estimates of the treatment
means and of the Error mean square. For the comparison of treatment
means, the s.e. of the difference between the mean with a missing value
and another treatment mean is not √(2s²/b) but the larger quantity

J',[2b +
S
a
bib _ I)(a _ I)
] I
= ~(1.58) [2
5 + (5)(4)(3)
4] = ±O.859,

as against ±O,795 for a pair of treatments with no missing values.


The Treatments (Strains) mean square in the analysis of variance is
slightly inflated. The correction for this upward bias is to subtract from
the mean square
{B − (a − 1)X}²/a(a − 1)² = {96.4 − (3)(25.4)}²/(4)(3)(3) = 11.33
This gives 57.12 − 11.33 = 45.79 for the correct mean square.
This analysis does not in any sense recover the lost information, but
makes the best of what we have.
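A sketch of the missing-plot computation for the wheat example follows (ours; plain Python, not part of the original text). Note that the text rounds X to 25.4 before the bias step, so the exact values differ slightly in the second decimal:

    a, b = 4, 5                       # treatments, blocks
    T, B, S = 112.6, 96.4, 627.1      # treatment, block, and grand sums of observed items
    X = (a*T + b*B - S) / ((a - 1)*(b - 1))       # 25.4 pounds
    bias = (B - (a - 1)*X)**2 / (a*(a - 1)**2)    # about 11.3, subtracted from Strains M.S.
    s2 = 1.58
    se_missing = (s2*(2/b + a/(b*(b - 1)*(a - 1))))**0.5   # 0.859
    se_usual   = (2*s2/b)**0.5                             # 0.795
    print(round(X, 2), round(bias, 2), round(se_missing, 3), round(se_usual, 3))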
For the Latin square the formulas are:
X = [a(R + C + T) − 2S]/(a − 1)(a − 2)
Deduction from Treatments mean square for bias
= [S − R − C − (a − 1)T]²/(a − 1)³(a − 2)
where a is the number of treatments, rows, or columns, and R, C, T, and S are the sums of the observed items in the row, column, treatment, and whole square.
To illustrate, suppose that in example 11.8.3 the milk yield, 608
pounds, for Cow 1 in Period I was missing. Table 11.9.2 gives the resulting data and analysis. The correct Treatments mean square is (40,408).
X = [3(1825 + 1559 + 1477) − 2(6780)]/(2)(1) = 512 pounds
Bias = [6780 − 1825 − 1559 − (2)(1477)]²/(2)(2)(2)(1) = 24,420
TABLE 11.9.2
3 × 3 LATIN SQUARE WITH ONE MISSING VALUE

                     Cow
    Period      1          2         3        Total    Treatments
    I        A:  —      B: 885    C: 940      1,825     A 1,477
    II       B: 715     C: 1,087  A: 766      2,568     B 2,432
    III      C: 844     A: 711    B: 832      2,387     C 2,871

    Total     1,559      2,683     2,538      6,780

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Squares
    Rows (Periods)                 2                  9,847
    Columns (Cows)                 2                 68,185
    Treatments                     2                129,655         64,828 (40,408)
    Error                          1                  2,773          2,773

    Total                          7                210,460
Total 7 210,460
Of course, no worthwhile conclusions are likely to flow from a single
3 × 3 square with a missing value, the Error M.S. having only 1 d.f. The
s.e. of the difference between the treatment mean with the missing value
and any other treatment mean is

√{s²[2/a + 1/(a − 1)(a − 2)]}

Two or more missing data require more complicated methods. But
for a few missing values an iterative scheme may be used for estimation.
To illustrate the iteration, the data in table 11.9.3 seem adequate.
Start by entering a reasonable value for one of the missing data, say
X₂₂ = 10.5. This could be X̄ = 9.3, but both the block and treatment
means are above average, so 10.5 seems better. From the formula, X₃₁ is

X₃₁ = [(3)(27) + (3)(21) − 75.5]/(3 − 1)(3 − 1) = 17.1

Substituting X₃₁ = 17.1 in the table, try for a better estimate of X₂₂ by
using the formula for X₂₂ missing:

X₂₂ = [(3)(23) + (3)(20) − 82.1]/4 = 11.7

With this revised estimate of X₂₂, re-estimate X₃₁:

X₃₁ = [(3)(27) + (3)(21) − 76.7]/4 = 16.8

Finally, with this new value of X₃₁ in the table, calculate X₂₂ = 11.8. One
stops because with X₂₂ = 11.8 no change occurs when X₃₁ is recalculated.
In the analysis of variance, subtract 2 d.f. from the Total and Error
sums of squares. The Treatments S.S. and M.S. are biased upwards. To
obtain the correct Treatments S.S., reanalyze the data in table 11.9.3,
ignoring the treatments and the missing values, as a one-way classification
with unequal numbers, the blocks being the classes. The new Error
(Within blocks) S.S. will be found to be 122.50 with 4 d.f. Subtract from
this the Error S.S. that you obtained in the randomized blocks analysis of
the completed data. This is 6.40, with 2 d.f. The difference, 122.50 − 6.40
= 116.10, with 4 − 2 = 2 d.f., is the correct Treatments S.S. The F ratio
is 58.05/3.20 = 18.1, with 2 and 2 d.f.
The same method applies to a Latin square with two missing values,
with repeated use of the formula for inserting a missing value in a Latin
square. Formulas needed for confidence limits and t-tests involving the
treatment means are given in (3). For experiments analyzed by electronic
computers, a general method of estimating missing values is presented in
(10).
TABLE 11.9.3
RANDOMIZED BLOCKS EXPERIMENT WITH TWO MISSING VALUES

                      Blocks
    Treatments     1      2      3     Sums
    A              6      5      4      15
    B             15     X₂₂     8      23
    C             X₃₁    15     12      27

    Sums          21     20     24      65
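The iteration is conveniently written as repeated use of the single-missing-value formula; the sketch below (ours, plain Python) converges in a few rounds to the values found above:

    def fill(a, b, T, B, S):
        # single missing value in randomized blocks: a treatments, b blocks,
        # T, B, S = treatment, block, and grand sums of the items present
        return (a*T + b*B - S) / ((a - 1)*(b - 1))

    x22 = 10.5                       # starting guess for X22
    for _ in range(5):
        x31 = fill(3, 3, T=27, B=21, S=65 + x22)   # treat X31 as the only gap
        x22 = fill(3, 3, T=23, B=20, S=65 + x31)   # then re-estimate X22
    print(round(x31, 1), round(x22, 1))            # 16.8 and 11.8, as in the text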

11.10-Non-conformity to model. In the standard analyses of variance the model specifies that the effects of the different fixed factors (treatments, rows, columns, etc.) are additive, and that the errors are normally
and independently distributed with the same variance. It is unlikely that
these ideal conditions are ever exactly realized in practice. Much research
has been done to investigate the consequences of various types of failure
in the assumptions; for an excellent review, see (11). Minor failures do
not greatly disturb the conclusions drawn from the standard analysis. In
subsequent sections some advice is given on the detection and handling of
more serious failures. For this discussion the types of failure are classified
into gross errors, lack of independence of errors, unequal error variances
due to the nature of the treatments, non-normality of errors, and non-additivity.
11.11-Gross errors: rejection of extreme observations. A measurement may be read, recorded, or transcribed wrongly, or a mistake may be
made in the way in which the treatment was applied for this measurement.
A major error greatly distorts the mean of the treatment involved, and,
by inflating the error variance, affects conclusions about the other treatments as well. The principal safeguards are vigilance in carrying out the
operating instructions for the experiment and in the measuring and recording process, and eye inspection of the data.
If a figure in the data to be analyzed looks suspicious, an inquiry
about this observation sometimes shows that there was a gross error and
may also reveal the correct value for this observation. (One should check
that the same source of error has not affected other observations also.)
With two-way and Latin square classifications, it is harder to spot an
unusual observation in the original data, because the expected value of
any observation depends on the row, column, and treatment effects. Instead, look at the residuals of the observations from their expected values.
In the two-way classification, the residual D_ij is
D_ij = X_ij − X̄_i. − X̄_.j + X̄..
while in the Latin square,
D_ijk = X_ijk − X̄_i.. − X̄_.j. − X̄_..k + 2X̄...
If no explanation of an extreme residual that enables it to be corrected
is discovered, we may consider rejecting it and analyzing the data by the
method in section 11.9 for results with missing observations. The discussion of rules for the rejection of observations began well over a century
ago in astronomy and geodesy. Most rules have been based on something
like a test of significance. The investigator computes the probability
that a residual as large as the suspect would occur by chance if there is no
gross error (taking account of the fact that the largest residual was
selected). If this probability is sufficiently small, the suspect is rejected.
Anscombe (12) points out that it may be wiser to think of a rejection
rule as analogous to an insurance policy on a house or an automobile.
We pay a premium to protect us against the possibility of damage. In
considering whether a proposed policy is attractive, we take into account
the size of the premium, our guesses as to the probability that damage will
occur, and our estimate of the amount of likely damage if there is a mishap.
A premium is involved in a rejection rule because any rule occasionally rejects an observation that is not a gross error. When this
happens, the mean of the affected treatment is less accurately estimated
than if we had not applied the rule. If these erroneous rejections cause
the variances of the estimated treatment means to be increased by P%,
on the average over repeated applications, the rule is said to have a premium of P%.
Anscombe and Tukey (13) present a rule that rejects an observation
whose residual has the value d if |d| > Cs, where C is a constant to be
determined and s is the S.D. of the experimental errors (square root of the
Error or Residuals mean square). For any small value of P, say 2½%
or 5%, an approximate method of computing C is given (13). This
method applies to the one-way, two-way, and Latin square classifications,
as well as to other standard classifications with replication. The formula
for C involves the number of Error d.f., say f, and the total number of
residuals, say N. In our notation the values of f and N are as follows:

    Classification
    One-way (a classes, n per class):    f = a(n − 1);       N = an
    Two-way (a rows, b columns):         f = (a − 1)(b − 1); N = ab
    Latin square (a × a):                f = (a − 1)(a − 2); N = a²
The formula has three steps:
1. Find the one-tailed normal deviate z corresponding to the probability fP/100N, where P is the premium expressed in per cents.
2. Calculate K = 1.40 + 0.85z

3. C = K{1 − (K² − 2)/4f}√(f/N)


In order to apply this rule, first analyze the data and obtain the values
of d and s. To illustrate, consider the randomized blocks wheat data
(table 11.9.1, p. 318) with a = 4, b = 5, that was used as an example of a
missing observation. This observation, for Strain D in Block 1, was
actually present and had a value 29.3. In the analysis of the complete
data, this observation gave the largest residual, 2.3, of all N = 20 observations. For the complete data, s = 1.48 with f = 12. In a rejection rule
with a 2½% premium, would this observation be rejected?
Since N = 20, we have f/N = 0.6, P = 2.5, so that fP/100N = (0.6)
(0.025) = 0.015. From the normal table, this gives z = 2.170. Thus,
K = 1.40 + (0.85)(2.170) = 3.24
C = 3.24{1 − 8.50/48}√0.6 = 2.07

Since Cs = (2.07)(1.48) = 3.06, a residual of 2.3 does not call for rejection.
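The rule is short enough to script; the standard-library NormalDist supplies the deviate z (our sketch, plain Python, not part of the original text):

    from statistics import NormalDist

    def C(f, N, P):
        # Anscombe-Tukey rejection constant: f = error d.f., N = number of
        # residuals, P = premium in per cent
        z = NormalDist().inv_cdf(1 - f*P/(100*N))
        K = 1.40 + 0.85*z
        return K*(1 - (K*K - 2)/(4*f))*(f/N)**0.5

    c = C(f=12, N=20, P=2.5)      # 2.07 for the wheat data
    print(c, c*1.48)              # Cs = 3.06 > 2.3, so the residual is retained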
EXAMPLE 11.11.1-In the 5 × 5 Latin square on p. 313, the largest residual from the
fitted model is +55.0 for treatment E in row 5 and column 5. Would this observation be
rejected in a policy with a 5% premium? Ans. No. Cs = 58.5.

11.12-Lack of independence in the errors. If care is not taken, an
experiment may be conducted in a way that induces positive correlations
between the errors for different replicates of the same treatment. In an
industrial experiment, all the replications of a given treatment might be
processed at the same time by the same technicians, in order to cut down
the chance of mistakes or to save money. Any differences that exist between the batches of raw materials used with different treatments or in
the working methods of the technicians may create positive correlations
within treatments.
In the simplest case these situations are represented mathematically
by supposing that there is an intraclass correlation ρ_I between any pair of
errors within the same treatment. In the absence of real treatment effects,
the mean square between treatments is an unbiased estimate of
σ²{1 + (n − 1)ρ_I}, where n is the number of replications, while the error
mean square is an unbiased estimate of σ²(1 − ρ_I), as pointed out in section 10.20. The F-ratio is an estimate of {1 + (n − 1)ρ_I}/(1 − ρ_I). With
ρ_I positive, this ratio can be much larger than 1; for instance, with ρ_I = 0.2
and n = 6, the ratio is 2.5. Thus, positive correlations among the errors
within a treatment vitiate the F-test, giving too many significant results.
The disturbance affects t-tests also, and may be major.
In more complex situations the consequences of correlations among
the errors have not been adequately studied, but there is reason to believe
that they can be serious. Such correlations often go unnoticed, because
their presence is difficult to detect by inspection of the data. The most
effective precaution is the skillful use of randomization (section 4.12). If
it is suspected that observations made within the same time period (e.g.,
morning or day) will be positively correlated, the order of processing of the
treatments within a replication should be randomized. A systematic pattern of errors, if detected, can sometimes be handled by constructing an
appropriate model for the statistical analysis. For examples, see (14), (15),
and (16).

11.13-Unequal error variances due to treatments. Sometimes one or
more treatments have variances differing from the rest, although there is
no reason to suspect non-normality of errors. If the treatments consist
of different amounts of lime applied to acid soil, the smallest dressings
might give uniformly low yields with a small variance, while the highest
dressings, being large enough to overcome the acidity, give good yields
with a moderate variance. Intermediate dressings might give good yields
on some plots and poor yields on others, and thus show the highest variance. Another example occurs in experiments in which the treatments
represent different measuring instruments, some highly precise and some
cruder and less expensive. The average readings given by different instruments are being compared in order to check whether the inexpensive
instruments are biased. Here we would obviously expect the variance to
differ from instrument to instrument.
When the error variance is heterogeneous in this way, the F-test
tends to give too many significant results. This disturbance is usually
only moderate if every treatment has the same number of replications (11).
Comparison of pairs or sub-groups of treatment means may, however, be
seriously affected, since the usual estimate of error variance, which pools
the variance over all treatments, will give standard errors that are too large
for some comparisons and too small for others.
For any comparison Σλ_iX̄_i among the class means in a one-way classification, an unbiased estimate of its error variance is V = Σλ_i²s_i²/n_i, where
n_i is the number of replications in X̄_i and s_i² is the mean square within the
ith class. This result holds whether the σ_i² are constant or not. If v_i
denotes λ_i²s_i²/n_i, an approximate number of d.f. are assigned to V by the
rule (25):

d.f. = (Σv_i)²/{Σv_i²/(n_i − 1)}

When the n_i are all equal, this becomes d.f. = (n − 1)(Σv_i)²/Σv_i². For a
test of significance we take t = Σλ_iX̄_i/√V, with this number of d.f.
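Rule (25) is easy to apply in code. In the sketch below (ours; Python with numpy assumed) the means, variances, and class sizes are invented purely for illustration:

    import numpy as np

    def contrast_t(means, lam, s2, n):
        # t and approximate d.f. for L = sum(lam_i * mean_i)
        # when the per-class variances s2_i differ
        means, lam, s2, n = map(np.asarray, (means, lam, s2, n))
        v = lam**2 * s2 / n
        return (lam @ means) / v.sum()**0.5, v.sum()**2 / (v**2/(n - 1)).sum()

    # hypothetical numbers, for illustration only
    t, df = contrast_t([12.1, 9.4, 10.0], [1, -0.5, -0.5], [4.0, 1.2, 1.5], [8, 8, 8])
    print(t, df)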
To obtain an unbiased estimate of the error variance of L = Σλ_iX̄_i
in a two-way classification, calculate the comparison L_j = Σλ_iX_ij separately in every block (j = 1, 2, ..., b). The average of the b values L_j is,
of course, L. The standard error of L is √{Σ(L_j − L)²/b(b − 1)}, with
(b − 1) d.f., which will be scanty if b is small.
If the trouble is caused by a few treatments whose means are substantially different from the rest, a satisfactory remedy is to omit these
treatments from the main analysis, since conclusions about them are clear
on inspection. With a one-way or two-way classification, the remaining
treatments are analyzed in the usual way. The analysis of a Latin square
with one omitted treatment is described in (17), and with two omitted
treatments in (18).
11.14-Non-normality. Variance-stabilizing transformations. In the
standard classifications, skewness in the distribution of errors tends to
produce too many significant results in F- and t-tests. In addition, there
is a loss of efficiency in the analysis, because when errors are non-normal
the mean of the observed values for a treatment is, in general, not the most
accurate estimate of the corresponding population mean for that treatment. If the mathematical form of the frequency distribution of the errors
were known, a more efficient analysis could be developed. This approach
is seldom attempted in practice, probably because the exact distribution of
non-normal errors is rarely known and the more sophisticated analysis
would be complicated.
With data in which the effects of the fixed factors are modest, there is
some evidence that non-normality does not distort the conclusions too
seriously. However, one feature of non-normal distributions is that the
variance is often related to the mean. In the Poisson distribution, the
variance equals the mean. For a binomial proportion with mean p, the
variance is p(1 − p)/n. Thus, if treatment or replication effects are large,
we expect unequal variances, with consequences similar to those discussed
in the preceding section.
If σ_X² is a known function of the mean μ of X, say σ_X² = φ(μ), a
transformation of the data that makes the variance almost independent
of the mean is obtained by an argument based on calculus. Let the transformation be Y = f(X), and let f′(X) denote the derivative of f(X) with
respect to X. By a one-term Taylor expansion,
Y ≈ f(μ) + f′(μ)(X − μ)
To this order of approximation, the mean value E(Y) of Y is f(μ),
since E(X − μ) = 0. With the same approximation, the variance of Y is
E{Y − f(μ)}² ≈ {f′(μ)}²E(X − μ)² = {f′(μ)}²σ_X² = {f′(μ)}²φ(μ)

Hence, to make the variance of Y independent of μ, we choose f(μ)
so that the term on the extreme right above is a constant. This makes
f(μ) the indefinite integral of 1/√φ(μ). For the Poisson distribution, this
gives f(μ) = √μ, i.e., Y = √X. For the binomial, the method gives
Y = arcsin √p; that is, Y is the angle whose sine is √p. When f(X) has
been chosen in this way, the value of the constant variance on the transformed scale is obtained by finding {f′(μ)}²φ(μ). For the Poisson, with
φ(μ) = μ, f(μ) = √μ, we have f′(μ) = 1/2√μ, so that {f′(μ)}²φ(μ) = ¼.
The variance on the transformed scale is ¼.
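The effect of this argument is easily seen by simulation (our sketch; Python with numpy assumed): the raw Poisson variance tracks μ, while the variance of √X settles near ¼, the approximation improving as μ grows:

    import numpy as np

    rng = np.random.default_rng(1)
    for mu in (4, 16, 64):
        X = rng.poisson(mu, size=200_000)
        # raw variance is close to mu; variance of sqrt(X) is close to 0.25
        print(mu, round(X.var(), 2), round(np.sqrt(X).var(), 3))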
11.15-Square root transformation for counts. Counts of rare events,
such as numbers of defects or of accidents, tend to be distributed approximately in Poisson fashion. A transformation to √X is often effective: the variance on the square root scale will be close to 0.25. If some
counts are small, √(X + 1) or √X + √(X + 1) (19) stabilizes the variance
more effectively.
TABLE 11.15.1
NUMBER OF POPPY PLANTS IN OATS
(Plants per 3 3/4 square feet)

                       Treatment
    Block      A      B      C      D      E
    1         438    538     77     17     18
    2         442    422     61     31     26
    3         321    377    157     87     77
    4         380    315     52     16     20

    Mean      395    413     87     38     35

    Range     121    223    105     71     59

The square root transformation can also be used with counts in
which it appears that the variance of X is proportional to the mean of X,
that is, σ_X² = kX. For a Poisson distribution of errors, k = 1, but we
often find k larger than 1, indicating that the distribution of errors has a
variance greater than that of the Poisson.
An example is the record of poppy plants in oats (20) shown in table
11.15.1, where the numbers are large. The differing ranges lead to a
suspicion of heterogeneous variance. If the error mean square were
calculated, it would be too large for testing differences among C, D, E
and too small for A and B.
In table 11.15.2 the square roots of the numbers are recorded and
analyzed. The ranges in the several treatments are now similar. That
there are differences among treatments is obvious; it is unnecessary to
compute F. The 5% LSD value is 3.09, suggesting that D and E are
superior to C, while, of course, the C, D, E group is much superior to A
and B in reducing the numbers of undesired poppies.
TABLE 11.15.2
SQUARE ROOTS OF THE POPPY NUMBERS IN TABLE 11.15.1

    Block      A      B      C      D      E
    1        20.9   23.2    8.8    4.1    4.2
    2        21.0   20.5    7.8    5.6    5.1
    3        17.9   19.4   12.5    9.3    8.8
    4        19.5   17.7    7.2    4.0    4.5

    Mean     19.8   20.2    9.1    5.8    5.6
    Range     3.1    5.5    5.3    5.3    4.6

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Blocks                         3                  22.65
    Treatments                     4                 865.44          216.36
    Error                         12                  48.69            4.06
The means in the square root scale are reconverted to the original
scale by squaring. This gives (19.8)² = 392 plants for A; (20.2)² = 408
plants for B; and so on. These values are slightly lower than the original
means, 395 for A, 413 for B, etc., because the mean of a set of square roots
is less than the square root of the original mean. As a rough correction
for this discrepancy, add the Error mean square in the square root
analysis to each reconverted mean. In this example we add 4.06, rounded
to 4, giving 396 for A, and so on.
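The reconversion with the rough bias correction takes one line each (our sketch; numpy assumed):

    import numpy as np

    root_means = np.array([19.8, 20.2, 9.1, 5.8, 5.6])   # treatment means of sqrt(X)
    print(root_means**2)          # 392, 408, 83, 34, 31: biased slightly low
    print(root_means**2 + 4.06)   # adding the Error M.S.: about 396, 412, 87, 38, 35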
A transformation like the square root affects both the shape of the
frequency distribution of the errors and the meaning of additivity. If
treatment and block effects are additive in the original scale, they will not
be additive in the square root scale, and vice versa. However, unless
treatment and block effects are both large, effects that are additive in one
scale will be approximately so in the other, since the square root trans-
formation is a mild one.
EXAMPLE 11.15.1-The numbers of wireworms counted in the plots of a Latin square
(21) following soil fumigations in the previous year were:

                      Columns
    Rows     1       2       3       4       5
    1       P 3     O 2     N 5     K 1     M 4
    2       M 6     K 0     O 6     N 4     P 4
    3       O 4     M 9     K 1     P 6     N 5
    4       N 17    P 8     M 8     O 9     K 0
    5       K 4     N 4     P 2     M 4     O 8

Since these are such small numbers, transform to √(X + 1). The first number, 3, becomes
√(3 + 1) = 2, etc.
Analyze the variance. Ans. Mean square for Treatments, 1.4451; for Error, 0.3259.
EXAMPLE 11.15.2-Calculate the Studentized Range D = 1.06 and show that K had
significantly fewer wireworms than M, N, and O.
EXAMPLE 11.15.3-Estimate the average numbers of wireworms per plot for the
several treatments. Ans. (with no bias correction) K, 0.99; M, 6.08; N, 6.40; O, 5.53;
P, 4.38. To make the bias correction, add 0.33, giving K = 1.32, M = 6.41, etc.
EXAMPLE 11.15.4-If the error variance of X in the original scale is k times the mean
of X, and if effects are additive in the square root scale, it can be shown that the true error
variance in the square root scale is approximately k/4. Thus, the value of k can be estimated
from the analysis in the square root scale. If k is close to 1, this suggests that the distribution
of errors in the original scale may be close to the Poisson distribution. In example 11.15.1,
k is about 4(0.3259) ≈ 1.3, suggesting that most of the variance in the original scale is of the
Poisson type. With the poppy plants (table 11.15.2), k is about 16, indicating a variance
much greater than the Poisson.

11.16-Arcsin transformation for proportions. This transformation,
also called the angular transformation, was developed for binomial proportions. If a_ij successes out of n are obtained in the jth replicate of the
ith treatment, the proportion p_ij = a_ij/n has variance p_ij(1 − p_ij)/n. By
means of table A 16, due to C. I. Bliss, we replace p_ij by the angle whose
sine is √p_ij. In the angular scale, proportions near 0 or 1 are spread out
so as to increase their variance. If all the error variance is binomial, the
error variance in the angular scale is about 821/n. The transformation
does not remove inequalities in variance arising from differing values of
n. If the n's vary widely, a weighted analysis in the angular scale is
advisable.
With n < 50, a zero proportion should be counted as 1/4n before
transforming to angles, and a 100% proportion as (n − 1/4)/n. This
empirical device, suggested by Bartlett (22), improves the equality of
variance in the angles. A more accurate transformation for small n has
been tabulated by Mosteller and Youtz (19).
Angles may also be used with proportions that are subject to other
sources of variation in addition to the binomial, if it is thought that the
variance of p_ij is some multiple of p_ij(1 − p_ij). Since, however, this
product varies little for p_ij lying between 30% and 70%, the angular transformation is scarcely needed if nearly all the observed p_ij lie in this range.
In fact, this transformation is unlikely to produce a noticeable change in
the conclusions unless the p_ij range from near zero to 30% and beyond
(or from below 70% to 100%).
Table 11.16.1, taken from a larger randomized blocks experiment
(23), shows the percentages of unsalable ears of corn, the treatments being
a control, A, and three mechanical methods of protecting against damage

TABLE 11.16.1
PERCENTAGE OF UNSALABLE EARS OF CORN

                            Block
    Treatments     1      2      3      4      5      6
    A            42.4   34.3   24.1   39.5   55.5   49.1
    B            33.3   33.3    5.0   26.3   30.2   28.6
    C             8.5   21.9    6.2   16.0   13.5   15.4
    D            16.6   19.3   16.6    2.1   11.1   11.1

                 Angle = Arcsin √Proportion            Mean      %
    A            40.6   35.8   29.4   38.9   48.2   44.5   39.6   40.6
    B            35.2   35.2   12.9   30.9   33.3   32.3   29.9   24.9
    C            17.0   27.9   14.4   23.6   21.6   23.1   21.3   13.2
    D            24.0   26.1   24.0    8.3   19.5   19.5   20.2   11.9

Analysis of Variance in Angles

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Blocks                         5                  359.8
    Treatments                     3                1,458.5           486.2
    Error                         15                  546.1            36.4

    Total                         23                2,364.4
by corn earworm larvae. The value of n, about 36, was not constant, but
its variations were fairly small and are ignored. Note that the per cents
range from 2.1% to 55.5%.
In the analysis of variance of the angles (table 11.16.1), the Error
mean square was 36.4. Since 821/n = 821/36 = 22.8, some variation in
excess of the binomial may be present. The F-value for treatments is
large. The 5% LSD for comparing two treatments is 7.4. B, C, and D
were all superior to the control A, while C and D were superior to B. The
angle means are retranslated to per cents at the right of the table.
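The transformation and its retranslation are a single line each in code (our sketch; numpy assumed, working in degrees as table A 16 does):

    import numpy as np

    p = np.array([42.4, 33.3, 8.5, 16.6]) / 100        # block-1 percentages, table 11.16.1
    angle = np.degrees(np.arcsin(np.sqrt(p)))          # 40.6, 35.2, 17.0, 24.0
    print(angle, 100*np.sin(np.radians(angle))**2)     # and back to per cent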
11.17-The logarithmic transformation. Logarithms are used to stabilize the variance if the standard deviation in the original scale varies directly as the mean; in other words, if the coefficient of variation is constant.
There are mathematical reasons why this type of relation between standard
deviation and mean is likely to be found when the effects are proportional
rather than additive: for example, when treatment 2 gives results consistently 23% higher than treatment 1 rather than results higher by, say,
18 units. In this situation the log transformation may bring about both
additivity of effects and equality of variance. If some 0 values of X occur,
log (X + 1) is often used.
TABLE 11.17.1
ESTIMATED NUMBERS OF FOUR KINDS OF PLANKTON (I-IV) CAUGHT IN SIX HAULS
WITH EACH OF TWO NETS

                 Estimated Numbers                    Logarithms
    Haul      I       II      III       IV        I      II     III     IV
    1        895    1,520   43,300   11,000     2.95    3.18   4.64   4.04
    2        540    1,610   32,800    8,600     2.73    3.21   4.52   3.93
    3      1,020    1,900   28,800    8,260     3.01    3.28   4.46   3.92
    4        470    1,350   34,600    9,830     2.67    3.13   4.54   3.99
    5        428      980   27,800    7,600     2.63    2.99   4.44   3.88
    6        620    1,710   32,800    9,650     2.79    3.23   4.52   3.98
    7        760    1,930   28,100    8,900     2.88    3.29   4.45   3.95
    8        537    1,960   18,900    6,060     2.73    3.29   4.28   3.78
    9        845    1,840   31,400   10,200     2.93    3.26   4.50   4.01
    10     1,050    2,410   39,500   15,500     3.02    3.38   4.60   4.19
    11       387    1,520   29,000    9,250     2.59    3.18   4.46   3.97
    12       497    1,685   22,300    7,900     2.70    3.23   4.35   3.90

    Mean     671    1,701   30,775    9,396     2.802   3.221  4.480  3.962

    Range    663    1,480   24,400    9,440     0.43    0.39   0.36   0.41

Analysis of Variance of Logarithms

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Kind of plankton               3                 20.2070          6.7357
    Haul                          11                  0.3387          0.0308
    Discrepance                   33                  0.2300          0.0070
The plankton catches (24) of table 11.17.1 yielded nicely to the log
transformation. The original ranges and means for the four kinds of
plankton were nearly proportional, the ratios of range to mean being
0.99, 0.87, 0.79, and 1.00. After transformation the ranges were almost
equal and uncorrelated with the means.
Transforming back, the estimated mean numbers caught for the four
kinds of plankton are antilog 2.802 = 634; 1,663; 30,200; and 9,162.
These are geometric means.
The means of the logs will be found to differ significantly for all
four kinds of plankton. The standard deviation of the logarithms is
√0.0070 = 0.084, and the antilogarithm of this number is 1.21. Quoting Winsor and Clark (page 5), "Now a deviation of 0.084 in the
logarithms of the catch means that the catch has been multiplied (or
divided) by 1.21. Hence we may say that one standard deviation in the
logarithm corresponds to a percentage standard deviation, or coefficient
of variation, of 21% in the catch."
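Both the geometric means and the multiplicative interpretation of the error s.d. come straight from the log means (our sketch; numpy assumed):

    import numpy as np

    log_means = np.array([2.802, 3.221, 4.480, 3.962])   # from table 11.17.1
    print(10**log_means)        # geometric means: about 634, 1663, 30200, 9162
    print(10**np.sqrt(0.0070))  # 1.21: one s.d. of the logs multiplies the catch by 1.21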
EXAMPLE 11.17.1-The following data were abstracted from an experiment (27)
which was more complicated in design. Each entry is the geometric mean of insect catches
by a trap in three successive nights, one night at each of three locations. Three types of
trap are compared over five three-night periods. The insects are macrolepidoptera at
Rothamsted Experimental Station.

                      3-Night Periods, August
    Trap     16-18    19-21    22-24    25-27    28-30
    1         19.1     23.4     29.5     23.4     16.6
    2         50.1    166.1    223.9     58.9     64.6
    3        123.0    407.4    398.1    229.1    251.2

Williams found the log transformation effective in analyzing highly variable data like
these. Transform to logarithms and analyze their variance. Ans. Mean square for traps
= 1.4455; for error, 0.0172.
Show that all differences between trap means are significant and that the geometric
means for traps are 21.9, 93.3, and 257.0 insects.

11.18-Non-additivity. Suppose that in a two-way classification, with
2 rows and 2 columns, the effects of rows and columns are proportional
or multiplicative instead of additive. In each row, column B exceeds
column A by a fixed percentage, while in each column, row 2 exceeds row
1 by a fixed percentage. Consider column percentages of 20% and 100%
and row percentages of 10% and 50%. These together provide four combinations. Taking the observation in column A, row 1, as 1.0, the other
observations are shown in table 11.18.1 for the four cases.
Thus, in case 1, the value of 1.32 for B in row 2 is 1.1 × 1.2. Since
no experimental error has been added, the error mean square in a correct
analysis should be zero. The correct procedure is to transform the data
to logs before analysis. In logs the effects become additive, and the error
mean square is zero. From the analysis in logs, we learn that B exceeds
A by exactly 20% in cases 1 and 2, and by exactly 100% in cases 3 and 4.
TABLE 11.18.1
HYPOTHETICAL DATA FOR FOUR CASES WITH MULTIPLICATIVE EFFECTS

             Case 1         Case 2         Case 3         Case 4
             C 20%          C 20%          C 100%         C 100%
             R 10%          R 50%          R 10%          R 50%
    Row     A      B       A      B       A      B       A      B
    1      1.0    1.2     1.0    1.2     1.0    2.0     1.0    2.0
    2      1.1    1.32    1.5    1.8     1.1    2.2     1.5    3.0

    Means  1.05   1.26    1.25   1.50    1.05   2.10    1.25   2.50
    s          0.01           0.05           0.05           0.25
    s/X̄ %      0.9%           3.6%           3.2%          13.3%

If the usual analysis of variance is carried out in the original scale, the
standard error s per observation (with 1 d.f.) is shown under each case.
With 2 replications, s is also the s.e. of the difference B̄ − Ā. Consequently, in case 1 we would conclude from this analysis that B − A is 0.21
with a standard error of ±0.01. In case 4 we conclude that B − A
= 1.25 ± 0.25. The standard errors, ±0.01 and ±0.25, are entirely a
result of the fact that we used the wrong model for analysis. In a real
situation where experimental errors are also present, this variance s² due
to non-additivity is added to the ordinary experimental error variance σ².
To generalize, the analysis in the original scale has two defects. It
fails to discover the simple proportional nature of the relationship between row and column effects. It also suffers a loss of precision, since the
error variance is inflated by the component due to non-additivity. If
row and column effects are both small, these deficiencies are usually not
serious. In case 1, for example, the standard error s due to non-additivity
only is 0.9% of the mean. If the ordinary standard error σ were 5% of the
mean (a low value for most data), the non-additivity would increase this
only to √25.81 or 5.1%. The loss of precision from non-additivity is
greater in cases 2 and 3 and jumps markedly in case 4, in which both row
and column effects are large.

11.19-Tukey's test of additivity. This is useful in a variety of ways:
(i) to help decide if a transformation is necessary; (ii) to suggest a suitable
transformation; (iii) to learn if a transformation has been successful in
producing additivity (28, 29).
The test is related to transformations of the form Y = X^p, in which
X is the original scale, and we are seeking a power p of X such that effects
are additive in the scale of Y = X^p. Thus, p = 1/2 represents the square
root transformation and p = −1 a reciprocal transformation, analyzing
1/X instead of X. The value p = 0 is interpreted as a log transformation,
because the variable X^p behaves like log X when p is small.
The rationale of the test can be indicated by means of calculus. For
the two-way classification, if effects are exactly additive in the scale of Y,
we have
Y_ij = Ȳ.. + (Ȳ_i. − Ȳ..) + (Ȳ_.j − Ȳ..)
     = Ȳ..[1 + {(Ȳ_i. − Ȳ..) + (Ȳ_.j − Ȳ..)}/Ȳ..]
We suppose that row and column effects are small relative to the
mean. This implies that α_i = (Ȳ_i. − Ȳ..)/Ȳ.. and β_j = (Ȳ_.j − Ȳ..)/Ȳ.. are
both small.
Write X_ij = Y_ij^(1/p) and expand in the usual Taylor's series. This gives
X_ij = Ȳ..^(1/p)[1 + α_i + β_j]^(1/p)
     = Ȳ..^(1/p)[1 + (α_i + β_j)/p + (1 − p)(α_i² + 2α_iβ_j + β_j²)/2p² + ...]
Now, in the X scale the terms in α_i, α_i² represent row effects and the terms
in β_j, β_j² represent column effects that are added together in the above
expression. These terms are therefore still additive in the X scale. The
first non-additive term is the one in α_iβ_j. Written in full, this term is
Ȳ..^(1/p)(1 − p)(Ȳ_i. − Ȳ..)(Ȳ_.j − Ȳ..)/p²Ȳ..²     (11.19.1)
For our purpose we need to write this expression in terms of X rather
than Y. By new single-term Taylor expansions we have, since Y = X^p,
Ȳ_i. − Ȳ.. ≈ pX̄..^(p−1)(X̄_i. − X̄..);  Ȳ_.j − Ȳ.. ≈ pX̄..^(p−1)(X̄_.j − X̄..)
Substitution into (11.19.1) gives for the first non-additive term in X_ij,
(1 − p)Ȳ..^(1/p)(X̄_i. − X̄..)(X̄_.j − X̄..)X̄..^(2p−2)/Ȳ..²
Using Ȳ.. ≈ X̄..^p, this term may be expressed approximately as

[(1 − p)/X̄..](X̄_i. − X̄..)(X̄_.j − X̄..)     (11.19.2)

Since this term represents a non-additive effect of rows and columns, it
will appear in the residual of X_ij when an additive model is fitted in the X
scale. The conclusions from this rough argument are as follows:
1. If this type of non-additivity is present in X, and X̂_ij is the fitted
value given by the additive model, the residual X_ij − X̂_ij has a linear regression on the variate (X̄_i. − X̄..)(X̄_.j − X̄..).
2. The regression coefficient B is an estimate of (1 − p)/X̄... Thus,
the power p to which X must be raised to produce additivity is estimated
by (1 − BX̄..). Commenting on this result, Anscombe and Tukey (13)
state (their k is our B/2): "It is important to emphasize that the available
data rarely define the 'correct' value of p with any precision. Repeating
the analysis and calculation of k for each of a number of values of p may
show the range of values of p clearly compatible with the observations, but
experience and subject-matter insight are important in choosing a p for
final analysis."
3. Tukey's test is a test of the null hypothesis that the population
value of B is zero. A convenient way of computing B and making the
test is illustrated by the data in table 11.19.1. The data are average insect
catches in three traps over five periods. The same data were presented
in example 11.17.1 as an exercise on the log transformation. We now
consider the additivity of trap and period effects in the original scale. The
steps are as follows (see table 11.19.1 for calculations):
TABLE 11.19.1
MACROLEPIDOPTERA CATCHES BY THREE TRAPS IN FIVE PERIODS
(Calculations for test of additivity)

                   Trap                Sum      Mean
    Period     1       2       3       X_i.     X̄_i.      d_i      w_i = ΣX_ij d_j
    1        19.1    50.1   123.0     192.2     64.1     −74.9       14,025
    2        23.4   166.1   407.4     596.9    199.0     +60.0       51,096
    3        29.5   223.9   398.1     651.5    217.2     +78.2       47,543
    4        23.4    58.9   229.1     311.4    103.8     −35.2       28,444
    5        16.6    64.6   251.2     332.4    110.8     −28.1       32,243

    Sum X_.j    112.0   563.6  1,408.8   2,084.4            Σw_i = 173,351
    Mean X̄_.j    22.4   112.7    281.8     139.0
    d_j        −116.6   −26.2   +142.8       0.0

(i) Find d_i = X̄_i. − X̄.. and d_j = X̄_.j − X̄.., both adding exactly to zero.

(ii) w₁ = (19.1)(−116.6) + (50.1)(−26.2) + (123.0)(+142.8) = 14,025
     ...
     w₅ = (16.6)(−116.6) + (64.6)(−26.2) + (251.2)(+142.8) = 32,243

Check: 173,351 = (112.0)(−116.6) + (563.6)(−26.2) + (1,408.8)(+142.8)
N = Σw_i d_i = (14,025)(−74.9) + ... + (32,243)(−28.1) = 3.8259 × 10⁶
(iii) Σd_i² = (−74.9)² + ... + (−28.1)² = 17,354
      Σd_j² = (−116.6)² + ... + (+142.8)² = 34,674
      D = (Σd_i²)(Σd_j²) = (17,354)(34,674) = 601.7 × 10⁶

(iv) S.S. for non-additivity = N²/D = (3.8259 × 10⁶)²/(601.7 × 10⁶) = 24,327

(i) Calculate d_i = X̄_i. − X̄.. and d_j = X̄_.j − X̄.., rounding if necessary so that both sets add exactly to zero.
(ii) Compute w_i = ΣX_ij d_j and record them in the extreme right
column. Then find

N = Σw_i d_i = ΣΣX_ij d_i d_j

N is the numerator of B.
(iii) The denominator D of B is (Σd_i²)(Σd_j²). Thus, B = N/D.
(iv) The contribution of non-additivity to the error sum of squares
of X is N²/D, with 1 d.f. This is tested by an F-test against the remainder
of the Error S.S., which has {(r − 1)(c − 1) − 1} d.f. The test is made in
table 11.19.2.
TABLE 11.19.2
ANALYSIS OF VARIANCE AND TEST OF ADDITIVITY

    Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square
    Periods                        4                  52,066
    Traps                          2                 173,333
    Error                          8                  30,607
      Non-additivity               1                  24,327           24,327
      Remainder                    7                   6,280              897

    F = 24,327/897 = 27.1, d.f. = 1, 7. P < 0.01

The hypothesis of additivity is untenable. What type of transformation is suggested by this test?
B = N/D = 3.8259/601.7 = 0.006358
p = 1 − BX̄.. = 1 − (0.006358)(139.0) = 1 − 0.88 = 0.12
The test suggests a one-tenth power of X. This behaves very much
like log X.
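The whole test reduces to a few array operations (our sketch; Python with numpy assumed, reproducing the figures above to rounding):

    import numpy as np

    X = np.array([[19.1,  50.1, 123.0],     # table 11.19.1: rows = periods,
                  [23.4, 166.1, 407.4],     # columns = traps
                  [29.5, 223.9, 398.1],
                  [23.4,  58.9, 229.1],
                  [16.6,  64.6, 251.2]])
    di = X.mean(axis=1) - X.mean()          # d_i
    dj = X.mean(axis=0) - X.mean()          # d_j
    N = di @ X @ dj                         # numerator of B
    D = (di @ di) * (dj @ dj)               # denominator of B
    print(N/D, N**2/D, 1 - (N/D)*X.mean())  # B, S.S. for non-additivity, suggested p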
11.20-Non-additivity in a Latin square. If the mathematical analysis
of the previous section is carried out for a Latin square, the first non-additive term, corresponding to equation 11.19.2, is, as might be guessed,

[(1 − p)/X̄...]{(X̄_i.. − X̄...)(X̄_.j. − X̄...) + (X̄_i.. − X̄...)(X̄_..k − X̄...)
             + (X̄_.j. − X̄...)(X̄_..k − X̄...)}

Consequently, the test for additivity is carried out by finding the regression
of (X_ijk − X̂_ijk) on the variate in { } above, as illustrated in (28). Note
that D is the error sum of squares of the { } variable.
We shall, instead, illustrate an alternative method of doing the computations, due to Tukey (29), that generalizes to other classifications. Table 11.20.1 comes from an experiment on monkeys (30), the raw data being the number of responses to auditory or visual stimuli administered under five conditions (A, ..., E). Each pair of monkeys received one type of stimulus per week, the order from week to week being determined by the randomized columns of the Latin square.
It was discovered that the standard deviation of the number of responses was almost directly proportional to the mean, so the counts were transformed to logs. Each entry in the table is the mean of the log counts for the two members of a pair. Has additivity been attained?
TABLE 11.20.1
LOGS OF NUMBERS OF RESPONSES BY PAIRS OF MONKEYS UNDER FIVE STIMULI
(Test of additivity in a Latin square)

                               Week
Pair        1         2         3         4         5        Xi..

1        B 1.99    D 2.25    C 2.18    A 2.18    E 2.51     2.222
ĝijk      2.022     2.268     2.220     2.084     2.518
dijk     -0.032    -0.018    -0.040     0.098*   -0.008
Vijk        37         3         0        17        92

2        D 2.00    B 1.85    A 1.79    E 2.14    C 2.31     2.018
          1.950     1.932     1.852     2.152     2.206
          0.052*   -0.082    -0.062    -0.012     0.104
            70        80       132         4         0

3        C 2.17    A 2.10    E 2.34    B 2.20    D 2.40     2.242
          2.132     2.082     2.348     2.178     2.472
          0.038     0.018    -0.006*    0.022    -0.072
             7        18        18         1        66

4        E 2.41    C 2.47    B 2.44    D 2.53    A 2.44     2.458
          2.456     2.462     2.366     2.526     2.482
         -0.046     0.010*    0.074     0.004    -0.042
            58        61        23        97        71

5        A 1.85    E 2.32    D 2.21    C 2.05    B 2.25     2.136
          1.862     2.248     2.176     2.162     2.234
         -0.012     0.072     0.034    -0.112     0.018*
           125         1         2         3         0

X.j.      2.084     2.198     2.192     2.220     2.382     2.215

            A         B         C         D         E
X..k      2.072     2.146     2.236     2.278     2.344

* Denotes deviations that were adjusted in order to make the deviations add to zero over every row, column, and treatment.

The steps follow.
1. Find the row, column, and treatment means, as shown, and the fitted values ĝijk, by the additive model

ĝijk = X̄i.. + X̄.j. + X̄..k - 2X̄...

For E in row 2, column 4,

ĝ24E = 2.018 + 2.220 + 2.344 - 2(2.215) = 2.152

2. Find the residuals dijk = Xijk - ĝijk, as shown, adjusting if necessary so that the sums are zero over every row, column, and treatment. Values that were adjusted are denoted by an * in table 11.20.1.
3. Construct the 25 values of a variate Vijk = c₁(ĝijk - c₂)², where c₁ and c₂ are any two convenient constants. We took c₂ = X̄... = 2.215, which is often suitable, and c₁ = 1000, so that the V's are mostly between 0 and 100. For B in row 1, column 1,

V₁₁B = 1000(2.022 - 2.215)² = 37

4. Calculate the regression coefficient of the dijk on the residuals of the Vijk. The numerator is

N = Σ dijk Vijk = (-0.032)(37) + ... + (0.018)(0) = -20.356

The denominator D is the error sum of squares of the Vijk. This is found by performing the ordinary Latin square analysis of the Vijk. The value of D is 22,330.
5. To perform the test for additivity, find the S.S., 0.0731, of the dijk, which equals the error S.S. of the Xijk. The contribution due to non-additivity is N²/D = (-20.356)²/22,330 = 0.0186. Finally, compare the mean square for Non-additivity with the Remainder mean square.

                    Degrees of Freedom    Sum of Squares    Mean Square          F
Error S.S.                 12                 0.0731
  Non-additivity            1                 0.0186           0.0186      3.76 (P = 0.08)
  Remainder                11                 0.0545           0.00495

The value of P is 0.08, a little low, though short of the 5% level. Since the interpretations are not critical (examples 11.20.4, 11.20.5), the presence of slight non-additivity should not affect them.
The above procedure applies also in more complex classifications. Note that if we expand the quadratic c₁(ĝijk - X̄...)², the coefficient of terms like (X̄i.. - X̄...)(X̄.j. - X̄...) is 2c₁. Hence the regression coefficient B of the previous section is B = 2c₁N/D. If a power transformation is needed, the suggested power is as before p = 1 - BX̄....
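The five steps can also be scripted. The sketch below (Python with numpy; the layout and names are ours) applies them to the monkey data of table 11.20.1. It works from unrounded fitted values instead of the adjusted residuals of the table, so its N, D, and N²/D agree with the text only to rounding.

```python
# Sketch of Tukey's test in a Latin square (table 11.20.1); numpy assumed.
import numpy as np

X = np.array([[1.99, 2.25, 2.18, 2.18, 2.51],
              [2.00, 1.85, 1.79, 2.14, 2.31],
              [2.17, 2.10, 2.34, 2.20, 2.40],
              [2.41, 2.47, 2.44, 2.53, 2.44],
              [1.85, 2.32, 2.21, 2.05, 2.25]])
T = np.array([[1, 3, 2, 0, 4],            # treatment layout, 0 = A, ..., 4 = E
              [3, 1, 0, 4, 2],
              [2, 0, 4, 1, 3],
              [4, 2, 1, 3, 0],
              [0, 4, 3, 2, 1]])

m = X.mean()
row, col = X.mean(axis=1), X.mean(axis=0)
trt = np.array([X[T == k].mean() for k in range(5)])

g = row[:, None] + col[None, :] + trt[T] - 2*m   # step 1: fitted values
d = X - g                                        # step 2: residuals
V = 1000 * (g - m)**2                            # step 3: comparison variate

N = (d * V).sum()                                # step 4: numerator
Vtrt = np.array([V[T == k].mean() for k in range(5)])
resV = V - (V.mean(axis=1)[:, None] + V.mean(axis=0)[None, :]
            + Vtrt[T] - 2*V.mean())
D = (resV**2).sum()                              # error S.S. of the V's
print(N, D, N**2 / D)                            # roughly -20.4, 22,300, 0.019
```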
EXAMPLE 11.20.1-The following data are the numbers of lesions on eight pairs of half leaves inoculated with two strengths of tobacco virus (from table 4.3.1).

                         Replications
Treatments      1     2     3     4     5     6     7     8

1              31    20    18    17     9     8    10     7
2              18    17    14    11    10     7     5     6
Test for additivity by the method of section 11.19. Ans.:

                   Degrees of Freedom    Sum of Squares    Mean Square      F
Error                      7                   65
  Non-additivity           1                   38                38        8.4
  Remainder                6                   27                 4.5

F is significant at the 5% level. The non-additivity may be due to anomalous behavior of the 31, 18 pair.
EXAMPLE 11.20.2-Apply √(X + 1) to the virus data. While F now becomes non-significant, the pair (31, 18) still appears unusual.
EXAMPLE 11.20.3-The data in example 11.2.1, regarded as a 3 × 3 two-way classification, provide another simple example of Tukey's test. Ans. For non-additivity, F = 5.66.
EXAMPLE 11.20.4-Analyze the variance of the logarithms of the monkey responses. You will get:

                Degrees of Freedom    Sum of Squares    Mean Square      F
Monkey Pairs            4                 0.5244           0.1311
Weeks                   4                 0.2294           0.0574
Stimuli                 4                 0.2313           0.0578        9.6
Error                  12                 0.0725           0.00604

EXAMPLE 11.20.5-Test all differences among the means in table 11.20.1, using the LSD method. Ans. E > A, B, C; D > A, B; C > A.
EXAMPLE 11.20.6-Calculate the sum of squares due to the regression of log response on weeks. It is convenient to code the weeks as X = -2, -1, 0, 1, 2. Then, taking the weekly means as Y, Σxy = 0.618 and (Σxy)²/Σx² = 0.03819. On the per item basis, the sum of squares due to regression is 5(0.03819) = 0.1910. The line for Weeks in example 11.20.4 may now be separated into two parts:

                Degrees of Freedom    Sum of Squares    Mean Square
Regression              1                 0.1910
Deviations              3                 0.0384           0.0128

Comparing the mean squares with error, it is seen that deviations are not significant, most of the sum of squares for Weeks being due to the regression.

REFERENCES
1. R. H. PORTER. Cooperative Soybean Seed Treatment Trials. Iowa State College Seed Laboratory (1936).
2. S. P. MONSELISE. Palestine J. Botany, 8:1 (1951).
3. W. G. COCHRAN and G. M. COX. Experimental Designs, 2nd ed. Wiley, New York (1957).
4. O. KEMPTHORNE. Design and Analysis of Experiments. Wiley, New York (1952).
5. R. A. FISHER. The Design of Experiments. Oliver and Boyd, Edinburgh (1935-1951).
6. H. C. FORSTER and A. J. VASEY. J. Dept. of Agric., Victoria, Australia, 30:35 (1932).
7. R. A. FISHER and F. YATES. Statistical Tables. Oliver and Boyd, Edinburgh (1938-1953).
8. H. W. LI, C. J. MENG, and T. N. LIU. J. Amer. Soc. Agron., 28:1 (1936).
9. W. G. COCHRAN, K. M. AUTREY, and C. Y. CANNON. J. Dairy Sci., 24:937 (1941).
10. M. HEALY and M. WESTMACOTT. Applied Statistics, 5:203 (1956).
11. H. SCHEFFÉ. The Analysis of Variance. Wiley, New York (1959).
12. F. J. ANSCOMBE. Technometrics, 2:123 (1960).
13. F. J. ANSCOMBE and J. W. TUKEY. Technometrics, 5:141 (1963).
14. W. G. COCHRAN. Biometrics, 3:33 (1947).
15. W. T. FEDERER and C. S. SCHLOTTFELDT. Biometrics, 10:282-90 (1954).
16. A. D. OUTHWAITE and A. RUTHERFORD. Biometrics, 11:431 (1955).
17. F. YATES. J. Agric. Sci., 26:301 (1936).
18. F. YATES and R. W. HALE. J. R. Statist. Soc. Suppl., 6:67 (1939).
19. F. MOSTELLER and C. YOUTZ. Biometrika, 48:433 (1961).
20. M. S. BARTLETT. J. R. Statist. Soc. Suppl., 3:68 (1936).
21. W. G. COCHRAN. Emp. J. Exp. Agric., 6:157 (1938).
22. M. S. BARTLETT. Biometrics, 3:39 (1947).
23. W. G. COCHRAN. Ann. Math. Statist., 11:344 (1940).
24. C. P. WINSOR and G. L. CLARKE. Sears Foundation: J. Marine Res., 3:1 (1940).
25. F. E. SATTERTHWAITE. Biometrics Bull., 2:110 (1946).
26. F. YATES. Emp. J. Exp. Agric., 1:129 (1933).
27. C. B. WILLIAMS. Bul. Entomological Res., 42:513 (1951).
28. J. W. TUKEY. Biometrics, 5:232 (1949).
29. J. W. TUKEY. Queries in Biometrics, 11:111 (1955).
30. R. A. BUTLER. J. Exp. Psych., 48:19 (1954).
CHAPTER TWELVE

Factorial experiments
12.1-Introduction. A common problem in research is investigating the effects of each of a number of variables, or factors as they are called, on some response Y. Suppose a company in the food industry proposes to market a cake mix from which the housewife can make a cake by adding water and then baking. The company must decide on the best kind of flour and the correct amounts of fat, sugar, liquid (milk or water), eggs, baking powder, and flavoring, as well as on the best oven temperature and the proper baking time. These are nine factors, any one of which may affect the palatability and the keeping quality of the cake to a noticeable degree. Similarly, a research program designed to learn how to increase the yields of the principal cereal crop in a country is likely to try to measure the effects on yield of different amounts of nitrogen, phosphorus, and potassium when added as fertilizers to the soil. Problems of this type occur frequently in industry: with complex chemical processes there can be as many as 10 to 20 factors that may affect the final product.
In earlier times the advice was sometimes given to study one factor at a time, a separate experiment being devoted to each factor. Later, Fisher (1) pointed out that important advantages are gained by combining the study of several factors in the same factorial experiment. Factorial experimentation is highly efficient, because every observation supplies information about all the factors included in the experiment. Secondly, as we will see, factorial experimentation is a workmanlike method of investigating the relationships between the effects of different factors.
12.2-The single factor versus the factorial approach. To illustrate the difference between the "one factor at a time" approach and the factorial approach, consider an investigator who has two factors, A and B, to study. For simplicity, suppose that only two levels of each factor, say a₁, a₂ and b₁, b₂, are to be compared. In a cake mix, a₁, a₂ might be two types of flour and b₁, b₂ two amounts of flavoring. Four replications are considered sufficient by the investigator.
In the single-factor approach, the first experiment is a comparison of a₁ with a₂. The level of B is kept constant in the first experiment, but the investigator must decide what this constant level is to be. We shall suppose that B is kept at b₁; the choice made does not affect our argument. The two treatments in the first experiment may be denoted by the symbols a₁b₁ and a₂b₁, replicated four times. The effect of A, that is, the mean difference ā₂b₁ - ā₁b₁, is estimated with a variance 2σ²/4 = σ²/2.
The second experiment compares b₂ with b₁. If a₂ performed better than a₁ in the first experiment, the investigator is likely to use a₂ as the constant level of A in the second experiment (again, this choice is not vital to the argument). Thus, the second experiment compares a₂b₁ with a₂b₂ in four replications, and estimates the effect of B with variance σ²/2.
In the two single-factor experiments, 16 observations have been made, and the effects of A and B have each been estimated with variance σ²/2. But suppose that someone else, interested in these factors, hears that experiments on them have been done. He asks the investigator: In my work, I have to keep A at its lower level, a₁. What effect does B have when A is at a₁? Obviously, the investigator cannot answer this question, since he measured the effect of B only when A was held at its higher level. Another person might ask: Is the effect of A the same at the two levels of B? Once again, the investigator has no answer, since A was tested at only one level of B.
In the factorial experiment, the investigator compares all treatments that can be formed by combining the levels of the different factors. There are four such treatment combinations, a₁b₁, a₂b₁, a₁b₂, a₂b₂. Notice that each replication of this experiment supplies two estimates of the effect of A. The comparison a₂b₂ - a₁b₂ estimates the effect of A when B is held constant at its higher level, while the comparison a₂b₁ - a₁b₁ estimates the effect of A when B is held constant at its lower level. The average of these two estimates is called the main effect of A, the adjective main being a reminder that this is an average taken over the levels of the other factor. In terms of our definition of a comparison (section 10.7) the main effect of A may be expressed as

LA = ½{(a₂b₂) - (a₁b₂) + (a₂b₁) - (a₁b₁)}     (12.2.1)

where (a₂b₂) denotes the yield given by the treatment combination a₂b₂ (or the average yield if the experiment has r replications), and so on. By Rule 10.7.1 the variance of LA is

(σ²/r){(½)² + (½)² + (½)² + (½)²} = σ²/r

If the investigator uses 2 replications (8 observations), the main effect of A is estimated with a variance σ²/2.
Now consider B. Each replication furnishes two estimates, a₂b₂ - a₂b₁ and a₁b₂ - a₁b₁, of the effect of B. The main effect of B is the comparison

LB = ½{(a₂b₂) - (a₂b₁) + (a₁b₂) - (a₁b₁)}     (12.2.2)
With two replications of the factorial experiment (8 observations), LB, like LA, has variance σ²/2.
Thus, the factorial experiment requires only 8 observations, as against 16 by the single-factor approach, to estimate the effects of A and B with the same variance σ²/2. With 3 factors, the factorial experiment requires only 1/3 as many observations, with 4 factors only 1/4, and so on. These striking gains in efficiency occur because every observation, like (a₁b₁), or (a₁b₁c₁), or (a₁b₁c₁d₁), is used in the estimate of the effect of every factor. In the single-factor approach, on the other hand, an observation supplies information only about the effect of one factor.
What about the relationship between the effects of the factors? The factorial experiment provides a separate estimate of the effects of A at each level of B, though these estimates are less precise than the main effect of A, their variance being σ². The question: Is the effect of A the same at the two levels of B?, can be examined by means of the comparison:

{(a₂b₂) - (a₁b₂)} - {(a₂b₁) - (a₁b₁)}     (12.2.3)

This expression measures the difference between the effect of A when B is at its higher level and the effect of A when B is at its lower level. If the question is: Does the level of A influence the effect of B?, the relevant comparison is

{(a₂b₂) - (a₂b₁)} - {(a₁b₂) - (a₁b₁)}     (12.2.4)

Notice that (12.2.3) and (12.2.4) are identical. The expression is called the AB two-factor interaction. In this, the combinations (a₂b₂) and (a₁b₁) receive a + sign, the combinations (a₁b₂) and (a₂b₁) a - sign.
Because of its efficiency and comprehensiveness, factorial experimentation is extensively used in research programs, particularly in industry. One limitation is that a factorial experiment is usually larger and more complex than a single-factor experiment. The potentialities of factorial experimentation in clinical medicine have not been fully exploited, because it is usually difficult to find enough suitable patients to compare more than two or three treatment combinations.
In analyzing the results of a 2² factorial, the commonest procedure is to look first at the two main effects and the two-factor interaction. If the interaction seems absent, we need only report the main effects, with some assurance that each effect holds at either level of the other variate. A more compact notation for describing the treatment combinations is also standard. The presence of a letter a or b denotes one level of the factor in question, while the absence of the letter denotes the other level. Thus, a₂b₂ becomes ab, and a₁b₂ becomes b. The combination a₁b₁ is denoted by the symbol (1). In this notation, table 12.2.1 shows how to compute the main effects and the interaction from the treatment totals over r replications.
TABLE 12.2.1
CALCULATION OF MAIN EFFECTS AND INTERACTION IN A 2² FACTORIAL

                   Multiplier for Treatment Total    Divisor to     Contribution to
Factorial Effect     (1)      a      b      ab       give Mean      Treatments S.S.
A                    -1      +1     -1     +1           2r             [A]²/4r
B                    -1      -1     +1     +1           2r             [B]²/4r
AB                   +1      -1     -1     +1           2r             [AB]²/4r

Thus, the main effect of A is:

[A]/2r = [(ab) - (b) + (a) - (1)]/2r

The quantities [A], [B], [AB] are called factorial effect totals. Use of the same divisor, 2r, for the AB interaction mean is a common convention.
In the analysis of variance, the contribution of the main effect of A to the Treatments S.S. is [A]²/4r, by Rule 11.6.1. Further, note that the three comparisons [A], [B], and [AB] in table 12.2.1 are orthogonal. By Rule 11.6.4, the three contributions in the right-hand column of table 12.2.1 add up to the Treatments S.S.
EXAMPLE 12.2.1-Yates (2) pointed out that the concept of factorial experimentation can be applied to gain accuracy when weighing objects on a balance with two pans. Suppose that two objects are to be weighed and that in any weighing the balance has an error distributed about 0 with variance σ². If the two objects are weighed separately, the balance estimates each weight with variance σ². Instead, both objects are placed in one pan, giving an estimate y₁ of the sum of the weights. Then the objects are placed in different pans, giving an estimate y₂ of the difference between the weights. Show that the quantities (y₁ + y₂)/2 and (y₁ - y₂)/2 give estimates of the individual weights with variance σ²/2.
EXAMPLE 12.2.2-If four objects are to be weighed, show how to conduct four weighings so that the weight of each object is estimated with variance σ²/4. Hint: First weigh the sum of the objects, then refer to table 12.2.1.
12.3-Analysis of the 2² factorial experiment. The case where no interaction appears is illustrated by an experiment (3) on the fluorometric determination of the riboflavin content of dried collard leaves (table 12.3.1). The two factors were A, the size of sample (0.25 gm., 1.00 gm.) from which the determination was made, and B, the effect of the inclusion of a permanganate-peroxide clarification step in the determination. This was a randomized blocks design replicated on three successive days.
The usual analysis of variance into Replications, Treatments, and Error is computed. Then the factorial effect totals for A, B, and AB are calculated from the treatment totals, using the multipliers given in table 12.3.1. Their squares are divided by 4r, or 12, to give the contributions to the Treatments S.S. The P value corresponding to the F ratio 13.02/8.18 for Interaction is about 0.25; we shall assume interaction absent. Consequently, attention can be concentrated on the main effects. The Permanganate step produced a large reduction in the estimated riboflavin concentration. The effect of Sample Size was not quite significant.
TABLE 12.3.1
APPARENT RIBOFLAVIN CONCENTRATION (MCG./GM.) IN COLLARD LEAVES

                 Without Permanganate        With Permanganate
                 0.25 gm.     1.00 gm.      0.25 gm.     1.00 gm.
Replication       Sample       Sample        Sample       Sample      Total

1                  39.5         38.6          27.2         24.6       129.9
2                  43.1         39.5          23.2         24.2       130.0
3                  45.2         33.0          24.8         22.2       125.2

Total             127.8        111.1          75.2         71.0
                   (1)           a             b            ab

                                                 Factorial     Factorial
                    (1)     a      b      ab    Effect Total  Effect Mean    S.E.
Sample Size (A)     -1     +1     -1     +1        -20.9         -3.5
Permanganate (B)    -1     -1     +1     +1        -92.7        -15.4       ±1.65
Interaction (AB)    +1     -1     -1     +1        +12.5         +2.1

Source of Variation   Degrees of Freedom       Sum of Squares         Mean Square      P
Replications                  2                      3.76
Treatments                   (3)                  (765.53)
  Sample size                 1          (-20.9)²/12 =  36.40            36.40        0.08
  Permanganate                1          (-92.7)²/12 = 716.11           716.11       <0.01
  Interaction                 1          (12.5)²/12  =  13.02            13.02        0.25
Error                         6                     49.08                 8.18

Instead of subdividing the Treatments S.S. and making F-tests, one can proceed directly to compute the factorial effect means. These are obtained by dividing the effect totals by 2r, or 6, and are shown in table 12.3.1 beside the effect totals. The standard error of an effect mean is √(s²/3) = √2.73 = 1.65. The t-tests of the effect means are of course the same as the F-tests in the analysis of variance. Use of the effect means has the advantage of showing the magnitude and direction of the effects.
The principal conclusion from this experiment was that "In the fluorometric determination of riboflavin of the standard dried collard sample, the permanganate-hydrogen peroxide clarification step is essential. Without this step, the mean value is 39.8 mcg. per gram, while with it the more reasonable mean of 24.4 is obtained." These data are discussed further in example 12.4.1.
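For readers who like to verify the arithmetic, a short Python sketch of the rules of table 12.2.1 follows, applied to the riboflavin totals above; the function and its name are ours, not from the text.

```python
# Factorial effect totals, means, and S.S. contributions in a 2x2 factorial.
def effects_2x2(t1, ta, tb, tab, r):
    """t1, ta, tb, tab are the treatment totals (1), a, b, ab over r replications."""
    totals = {'A': -t1 + ta - tb + tab,
              'B': -t1 - ta + tb + tab,
              'AB': t1 - ta - tb + tab}
    means = {k: v / (2*r) for k, v in totals.items()}     # divisor 2r
    ss = {k: v**2 / (4*r) for k, v in totals.items()}     # contribution to S.S.
    return totals, means, ss

# Riboflavin totals of table 12.3.1, r = 3 days:
print(effects_2x2(127.8, 111.1, 75.2, 71.0, r=3))
# totals A = -20.9, B = -92.7, AB = 12.5; S.S. 36.40, 716.11, 13.02
```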
EXAMPLE 12.3.1-From table 12.3.1, calculate the means of the four treatment combinations. Then calculate the main effects of A and B, and verify that they are the same as the "Effect Means" shown in table 12.3.1. Verify also that the AB interaction, if calculated by equations (12.2.3) or (12.2.4), is twice the effect mean in table 12.3.1. As already mentioned, the extra divisor 2 in the case of an interaction is a convention.
EXAMPLE 12.3.2-From a randomized blocks experiment on sugar beets in Iowa the numbers of surviving plants per plot were counted as follows:

                            Blocks
Treatments            1      2      3      4     Totals

None                183    176    291    254       904
Superphosphate, P   356    300    301    271      1228
Potash, K           224    258    244    217       943
P + K               329    283    308    326      1246
Totals             1092   1017   1144   1068      4321

(i) Compute the sums of squares for Blocks, Treatments, and Error. Verify that the Treatments S.S. is 24,801, and the mean square for error is 1494.
(ii) Compute the S.S. for P, K, and the PK interaction. Verify that these add to the Treatments S.S. and that the only significant effect is an increase of about 34% in plant number due to P. This result is a surprise, since P does not usually have marked effects on the number of sugar-beet plants.
(iii) Compute the factorial effect means from the individual treatment means with their s.e. √(s²/r), and verify that t-tests of the factorial effect means are identical to the F-tests in the analysis of variance.
EXAMPLE 12.3.3-We have seen how to calculate the factorial effect means (A), (B), and (AB) from the means (ab), (a), (b), and (1) of the individual treatment combinations. The process can be reversed: given the factorial effect means and the mean yield M of the experiment, we can recapture the means of the individual treatment combinations. Show that the equations are:

(ab) = M + ½{ (A) + (B) + (AB)}
(a)  = M + ½{ (A) - (B) - (AB)}
(b)  = M + ½{-(A) + (B) - (AB)}
(1)  = M + ½{-(A) - (B) + (AB)}

12.4-The 2² factorial when interaction is present. When interaction is present, the results of a 2² experiment require more detailed study. If both main effects are large, an interaction that is significant but much smaller than the main effects may imply merely that there is a minor variation in the effect of A according as B is at its higher or lower level, and vice versa. In this event, reporting of the main effects may still be an adequate summary. But in most cases we must revert to a report based on the 2 × 2 table.
Table 12.4.1 contains the results (slightly modified) of a 2² experiment in a completely randomized design. The factors were vitamin B₁₂ (0, 5 mg.) and Antibiotics (0, 40 mg.) fed to swine. A glance at the totals for the four treatment combinations suggests that with no antibiotics, B₁₂ had little or no effect (3.66 versus 3.57), apparently because intestinal flora utilized the B₁₂. With antibiotics present to control the flora, the effect of the vitamin was marked (4.63 versus 3.10). Looking at the table the other way, the antibiotics alone decreased gain (3.10 versus 3.57), perhaps by suppressing intestinal flora that synthesize B₁₂; but with B₁₂ added, the antibiotics produced a gain by decreasing the activities of unfavorable flora.
TABLE 12.4.1
FACTORIAL EXPERIMENT WITH VITAMIN B₁₂ AND ANTIBIOTICS.
AVERAGE DAILY GAIN OF SWINE (POUNDS)

Antibiotics           0                  40 mg.
B₁₂              0       5 mg.       0       5 mg.

               1.30      1.26      1.05      1.52
               1.19      1.21      1.00      1.56
               1.08      1.19      1.05      1.55

                                                    Factorial     Factorial
Totals         3.57      3.66      3.10      4.63  Effect Total  Effect Mean    S.E.
               (1)        a         b         ab
B₁₂             -1        +1        -1        +1       1.62        0.270**
Antibiotics     -1        -1        +1        +1       0.50        0.083*      ±0.035
Interaction     +1        -1        -1        +1       1.44        0.240**

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square
Treatments                    3                0.4124
Error                         8                0.0293          0.00366

The summary of the results of this experiment is therefore presented in the form of a table of the means of the four treatment combinations, as shown below:

Antibiotics          0                40 mg.
B₁₂              0      5 mg.      0      5 mg.
Means          1.19     1.22     1.03     1.54

In the analysis of variance, s² is 0.00366, with 8 d.f. The s.e. of the difference between any two treatment means is √(2s²/3) = ±0.049. You may verify that the decrease due to antibiotics when B₁₂ is absent, and the increases due to each additive when the other is present, are all clearly significant.
If, instead, we begin by calculating the factorial effects, as shown in table 12.4.1, we learn from the factorial effect means that there is a significant interaction at the 1% level (0.240 ± 0.035). This immediately directs attention back to the four individual treatment totals or means, in order to study the nature of the interaction and seek an explanation. The main effects both happen to be significant, but are of no interest.
One way of describing the no-interaction situation is to say that the effects of the two factors are additive. To illustrate, suppose that the population mean for the (1) combination (neither factor present) is μ. Factor A, when present alone, changes the mean to (μ + α); Factor B, when present alone, to (μ + β). If both factors are present, and if their effects are additive, the mean will become μ + α + β.
With this model, the interaction effect is

(AB) = ½[(ab) + (1) - (a) - (b)] = ½[(μ + α + β) + μ - (μ + α) - (μ + β)] = 0
Presence of an interaction denotes that the effects are not additive.
With quantitative factors, this concept leads to two other possible explanations of an interaction found in an experiment. Sometimes their effects are additive, but on a transformed scale. The simplest example is that of multiplicative effects, in which a log transformation of the data before analysis (section 11.17) removes the interaction.
Secondly, if X₁, X₂ represent the amounts of two factors in a treatment combination, it is natural to summarize the results by means of a response function or response surface, which predicts how the response Y varies as X₁ and X₂ are changed. If the effects are additive, the response function has the simple form

Y = β₀ + β₁X₁ + β₂X₂

A significant interaction is a warning that this model is not an adequate fit. The interaction effect may be shown to represent a term of the form β₁₂X₁X₂ in the response function. The presence of a term in X₁X₂ in the response function suggests that terms in X₁² and X₂² may also be needed to represent the function adequately. In other words, the investigator may require a quadratic response function. Since at least three levels of each variable are required to fit a quadratic surface, he may have to plan a larger factorial experiment.
EXAMPLE 12.4.1-Our use of the riboflavin data in section 12.3 as an example with no interaction might be criticized on two grounds: (1) a P value of 0.25 in the test for interaction in a small experiment suggests the possibility of an interaction that a larger experiment might reveal; (2) perhaps the effects are multiplicative in these data. If you analyze the logs of the data in table 12.3.1, you will find that the F-value for interaction is now only 0.7. Thus the assumption of zero interaction seems better grounded on a log scale than on the original scale.

12.5-The general two-factor experiment. Leaving the special case of two levels per factor, we now consider the general arrangement with a levels of the first factor and b levels of the second. As before, the layout of the experiment may be completely randomized, randomized blocks, or any other standard plan.
With a levels, the main effects of A in the analysis of variance now have (a - 1) d.f., while those of B have (b - 1) d.f. Since there are ab treatment combinations, the Treatments S.S. has (ab - 1) d.f. Consequently, there remain

(ab - 1) - (a - 1) - (b - 1) = ab - a - b + 1 = (a - 1)(b - 1)

d.f., which may be shown to represent the AB interactions. In the 2 × 2
factorial, in which the AB interaction had only one d.f., the comparison corresponding to this d.f. was called the AB interaction. In the general case, the AB interaction represents a set of (a - 1)(b - 1) independent comparisons. These can be subdivided into single comparisons in many ways.
In deciding how to subdivide the AB sum of squares, the investigator is guided by the questions that he had in mind when planning the experiment. Any comparison among the levels of A is estimated independently at each of the b levels of B. For a comparison that is of particular interest, the investigator may wish to examine whether the level of B affects these estimates. The sum of squares of deviations of the estimates, with the appropriate divisor, is a component of the AB interaction, with (b - 1) d.f., which may be isolated and tested against the Error mean square. Incidentally, since the main effect of A represents (a - 1) independent comparisons, these components of the AB interaction jointly account for (a - 1)(b - 1) d.f. and will be found to sum to the sum of squares for AB.
As an illustration, the data in table 12.5.1 show the gains in weight of male rats under six feeding treatments in a completely randomized experiment. The factors were:
A (3 levels): Source of protein: Beef, Cereal, Pork
B (2 levels): Level of protein: High, Low
TABLE 12.5.1
GAINS IN WEIGHT (GRAMS) OF RATS UNDER SIX DIETS

          High Protein                Low Protein
  Beef    Cereal    Pork       Beef    Cereal    Pork

    73       98       94         90      107       49
   102       74       79         76       95       82
   118       56       96         90       97       73
   104      111       98         64       80       86
    81       95      102         86       98       81
   107       88      102         51       74       97
   100       82      108         72       74      106
    87       77       91         90       67       70
   117       86      120         95       89       61
   111       92      105         78       58       82

Totals
 1,000      859      995        792      839      787

Source of Variation       Degrees of Freedom   Sum of Squares   Mean Square      F
Treatments                        5               4,613.0
  A (Source of protein)           2                 266.5          133.2        0.6
  B (Level of protein)            1               3,168.3        3,168.3       14.8**
  AB                              2               1,178.2          589.1        2.7
Error                            54              11,585.7          214.6
Often the investigator has decided in advance how to subdivide the comparisons that represent main effects and interactions. In more exploratory situations, it is customary to start with a breakdown of the Treatments S.S. into the S.S. for A, B, and AB. This has been done in table 12.5.1. Looking at the main effects of A, the three sources of protein show no differences in average rates of gain (F = 0.6), but there is a clear effect of level of protein (F = 14.8), the gain being about 18% larger with the High level.
For AB, the value of F is 2.7, between the 10% and the 5% levels. In the general two-factor experiment and in more complex factorials, it often happens that a few of the comparisons comprising the main effects have substantial interactions while the majority of the comparisons have negligible interactions. Consequently, the F-test of the AB interaction sum of squares as a whole is not a good guide as to whether interactions can be ignored. It is well to look over the two-way table of treatment totals or means before concluding that there are no interactions, particularly if F is larger than 1.
Another working rule, tested by experience in a number of areas, is that large main effects are more likely to have interactions than small ones. Consequently, we look particularly at the effects of B, Level of protein. From the treatment totals in table 12.5.1 we see that high protein gives large gains over low protein for beef and pork, but only a small gain for cereal. This suggests a breakdown into: (1) Cereal versus the average of Beef and Pork, and (2) Beef versus Pork. This subdivision is a natural one, since Beef and Pork are animal sources of protein while Cereal is a vegetable source, and would probably be planned from the beginning in this type of experiment.
Table 12.5.2 shows how this breakdown is made by means of five single comparisons. Study the coefficients for each comparison carefully, and verify that the comparisons are mutually orthogonal. In the lower part of the table the divisors required to convert the squares of the factorial effect totals into sums of squares in the analysis of variance are given. Each divisor is n times the sum of squares of the coefficients in the comparison (n = 10). As anticipated, the interaction of the animal versus vegetable comparison with level of protein is significant at the 5% level. There is no sign of a difference between Beef and Pork at either level.
The principal results can therefore be summarized in the following 2 × 2 table of means.

Mean Rat Gains in Weight per Week (Grams)

Level of Source of Protein


Protein Animal Vegetable Difference S.E.

High 99.8 85.9 + 13.9- ±5.67


Low 79.0 83.9 - 4.9 ±S.67

Difference +20.8" + 2.0


S.E. ±"4.6 ± 6,5
TABLE 12.5.2
SUBDIVISION OF THE S.S. FOR MAIN EFFECTS AND INTERACTIONS

                             High Protein           Low Protein       Factorial
Comparisons               Beef  Cereal  Pork     Beef  Cereal  Pork     Effect
(Treatment Totals)        1000    859    995      792    839    787     Total

Level of protein           +1     +1     +1       -1     -1     -1       436
Animal vs. vegetable       +1     -2     +1       +1     -2     +1       178
  Interaction with level   +1     -2     +1       -1     +2     -1       376
Beef vs. pork              +1      0     -1       +1      0     -1        10
  Interaction with level   +1      0     -1       -1      0     +1         0

                            Divisor     Degrees of     Sum of      Mean
Comparison                  for S.S.     Freedom       Squares     Square

Level of protein               60           1          3168.3**
Animal vs. vegetable          120           1           264.0
  Interaction with level      120           1          1178.1*
Beef vs. pork                  40           1             2.5
  Interaction with level       40           1             0.0
Error                                      54                      214.6

As a consequence of the interaction, the animal proteins gave substantially greater gains in weight than cereal protein at the high level, but showed no superiority to cereal protein at the low level.
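The five comparisons of table 12.5.2 can be checked mechanically. The Python sketch below (layout and names ours; n = 10 rats per treatment total) recomputes each factorial effect total and its sum of squares.

```python
# Orthogonal comparisons among the six treatment totals of table 12.5.2.
totals = [1000, 859, 995, 792, 839, 787]    # HB, HC, HP, LB, LC, LP
comparisons = {
    'Level of protein':      [ 1,  1,  1, -1, -1, -1],
    'Animal vs. vegetable':  [ 1, -2,  1,  1, -2,  1],
    'A vs. V x level':       [ 1, -2,  1, -1,  2, -1],
    'Beef vs. pork':         [ 1,  0, -1,  1,  0, -1],
    'B vs. P x level':       [ 1,  0, -1, -1,  0,  1],
}
n = 10
for name, lam in comparisons.items():
    L = sum(l*t for l, t in zip(lam, totals))       # factorial effect total
    ss = L**2 / (n * sum(l*l for l in lam))         # divisor = n * sum(lambda^2)
    print(f'{name:22s}  total {L:4d}   S.S. {ss:7.1f}')
# 436, 3168.3; 178, 264.0; 376, 1178.1; 10, 2.5; 0, 0.0
```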
12.6-Response Curves. Frequently, the levels of a factor represent increasing amounts X of some substance. It may then be of interest to examine whether the response Y to the factor has a linear relation to the amount X. An example has already been given in section 11.8, p. 313, in which the linear regression of yield of millet on width of spacing of the rows was worked out for a Latin square experiment. If the relation between Y and X is curved, a more complex mathematical expression is required to describe it. Sometimes the form of this expression is suggested by subject-matter knowledge. Failing this, a polynomial in X is often used as a descriptive equation.
With equally spaced levels of X, auxiliary tables are available that facilitate the fitting of these polynomials. The tables are explained fully in section 15.6 (p. 460). An introduction is given here to enable them to be used in the analysis of factorial experiments. The tables are based essentially on an ingenious coding of the values of X, X², and so on.
With three levels, the values of X are coded as -1, 0, +1, so that they sum to 0. If Y₁, Y₂, Y₃ are the corresponding response totals over n replicates, the linear regression coefficient b₁ is ΣXY/nΣX², or (Y₃ - Y₁)/2n. The values of X² are 1, 0, 1. Subtracting their mean 2/3 so that they add to 0 gives 1/3, -2/3, 1/3. Multiplying by 3 in order to have whole numbers, we get the coefficients 1, -2, 1. In its coded form, this variable is X₂ = 3X² - 2. The regression coefficient of Y on X₂ is b₂ = ΣX₂Y/nΣX₂², or (Y₃ - 2Y₂ + Y₁)/6n. The equation for the parabola fitted to the level means of Y is

Ŷ = Ȳ + b₁X + b₂X₂     (12.6.1)
With four levels of X, they are coded -3, -1, +1, +3, so that they are whole numbers adding to 0. The values of X² are 9, 1, 1, 9, with mean 5. Subtracting the mean gives +4, -4, -4, +4, which we divide by 4 to give the coefficients +1, -1, -1, +1 for the parabolic component. These components represent the variable X₂ = (X² - 5)/4. The fitted parabola has the same form as (12.6.1), where

b₁ = (3Y₄ + Y₃ - Y₂ - 3Y₁)/20n ;  b₂ = (Y₄ - Y₃ - Y₂ + Y₁)/4n,

the Yᵢ being level totals. For the cubic component (term involving X³) a more elaborate coding is required to make this orthogonal to X and X₂. The resulting coefficients are -1, +3, -3, +1.
By means of these polynomial components, the S.S. for the main effects of the factor can be subdivided into linear, quadratic, cubic components, and so on. Each S.S. can be tested against the Error mean square as a guide to the type of polynomial that describes the response curve. By rule 11.6.1, the contribution of any component Σλᵢ Yᵢ to the S.S. is (Σλᵢ Yᵢ)²/nΣλᵢ². If the component is computed from the level means, as in the following illustration, the divisor is (Σλ²)/n.
Table 12.6.1 presents the mean yields of sugar (cwt. per acre) in an
experiment (4) on beet sugar in which a mixture of fertilizers was applied
at four levels (0, 4, 8, 12 cwt. per acre).
TABLE 12.6.1
LINEAR, QUADRATIC, AND CUBIC COMPONENTS OF RESPONSE CURVE

               Mixed Fertilizers (Cwt. per Acre)
                  0      4      8     12                   Sum of
Mean Yields     34.8   41.1   42.6   41.8   Component     Squares       F

Linear           -3     -1     +1     +3      +22.5         202.5     17.0**
Quadratic        +1     -1     -1     +1      - 7.1         100.8      8.5**
Cubic            -1     +3     -3     +1      + 2.5           2.5      0.2

Total = Sum of Squares for Fertilizers = 305.8

Error mean square (16 d.f.) = 11.9
Since each mean was taken over n = 8 replicates, the divisors are 20/8 = 2.5 for the linear and cubic components and 4/8 = 0.5 for the quadratic component. The Error mean square was 11.9 with 16 d.f. The positive linear component and the negative quadratic component are both significant, but the cubic term gives an F less than 1. The conclusions are: (i) mixed fertilizers produced an increase in the yield of sugar, (ii) the rate of increase fell off with the higher levels.
To fit the parabola, we compute from table 12.6.1,

Ȳ = 40.08   b₁ = +22.5/20 = 1.125   b₂ = -7.1/4 = -1.775

The fitted parabola is therefore

Ŷ = 40.08 + 1.125X - 1.775X₂,     (12.6.2)

where Ŷ is an estimated mean yield. The estimated yields for 0, 4, 8, 12 cwt. of fertilizers are 34.93, 40.73, 42.98, 41.68 cwt. per acre. Like the observed means, the parabola suggests that the dressing for maximum yield is around 8 cwt. per acre.
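A short Python sketch (ours, not from the text) reproduces the breakdown of table 12.6.1 from the level means, using the four-level coefficient sets of table 12.6.2.

```python
# Polynomial components for the sugar yields (means over n = 8 replicates).
means = [34.8, 41.1, 42.6, 41.8]
n = 8
poly = {'linear':    [-3, -1,  1, 3],
        'quadratic': [ 1, -1, -1, 1],
        'cubic':     [-1,  3, -3, 1]}
for name, lam in poly.items():
    comp = sum(l*m for l, m in zip(lam, means))     # component from the means
    ss = comp**2 / (sum(l*l for l in lam) / n)      # divisor is (sum lambda^2)/n
    print(name, round(comp, 1), round(ss, 1))
# linear 22.5, 202.5; quadratic -7.1, 100.8; cubic 2.5, 2.5
```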
Table 12.6.2 shows the coefficients for the polynomial components and the values of Σλ² for factors having from 2 to 7 levels. With k levels a polynomial of degree (k - 1) can be made to fit the k responses exactly.

TABLE 12.6.2
COEFFICIENTS AND DIVISORS FOR SETS OF ORTHOGONAL COMPONENTS IN REGRESSION
IF X IS SPACED AT EQUAL INTERVALS

Degree of                        Number of Levels                     Divisor
Polynomial   Comparison     1    2    3    4    5    6    7            Σλ²

1            Linear        -1   +1                                       2

2            Linear        -1    0   +1                                  2
             Quadratic     +1   -2   +1                                  6

3            Linear        -3   -1   +1   +3                            20
             Quadratic     +1   -1   -1   +1                             4
             Cubic         -1   +3   -3   +1                            20

4            Linear        -2   -1    0   +1   +2                       10
             Quadratic     +2   -1   -2   -1   +2                       14
             Cubic         -1   +2    0   -2   +1                       10
             Quartic       +1   -4   +6   -4   +1                       70

5            Linear        -5   -3   -1   +1   +3   +5                  70
             Quadratic     +5   -1   -4   -4   -1   +5                  84
             Cubic         -5   +7   +4   -4   -7   +5                 180
             Quartic       +1   -3   +2   +2   -3   +1                  28
             Quintic       -1   +5  -10  +10   -5   +1                 252

6            Linear        -3   -2   -1    0   +1   +2   +3             28
             Quadratic     +5    0   -3   -4   -3    0   +5             84
             Cubic         -1   +1   +1    0   -1   -1   +1              6
             Quartic       +3   -7   +1   +6   +1   -7   +3            154
             Quintic       -1   +4   -5    0   +5   -4   +1             84
             Sextic        +1   -6  +15  -20  +15   -6   +1            924

EXAMPLE 12.6.1-In the same sugar-beet experiment, the mean yields of tops (green matter) for 0, 4, 8, 12 cwt. fertilizers were 9.86, 11.58, 13.95, 14.95 cwt. per acre. The Error mean square was 0.909. Show that: (i) only the linear component is significant, there being no apparent decline in response to the higher applications; (ii) the S.S. for the linear, quadratic, and cubic components sum to the S.S. between levels, 127.14 with 3 d.f. Remember that the means are over 8 replicates.
EXAMPLE 12.6.2-From the results for the parabolic regression on yield of sugar, the estimated optimum dressing can be computed by calculus. From equation 12.6.2 the fitted parabola is

Ŷ = 40.08 + 1.125X - 1.775X₂

where X₂ = (X² - 5)/4. Thus

Ŷ = 40.08 + 1.125X - 0.444(X² - 5)

Differentiating, we find a turning value at X = 1.125/0.888 = 1.27 on the coded scale. You may verify that the estimated maximum sugar yield is 43.0 cwt., for a dressing of 8.5 cwt. fertilizer.
12.7-Response curves in two-factor experiments. Either or both factors may be quantitative and may call for the fitting of a regression as described in the previous section. As an example with one quantitative
TABLE 12.7.1
YIELD OF COWPEA HAY (POUNDS PER 1/100 MORGEN PLOT) FROM THREE VARIETIES

                                    Blocks
Varieties   Spacing (In.)      1      2      3      4      Sum

I                 4           56     45     43     46      190
                  8           60     50     45     48      203
                 12           66     57     50     50      223

II                4           65     61     60     63      249
                  8           60     58     56     60      234
                 12           53     53     48     55      209

III               4           60     61     50     53      224
                  8           62     68     67     60      257
                 12           73     77     77     65      292

Sum                          555    530    496    500    2,081

                      Spacings
Varieties        4      8     12      Sum

I              190    203    223      616
II             249    234    209      692
III            224    257    292      773

Sum            663    694    724    2,081

                    Degrees of Freedom   Sum of Squares   Mean Square
Blocks                      3                255.64
Varieties, V                2              1,027.39         513.70**
Spacings, S                 2                155.06          77.53*
Interactions, VS            4                765.44         191.36**
Error                      24                424.11          17.67
factor, table 12.7.1 shows the yields in a 3 × 3 factorial on hay (5), one factor being three widths of spacing of the rows, the other being three varieties.
The original analysis of variance, at the foot of table 12.7.1, reveals marked VS (variety × spacing) interactions. The table of treatment combination totals immediately above shows that there is an upward trend in yield with wider spacing for varieties I and III but an opposite trend with variety II. This presumably accounts for the large VS mean square and warns that no useful overall statements can be made from the main effects.
To examine the trends of yield Y on spacing X, the linear and quadratic components are calculated for each variety, table 12.7.2. The factorial effect totals for these components are computed first, then the corresponding sums of squares. Note the following results from table 12.7.2:
(i) As anticipated, the linear slopes are positive for varieties I and III and negative for variety II.
(ii) The linear trend for each variety is significant at the 1% level, while no variety shows any sign of curvature, when tested against the Error mean square of 17.67.
TABLE 12.7.2
LINEAR AND QUADRATIC COMPONENTS FOR EACH VARIETY IN COWPEA EXPERIMENT

                      4"     8"    12"     Totals for Components
Linear                -1      0     +1
Quadratic             +1     -2     +1      Linear    Quadratic

Variety I            190    203    223        33          7
Variety II           249    234    209       -40        -10
Variety III          224    257    292        68          2

Sum                  663    694    724        61         -1

Contributions to Sums of Squares

Variety I:    Linear, (33)²/(4)(2) = 136.12**     Quadratic, (7)²/(4)(6) = 2.04

        II:   (-40)²/(4)(2) = 200.00**            (-10)²/(4)(6) = 4.17

       III:   (68)²/(4)(2) = 578.00**             (2)²/(4)(6) = 0.17

Total = 920.50

Verification: 914.12 + 6.38 = 155.06 + 765.44 (= S + SV), d.f. = 6


(iii) The sum of these six S.S. is identical with the S.S. for spacings
and interactions combined, 920.50.
(iv) If the upward trends for varieties I and III are compared, the
trend for variety III will be found significantly greater.
To summarize, the varieties have linear trends on spacing which are
not the same. Apparently I and III have heavy vegetative growth which
requires more than 12" spacing for maximum yield. In a further experi-
ment the spacings tested for varieties I and III should differ from those
for II.
EXAMPLE 12.7.1-In the variety × spacing experiment, verify the statement that the linear regression of yield on width of spacing is significantly greater for variety III than for variety I.
EXAMPLE 12.7.2-If the primary interest in this experiment were in comparing the varieties when each has its highest-yielding spacing, we might compare the totals 223 (I), 249 (II), and 292 (III). Show that the optimum for III exceeds the others at the 1% level.

12.8-Example of a response surface. We turn now to a 3 × 4 experiment in which there is regression in each factor. The data are from the Foods and Nutrition Section of the Iowa Agricultural Experiment Station (6). The object was to learn about losses of ascorbic acid in snapbeans stored at 3 temperatures for 4 periods, each 2 weeks longer than the preceding. The beans were all harvested under uniform conditions before eight o'clock one morning. They were prepared and quick-frozen before noon of the same day. Three packages were assigned at random to each of the 12 treatments and all packages were stored at random positions in the locker, a completely randomized design.
The sums of 3 ascorbic acid determinations are recorded in table 12.8.1. It is clear that the concentration of ascorbic acid decreases with
TABLE 12.8.1
SUM OF THREE ASCORBIC ACID DETERMINATIONS (MG./100 G.) FOR EACH OF 12 TREATMENTS
IN A 3 × 4 FACTORIAL EXPERIMENT ON SNAPBEANS

                        Weeks of Storage
Temperature, °F.      2      4      6      8      Sum

 0                   45     47     46     46      184
10                   45     43     41     37      166
20                   34     28     21     16       99

Sum                 124    118    108     99      449

                       Degrees of Freedom   Sum of Squares   Mean Square
Temperature, T                 2               334.39
Two-week Period, P             3                40.53
Interaction, TP                6                34.05
Error*                        24                                0.706

* Error (packages of same treatment) was calculated from original data not recorded here.
higher storage temperatures and, except at 0°, with storage time. It looks as if the rate of decrease with temperature is not linear and not the same for the several storage periods. These conclusions, suggested by inspection of table 12.8.1, will be tested in the following analysis.
One can look first at either temperature or period; we chose temperature. At each period the linear and quadratic temperature comparisons (-1, 0, +1; +1, -2, +1) are calculated:

Weeks of Storage       2      4      6      8     Total

Linear, TL           -11    -19    -25    -30      -85
Quadratic, TQ        -11    -11    -15    -12      -49

The downward slopes of the linear regressions get steeper with time. This will be examined later. At present, calculate sums of squares as follows:

TL = (-85)²/(12)(2) = 301.04**

TQ = (-49)²/(12)(6) = 33.35**

The sum is the sum of squares for T, 301.04 + 33.35 = 334.39. Significance of each effect is tested by comparison with the Error mean square, 0.706. Evidently the regressions are curved, the parabolic comparison being significant; quality decreases with accelerated rapidity as the temperature increases. (Note the number of replications in each temperature total: 4 periods times 3 packages = 12.)
total, 4 periods times 3 packages = 12.)
Are the regressions the same for all periods? To answer this, calculate the interactions of the linear and the quadratic comparisons with period. The sums of squares for these interactions are:

TLP: [(-11)² + ... + (-30)²]/(3)(2) - TL = 33.46**   (3 d.f.)

TQP: [(-11)² + ... + (-12)²]/(3)(6) - TQ = 0.59   (3 d.f.)
Rule 12.8.1. These calculations follow from a new rule. If a comparison Lᵢ has been computed for k different levels of a second factor, the interaction S.S. of this comparison with the second factor is

ΣLᵢ²/n(Σλ²) - (ΣLᵢ)²/kn(Σλ²)     (i = 1, 2, ..., k)

with (k - 1) d.f. Further, the term (ΣLᵢ)²/kn(Σλ²) is the overall S.S. (1 d.f.) for this comparison. The sum of TLP and TQP is equal to the sum of squares for TP. The linear regressions decrease significantly with period (length of storage) but the quadratic terms may be the same for all periods, since the mean square for TQP, 0.59/3 = 0.20, is smaller than the Error mean square.
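A numeric check of Rule 12.8.1, written as a Python sketch (our own names), recovers both the overall S.S. for TL and the TLP interaction S.S. from the four period values of TL.

```python
# Rule 12.8.1 applied to the linear temperature comparison; n = 3 packages,
# sum(lambda^2) = 2 for the (-1, 0, +1) coding, k = 4 periods.
L = [-11, -19, -25, -30]
n, sum_lam2, k = 3, 2, 4
overall = sum(L)**2 / (k * n * sum_lam2)                   # S.S. for TL
interaction = sum(x*x for x in L) / (n * sum_lam2) - overall
print(round(overall, 2), round(interaction, 2))            # 301.04, 33.46 (3 d.f.)
```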
Turning to the sums for the 4 periods, calculate the 3 comparisons:

                                                        Sum of
Sums             124    118    108     99   Comparison  Squares

Linear, PL        -3     -1     +1     +3      -85       40.14**
Quadratic, PQ     +1     -1     -1     +1       -3        0.25
Cubic, PC         -1     +3     -3     +1        5        0.14

Sum = Sum of Squares for Periods = 40.53

This indicates that the population regressions on period may be linear, the mean squares 0.25 for PQ and 0.14 for PC being both less than 0.706, the Error mean square.
We come now to the new feature of this section, the regressions of TL and TQ on period. TL, the downward slope of the vitamin with temperature, has been calculated for each period; the question is, in what manner does TL change with period?
For this question, we can work out the linear, quadratic, and cubic components of the regression of TL on period, just as was done above for the sums over the 4 periods.

                                                                 Sum of
                   -11    -19    -25    -30  Comparison  Divisor Squares

Linear, TLPL        -3     -1     +1     +3     -63    (3)(2)(20)  33.08**
Quadratic, TLPQ     +1     -1     -1     +1       3    (3)(2)(4)    0.38
Cubic, TLPC         -1     +3     -3     +1      -1    (3)(2)(20)   0.01

Sum = Sum of Squares for TLP = 33.47

Rule 12.8.2. Note the rule for finding the divisors. For each individual TL (-11, -19, etc.) the divisor was (2)(3). We now have a comparison among these TL's, bringing in a further factor 20 = 3² + 1² + 1² + 3² in TLPL. Thus the S.S. 33.08 = (-63)²/120. The sum of the three regression sums of squares is 33.47, which equals TLP. From the tests of the linear, quadratic, and cubic components, we conclude that the linear regression on temperature decreases linearly with length of storage.
Proceeding in the same way with TQ:

TQPL = (-7)²/(3)(6)(20) = 0.14

TQPQ = (3)²/(3)(6)(4) = 0.12

TQPC = (11)²/(3)(6)(20) = 0.34
TABLE 12.8.2
ANALYSIS OF VARIANCE OF ASCORBIC ACID IN SNAPBEANS

Source of Variation   Degrees of Freedom   Sum of Squares   Mean Square

Temperature:                 (2)              (334.39)
  TL                          1                301.04**
  TQ                          1                 33.35**
Period:                      (3)               (40.53)
  PL                          1                 40.14**
  PQ                          1                  0.25
  PC                          1                  0.14
Interaction:                 (6)               (34.05)
  TLPL                        1                 33.08**
  TLPQ                        1                  0.38
  TLPC                        1                  0.01
  TQPL                        1                  0.14
  TQPQ                        1                  0.12
  TQPC                        1                  0.34
Error                        24                                 0.706
The sum is TQP = 0.60. Clearly there is no change in TQ with period. The results are collected in table 12.8.2.
In summary, TL and TQ show that the relation of ascorbic acid to temperature is parabolic, the rate of decline increasing as storage time lengthens (TLPL). The regression on period is linear, sloping downward more rapidly as temperature increases. In fact, you will note in table 12.8.1 that at the coldest temperature, 0°F., there is no decline in amount of ascorbic acid with additional weeks of storage.
These results can be expressed as a mathematical relation between ascorbic acid Y, storage temperature T, and weeks of storage W. As we have seen, we require terms in TL, TQ, PL, and TLPL in order to describe the relation adequately. It is helpful to write down these polynomial coefficients for each of the 12 treatment combinations, as shown in table 12.8.3.
For the moment, think of the mathematical relation as having the form

Ŷ = b₀ + b₁X₁ + b₂X₂ + b₃X₃ + b₄X₄

where Ŷ is the predicted ascorbic acid total over 3 replications, while X₁ = TL, X₂ = TQ, X₃ = PL, and X₄ = TLPL. The regression coefficient bᵢ = ΣXᵢY/ΣXᵢ². The quantities ΣXᵢY, which were all obtained in the earlier analysis, are given at the foot of table 12.8.3, as well as the divisors ΣXᵢ². Hence, the relation is as follows:

Ŷ = 37.417 - 10.625X₁ - 2.042X₂ - 1.417X₃ - 1.575X₄     (12.8.1)

Since the values of the Xᵢ are given in table 12.8.3, the predicted values Ŷ are easily computed for each treatment combination. For example, for 0°F. and 2 weeks storage,
TABLE 12.8.3
CALCULATION OF THE RESPONSE SURFACE

                     Y        TL =        TQ =       PL =       TLPL =
Temp.   Weeks     Totals   0.1(T-10)   3TL² - 2     W - 5   0.1(T-10)(W-5)      Ŷ

 0°       2         45        -1          +1         -3          +3          45.53
          4         47        -1          +1         -1          +1          45.84
          6         46        -1          +1         +1          -1          46.16
          8         46        -1          +1         +3          -3          46.47
10°       2         45         0          -2         -3           0          45.75
          4         43         0          -2         -1           0          42.92
          6         41         0          -2         +1           0          40.08
          8         37         0          -2         +3           0          37.25
20°       2         34        +1          +1         -3          -3          33.73
          4         28        +1          +1         -1          -1          27.74
          6         21        +1          +1         +1          +1          21.76
          8         16        +1          +1         +3          +3          15.77

ΣXᵢY               449       -85         -49         -85         -63
Divisor for b       12         8          24          60          40

Ŷ = 37.417 - (10.625)(-1) - 2.042(+1) - 1.417(-3) - (1.575)(+3) = 45.53,

as shown in the right-hand column of table 12.8.3.
By decoding, we can express the prediction equation (12.8.1) in terms of T (°F.) and W (weeks). You may verify that the relations between X₁ (TL), X₂ (TQ), X₃ (PL), X₄ (TLPL) and T and W are as given at the top of table 12.8.3. After making these substitutions and dividing by 3 so that the prediction refers to the ascorbic acid mean per treatment combination, we have

Ŷ = 15.070 + 0.3167T - 0.02042T² + 0.0525W - 0.0525TW     (12.8.2)
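Equation 12.8.2 is easily tabulated. The Python sketch below (the function name is ours) evaluates the fitted surface at the 12 treatment combinations; the fitted totals of table 12.8.3, divided by 3, give the same values.

```python
# Predicted ascorbic acid means from equation 12.8.2 (T in deg F, W in weeks).
def y_hat(T, W):
    return 15.070 + 0.3167*T - 0.02042*T**2 + 0.0525*W - 0.0525*T*W

for T in (0, 10, 20):
    print(T, [round(y_hat(T, W), 2) for W in (2, 4, 6, 8)])
# e.g. y_hat(0, 2) = 15.18 = 45.53/3; y_hat(20, 8) = 5.26 = 15.77/3
```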

Geometrically, a relation of this type is called a response surface, since we now have a relation in three dimensions Y, T, and W. With quantitative factors, the summarization of the results by a response surface has proved highly useful, particularly in industrial research. If the objective of the research is to maximize Y, the equation shows the combinations of levels of the factors that give responses close to the maximum. Further accounts of this technique, with experimental plans specifically constructed for fitting response surfaces, are given in (7) and (8). The analysis in this example is based on (6).
A word of warning. In the example we fitted a multiple regression of Y on four variables X₁, X₂, X₃, X₄. The methods by which the regression coefficients bᵢ were computed apply only if the Xᵢ are mutually orthogonal, as was the case here. General methods are presented in chapter 13.
12.9-Three-factor experiments; the 2³. The experimenter often requires evidence about the effects of 3 or more factors in a common environment. The simplest arrangement is that of 3 factors each at 2 levels, the 2 × 2 × 2 or 2³ experiment. The eight treatment combinations may be tried in any of the common experimental designs.
The data in table 12.9.1 are extracted from an unpublished randomized blocks experiment (9) to learn the effect of two supplements to a corn ration for feeding pigs. The factors were as follows:
Lysine (L): 0 and 0.6%.
Soybean meal (P): Amounts added to supply 12% and 14% protein.
Sex (S): Male and Female.
TABLE 12.9.1
AVERAGE DAILY GAINS OF PIGS IN 2³ FACTORIAL ARRANGEMENT OF TREATMENTS,
RANDOMIZED BLOCKS EXPERIMENT

Ly-    Pro-                   Replications (Blocks)                   Treat-    Sum
sine   tein                                                           ment      for 2
%      %     Sex      1     2     3     4     5     6     7     8     Sum       Sexes

0      12    M      1.11  0.97  1.09  0.99  0.85  1.21  1.29  0.96    8.47
             F      1.03  0.97  0.99  0.99  0.99  1.21  1.19  1.24    8.61     17.08
       14    M      1.52  1.45  1.27  1.22  1.67  1.24  1.34  1.32   11.03
             F      1.48  1.22  1.53  1.19  1.16  1.57  1.13  1.43   10.71     21.74
0.6    12    M      1.22  1.13  1.34  1.41  1.34  1.19  1.25  1.32   10.20
             F      0.87  1.00  1.16  1.29  1.00  1.14  1.36  1.32    9.14     19.34
       14    M      1.38  1.08  1.40  1.21  1.46  1.39  1.17  1.21   10.30
             F      1.09  1.09  1.47  1.43  1.24  1.17  1.01  1.13    9.63     19.93

Replication Sum    9.70  8.91 10.25  9.73  9.71 10.12  9.74  9.93              78.09

                Degrees of Freedom   Sum of Squares   Mean Square
Replications            7               0.1411
Treatments              7               0.7986          0.1141**
Error                  49               1.0994          0.0224

With three factors there are three main effects, L, P, and S; three two-factor interactions, SP, SL, and LP; and a three-factor interaction SLP. The comparisons representing the factorial effect totals are set out in table 12.9.2. The coefficients for the main effects and the two-factor interactions should present no difficulty, these being the same as in a 2² factorial. A useful rule in the 2ⁿ series is that the coefficients for any two-factor interaction like SP are the products of the corresponding coefficients for the main effects S and P.
The new term is the three-factor interaction SLP. From table 12.9.2 the SP interaction (apart from its divisor) can be estimated at the higher level of L as

10.20 - 9.14 - 10.30 + 9.63 = +0.39

TABLE 12.9.2
SEVEN COMPARISONS IN 2³ FACTORIAL EXPERIMENT ON PIGS

                       Lysine = 0              Lysine = 0.6%
                  P = 12%     P = 14%      P = 12%     P = 14%
                   M     F     M     F      M     F     M     F     Factorial
                                                                    Effect    Sum of
Effects          8.47  8.61 11.03 10.71  10.20  9.14 10.30  9.63    Total     Squares

Sex, S            -1    +1    -1    +1     -1    +1    -1    +1     -1.91     0.0570
Protein, P        -1    -1    +1    +1     -1    -1    +1    +1      5.25     0.4307**
SP                +1    -1    -1    +1     +1    -1    -1    +1     -0.07     0.0001
Lysine, L         -1    -1    -1    -1     +1    +1    +1    +1      0.45     0.0032
SL                +1    -1    +1    -1     -1    +1    -1    +1     -1.55     0.0375
PL                +1    +1    -1    -1     -1    -1    +1    +1     -4.07     0.2588**
SPL               -1    +1    +1    -1     +1    -1    -1    +1      0.85     0.0113

Total                                                                         0.7986

An independent estimate at the lower level of L is

8.47 - 8.61 - 11.03 + 10.71 = -0.46

The sum of these two quantities, -0.07, is the factorial effect total for SP. Their difference, +0.39 - (-0.46) = +0.85, measures the effect of the level of L on the SP interaction. If we compute in the same way the effect of P on the SL interaction, or of S on the PL interaction, the quantity +0.85 is again obtained. It is called the factorial effect total for SLP. Such interactions are rather difficult to grasp. Fortunately, they are often negligible except in experiments that have large main effects. A significant three-factor interaction is a sign that the corresponding 3-way table of means must be examined in the interpretation of the results.
As usual, the square of each factorial effect total is divided by n(Σλ²), where n = 8 and Σλ² = 8, the denominator being 64 in every case. As a check, the total of the sums of squares for the factorial effects in table 12.9.2 must add to the Treatments sum of squares in table 12.9.1, 0.7986.
The only significant effects are the main effect of P and the PL interaction. The totals for the P × L 2-way table are shown in the right-hand column of table 12.9.1. With no added lysine, the higher level of protein gave a substantially greater daily gain than the lower level, but with added lysine, this gain was quite small. The result is not surprising, since soybean meal contains lysine. Lysine increased the rate of gain at the lower level of protein but decreased it at the higher level.
In view of these results there is no interest in the main effects of P or of L. The experimenter has learned that gains can be increased either by a heavier addition of soybean meal or by the addition of lysine, whichever is more profitable; he should not add both. The absence of any interactions involving S gives some assurance that these results hold for both males and females.
The 2ⁿ factorial experiment has proved a potent research weapon in
many fields. For further instruction on analysis, with examples, see (7),
(8), and (10).
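The seven comparisons can also be generated mechanically from the eight treatment totals, which is convenient for checking the arithmetic. The Python sketch below is our own illustration (not part of the original text); it builds each set of ±1 coefficients as products of the main-effect coefficients, following the rule stated above, and reproduces the factorial effect totals and sums of squares of table 12.9.2.

```python
# Factorial effect totals and sums of squares for the 2^3 pig experiment
# of table 12.9.2 (illustrative sketch; plain Python).
from itertools import product

# Treatment totals, each a sum over n = 8 replications, ordered with
# L slowest, then P, then S (M before F), as in table 12.9.2.
totals = [8.47, 8.61, 11.03, 10.71, 10.20, 9.14, 10.30, 9.63]

# (L, P, S) sign pattern for each treatment, S varying fastest.
levels = list(product([-1, +1], repeat=3))

def coeffs(effect):
    """Coefficients of an effect: products of the main-effect coefficients."""
    pos = {'L': 0, 'P': 1, 'S': 2}
    out = []
    for lev in levels:
        c = 1
        for letter in effect:
            c *= lev[pos[letter]]
        out.append(c)
    return out

n, sum_lam_sq = 8, 8                       # divisor n * sum(lambda^2) = 64
for effect in ('S', 'P', 'SP', 'L', 'SL', 'PL', 'SPL'):
    lam = coeffs(effect)
    total = sum(l * t for l, t in zip(lam, totals))
    print(f"{effect:>3}: total = {total:+6.2f},  S.S. = {total**2/(n*sum_lam_sq):.4f}")
```

The printed totals (-1.91, 5.25, -0.07, 0.45, -1.55, -4.07, 0.85) agree with the table, and their sums of squares add to the Treatments S.S., 0.7986.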

12.10-Three-factor experiments; a 2 x 3 x 4. This section illustrates
the general method of analysis for a three-factor experiment. The data
come from the experiment drawn on in the previous section. The factors
were Lysine (4 levels), Methionine (3 levels), and Soybean Meal (2 levels
of protein), as food supplements to corn in pig feeding. Only the males in
two replications are used. This makes a 2 x 3 x 4 factorial arrangement
of treatments in a randomized blocks design. Table 12.10.1 contains the
data, with the computations for the analysis of variance given in detail.
1. First form the sums for each treatment and replication, and com-
pute the total S.S. and the S.S. for treatments, replications, and error (by
subtraction).
2. For each pair of factors, form a two-way table of sums. From the
L x M table (table A), obtain the total S.S. (11 df.) and the S.S. for L and
M. The S.S. for the LM interactions is found by subtraction. The M x P
table supplies the S.S. for M (already obtained), for P, and for the MP
interactions (by subtraction). The L x P table provides the S.S. for the
LP interactions.
3. From the S.S. for treatments subtract the S.S. for L, M, P, LM,
MP, and LP to obtain that for the LMP three-factor interactions.
The analysis of variance appears in table 12.10.2, and a further
examination of the results in examples 12.10.1 to 12.10.3.
EXAMPLE 12.10.1-In table 12.10.2, for L, M, MP, and LMP the sums of squares
are all so small that no single degree of freedom isolated from them could reach significance.
But LM and LP deserve further study.
In the LM summary table A, in table 12.10.1, there is some evidence of interaction,
though the overall test on 6 degrees of freedom does not detect it. Let us look at the linear
effects. First, calculate M_L (-1, 0, +1) for each level of lysine:

-0.08, -0.27, 0.57, 1.07

Next, take the linear effect of lysine (-3, -1, +1, +3) in these M_L; the result, 4.29. Finally,
application of Rule 12.8.2 yields the sum of squares

L_L M_L: (4.29)²/{(4)(2)(20)} = 0.1150,

which is just short of significance at the 5% level. None of the other 5 comparisons is sig-
nificant. In the larger experiment of which this is a part, L_L M_L was significant. What in-
terpretation do you suggest?

EXAMPLE 12.10.2-In the LP summary table C, the differences between 14% and 12%,

2.15, 2.07, 0.29, 0.56,

suggest an interaction: the beneficial effect of the higher level of protein decreases as more
lysine is added. By applying the multipliers -3, -1, +1, +3 to the above figures, we ob-
tain the L_L P_L effect total = -6.55. By Rule 12.8.2,
TABLE 12.10.1
THREE-FACTOR EXPERIMENT (2 x 3 x 4) IN RANDOMIZED BLOCKS. AVERAGE DAILY
GAINS OF PIGS FED VARIOUS PERCENTAGES OF SUPPLEMENTARY LYSINE,
METHIONINE, AND PROTEIN

                                        Replications (Blocks)
Lysine, L   Methionine, M   Protein, P       1       2       Treatment Total

0           0                  12           1.11    0.97          2.08
                               14           1.52    1.45          2.97
            0.025              12           1.09    0.99          2.08
                               14           1.27    1.22          2.49
            0.050              12           0.85    1.21          2.06
                               14           1.67    1.24          2.91

0.05        0                  12           1.30    1.00          2.30
                               14           1.55    1.53          3.08
            0.025              12           1.03    1.21          2.24
                               14           1.24    1.34          2.58
            0.050              12           1.12    0.96          2.08
                               14           1.76    1.27          3.03

0.10        0                  12           1.22    1.13          2.35
                               14           1.38    1.08          2.46
            0.025              12           1.34    1.41          2.75
                               14           1.40    1.21          2.61
            0.050              12           1.34    1.19          2.53
                               14           1.46    1.39          2.85

0.15        0                  12           1.19    1.03          2.22
                               14           0.80    1.29          2.09
            0.025              12           1.36    1.16          2.52
                               14           1.42    1.39          2.81
            0.050              12           1.46    1.03          2.49
                               14           1.62    1.27          2.89

Total                                      31.50   28.97         60.47

Computations:
1. C = (60.47)²/48 = 76.1796
2. Total: 1.11² + 0.97² + ... + 1.62² + 1.27² - C = 2.0409
3. Treatments: (2.08² + 2.97² + ... + 2.89²)/2 - C = 1.2756
4. Replications: (31.50² + 28.97²)/24 - C = 0.1334
5. Error: 2.0409 - (1.2756 + 0.1334) = 0.6319

Summary Table A

                              Lysine
Methionine        0      0.05     0.10     0.15     Total

0                5.05     5.38     4.81     4.31    19.55
0.025            4.57     4.82     5.36     5.33    20.08
0.050            4.97     5.11     5.38     5.38    20.84

Total           14.59    15.31    15.55    15.02    60.47

Computations (continued):
6. Entries are sums of 2 levels of protein; 5.05 = 2.08 + 2.97, etc.
7. Total in A: (5.05² + ... + 5.38²)/4 - C = 0.3496
8. Lysine, L: (14.59² + ... + 15.02²)/12 - C = 0.0427
9. Methionine, M: (19.55² + 20.08² + 20.84²)/16 - C = 0.0526
10. LM: 0.3496 - (0.0427 + 0.0526) = 0.2543

Summary Table B

                      Protein
Methionine        12       14      Total

0                8.95    10.60     19.55
0.025            9.59    10.49     20.08
0.050            9.16    11.68     20.84

Total           27.70    32.77     60.47

Computations (continued):
11. Entries are sums of 4 levels of lysine; 8.95 = 2.08 + 2.30 + 2.35 + 2.22, etc.
12. Total in B: (8.95² + ... + 11.68²)/8 - C = 0.6702
13. Protein, P: (27.70² + 32.77²)/24 - C = 0.5355
14. MP: 0.6702 - (0.5355 + 0.0526) = 0.0821

Summary Table C

                           Lysine
Protein           0      0.05     0.10     0.15     Total

12               6.22     6.62     7.63     7.23    27.70
14               8.37     8.69     7.92     7.79    32.77

Total           14.59    15.31    15.55    15.02    60.47

Computations (continued):
15. Entries are sums of 3 levels of methionine; 6.22 = 2.08 + 2.08 + 2.06, etc.
16. Total in C: (6.22² + ... + 7.79²)/6 - C = 0.8181
17. LP: 0.8181 - (0.5355 + 0.0427) = 0.2399
18. LMP: 1.2756 - (0.0427 + 0.0526 + 0.5355 + 0.2543 + 0.0821 + 0.2399) = 0.0685

L_L P_L: (6.55)²/{(6)(2)(20)} = 0.1788.

F = 0.1788/0.0275 = 6.50, P = 0.025. This corresponds to the highly significant effect ob-
served in table 12.9.2, where an interpretation was given.
Deducting L_L P_L from the LP sum of squares in table 12.10.2, 0.2399 - 0.1788 = 0.0611,
shows that neither of the other two comparisons can be significant.
EXAMPLE 12.10.3-The investigator is often interested in estimates of differences
rather than in tests of significance. Because of the LP interaction he might wish to estimate
the effect of protein with no lysine. Summary table C shows this mean difference:
TABLE 12.10.2
ANALYSIS OF VARIANCE OF 3-FACTOR PIG EXPERIMENT,
RANDOMIZED BLOCKS DESIGN

Source of Variation        Degrees of Freedom   Sum of Squares   Mean Square

Replications                       1                0.1334
Lysine, L (l = 4)                  3                0.0427         0.0142
Methionine, M (m = 3)              2                0.0526         0.0263
Protein, P (p = 2)                 1                0.5355         0.5355**
LM                                 6                0.2543         0.0424
LP                                 3                0.2399         0.0800
MP                                 2                0.0821         0.0410
LMP                                6                0.0685         0.0114
Error (r = 2)                     23                0.6319         0.0275

(8.37 - 6.22)/6 = 0.36 lb./day. (The justification for using all levels of methionine is that
there is little evidence of either a main effect or an interaction with protein.) The standard error
of the mean difference is ±√{(2)(0.0275)/6} = 0.096 lb./day. Verify that the 95% interval is
from 0.16 to 0.56 lb./day.
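The chain of computations in table 12.10.1 lends itself to mechanical checking. The Python sketch below is our own (treatment totals keyed in from the table; names such as `ss` are ours, not the book's); it reproduces the sums of squares for the main effects and interactions that appear in table 12.10.2.

```python
# Marginal-table analysis of the 2 x 3 x 4 pig experiment (illustrative sketch).
from itertools import product

lysine, methionine, protein = [0, .05, .10, .15], [0, .025, .050], [12, 14]
r = 2                                     # replications per treatment
vals = [2.08, 2.97, 2.08, 2.49, 2.06, 2.91,     # treatment totals, lysine 0
        2.30, 3.08, 2.24, 2.58, 2.08, 3.03,     # lysine 0.05
        2.35, 2.46, 2.75, 2.61, 2.53, 2.85,     # lysine 0.10
        2.22, 2.09, 2.52, 2.81, 2.49, 2.89]     # lysine 0.15
totals = dict(zip([(l, m, p) for l in lysine for m in methionine
                   for p in protein], vals))
C = sum(vals)**2 / (r * len(vals))              # correction term, 76.1796

def ss(margins, per_total):
    """S.S. of a set of totals, each total summing `per_total` observations."""
    return sum(t * t for t in margins) / per_total - C

L = ss([sum(v for k, v in totals.items() if k[0] == x) for x in lysine], 12)
M = ss([sum(v for k, v in totals.items() if k[1] == x) for x in methionine], 16)
P = ss([sum(v for k, v in totals.items() if k[2] == x) for x in protein], 24)
LM = ss([sum(v for k, v in totals.items() if (k[0], k[1]) == c)
         for c in product(lysine, methionine)], 4) - L - M
MP = ss([sum(v for k, v in totals.items() if (k[1], k[2]) == c)
         for c in product(methionine, protein)], 8) - M - P
LP = ss([sum(v for k, v in totals.items() if (k[0], k[2]) == c)
         for c in product(lysine, protein)], 6) - L - P
LMP = ss(vals, r) - (L + M + P + LM + MP + LP)
print({name: round(v, 4) for name, v in
       [('L', L), ('M', M), ('P', P), ('LM', LM),
        ('MP', MP), ('LP', LP), ('LMP', LMP)]})
```

The printed values (0.0427, 0.0526, 0.5355, 0.2543, 0.0821, 0.2399, 0.0685) match steps 8-18 of the table.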

12.11-Expected values of mean squares. In the analysis of variance
of a factorial experiment, the expected values of the mean squares for
main effects and interactions can be expressed in terms of components of
variance that are part of the mathematical model underlying the analysis.
These formulas have two principal uses. They show how to obtain un-
biased estimates of error for the comparisons that are of interest. In
studies of variability they provide estimates of the contributions made by
different sources to the variance of a measurement.
Consider a two-factor A x B experiment in a completely randomized
design, with a levels of A, b levels of B, and n replications. The observed
value for the kth replication of the ith level of A and the jth level of B is

X_ijk = μ + α_i + β_j + (αβ)_ij + ε_ijk,    (12.11.1)

where i = 1 ... a, j = 1 ... b, k = 1 ... n. (If the plan is in randomized
blocks or a Latin square, further parameters are needed to specify block,
row, or column effects.)
The parameters α_i and β_j, representing main effects, may be fixed or
random. If either A or B is random, the corresponding α_i or β_j are as-
sumed drawn from an infinite population with mean zero, variance σ_A²
or σ_B². The (αβ)_ij are the two-factor interaction effects. They are random
if either A or B is random, with mean 0, variance σ_AB². As usual, the ε_ijk
have mean 0, variance σ².
Before working out the expected value of the mean square for A,
we must be clear about the meaning of main effects. The relevant and
useful way of defining the main effect of A, and consequently the expected
value of its mean square, depends on whether the other factor B is fixed or
random.
To illustrate the distinction, let A represent 2 fertilizers and B 2
fields. Experimental errors ε are assumed negligible, and results are as
follows:

                    Fertilizer
                 a1      a2     a2 - a1

Field   1        10      17       +7
        2        18      13       -5

Mean             14      15       +1

When B is fixed, our question is: What is the average difference between
a2 and a1 over these two fields? The answer is that a2 is superior by 1 unit
(15 - 14). The answer is exact, since experimental errors are negligible
in this example. But if B is random, the question becomes: What can
be inferred about the average difference between a2 and a1 over a popula-
tion of fields of which these two fields are a random sample? The differ-
ence (a2 - a1) is +7 in field 1 and -5 in field 2, with mean +1 as before.
The estimate is no longer exact, but has a standard error (with 1 df),
which may be computed as √[{7 - (-5)}²/4] = ±6. Note that this stan-
dard error is derived from the AB interaction, this interaction being, in
fact, {7 - (-5)}/2 = 6.
To sum up, the numerical estimates of the main effects of A are the
same whether B is fixed or random, but the population parameters being
estimated are not the same, and hence different standard errors are re-
quired in the two cases.
From equation 12.11.1 the sample mean for the ith level of A is

X̄_i.. = μ + α_i + β̄ + (αβ)‾_i. + ε̄_i..,    (12.11.2)

where β̄ = (β_1 + ... + β_b)/b, (αβ)‾_i. = {(αβ)_i1 + ... + (αβ)_ib}/b, and ε̄_i.. is
the average of nb independent values of ε.
When B is fixed, the true main effects of A are the differences of the
quantities (α_i + (αβ)‾_i.) from level to level of A. In this case it is cus-
tomary, for simplicity of notation, to redefine the parameter α_i as α_i′ = α_i
+ (αβ)‾_i.. Thus with B fixed, it follows from equation 12.11.2 that

X̄_i.. - X̄_... = α_i′ - ᾱ′ + ε̄_i.. - ε̄_...    (12.11.3)

From this relation the expected value of the mean square for A is
easily shown to be

E(A) = E[nb Σ(X̄_i.. - X̄_...)²/(a - 1)] = nb Σ(α_i′ - ᾱ′)²/(a - 1) + σ²    (12.11.4)
The quantity Σ(α_i′ - ᾱ′)²/(a - 1) is the quantity previously denoted
by κ_A².
If A is random and B is fixed, repeated sampling involves drawing a
fresh set of a levels of the factor A in each experiment, retaining the same
set of b levels of B. In finding E(A) we average first over samples that
happen to give the same set of levels of A, this being a common device in
statistical theory. Formula 12.11.4 holds at this stage. When we average
further over all sets of a levels of A that can be drawn from the population,
κ_A² is an unbiased estimate of σ_A², the population variance of the α_i. Hence,
with A random and B fixed,

E(A) = nbσ_A² + σ²
Now consider B random and revert to equation 12.11.2:

X̄_i.. = μ + α_i + β̄ + (αβ)‾_i. + ε̄_i..    (12.11.2)

In each new sample we draw fresh values of β_j and of (αβ)_ij, so that β̄ and
(αβ)‾_i. change from sample to sample. Since, however, the population
means of β̄, (αβ)‾_i., and ε̄_i.. are all zero, the population mean of X̄_i.. is
μ + α_i. Consequently, the population variance of the main effects of A
is defined as κ_A² = Σ(α_i - ᾱ)²/(a - 1) if A is fixed, or as the variance
σ_A² of the α's if A is random. But since

X̄_i.. - X̄_... = α_i - ᾱ + (αβ)‾_i. - (αβ)‾_.. + ε̄_i.. - ε̄_...,

the expected value of the mean square of A now involves σ_AB² as well as σ².
It follows that when B is random,

E(A) = nbκ_A² + nσ_AB² + σ²    (A fixed)
E(A) = nbσ_A² + nσ_AB² + σ²    (A random)
The preceding results are particular cases of a more general formula.
If the population of levels of B is finite, containing B′ levels of which b
are chosen at random for the experiment,

E(A) = nbσ_A² + n{(B′ - b)/B′}σ_AB² + σ²

This case occurs, for instance, if a combine of B′ factories or cotton
growers carries out experiments in a random sample of b factories or
fields. If b = B′ the term in σ_AB² vanishes and we regard factor B as fixed.
As B′ tends to infinity, the coefficient of σ_AB² tends to n, factor B being
random. If A is fixed, σ_A² becomes κ_A².
The AB mean square is derived from the sum of squares of the terms
(X̄_ij. - X̄_i.. - X̄_.j. + X̄_...). From the model, this term is

(αβ)_ij - (αβ)‾_i. - (αβ)‾_.j + (αβ)‾_.. + ε̄_ij. - ε̄_i.. - ε̄_.j. + ε̄_...
Unless both A and B are fixed, the interaction term in the above is a
random variable from sample to sample, giving

E(AB) = nσ_AB² + σ²

With both factors fixed, σ_AB² is replaced by κ_AB². Table 12.11.1 sum-
marizes this series of results.
TABLE 12.11.1
EXPECTED VALUES OF MEAN SQUARES IN A TWO-FACTOR EXPERIMENT
EXPECTED VALUE = PARAMETERS ESTIMATED

Mean                                                           Mixed Model
Square   Fixed Effects    Random Effects                  A Fixed, B Random

A        σ² + nbκ_A²      σ² + nσ_AB² + nbσ_A²            σ² + nσ_AB² + nbκ_A²
B        σ² + naκ_B²      σ² + nσ_AB² + naσ_B²            σ² + naσ_B²
AB       σ² + nκ_AB²      σ² + nσ_AB²                     σ² + nσ_AB²
Error    σ²               σ²                              σ²

Note that when B is random and the main effects of A are 0 (κ_A² or
σ_A² = 0), the mean square for A is an unbiased estimate of σ² + nσ_AB².
It follows that the appropriate denominator or "error" for an F-test of
the main effects of A is the AB Interactions mean square, as illustrated
from our sample of two fields. When B is fixed, the appropriate de-
nominator is the Error mean square in table 12.11.1.
General rules are available for factors A, B, C, D, ... at levels
a, b, c, d, ... with n replications of each treatment combination. Any
factors may be fixed or random. In presenting these rules, the symbol U
denotes the factorial effect in whose mean square we are interested (for
instance, the main effect of A, or the BC interaction, or the ACD inter-
action).
Rule 12.11.1. The expected value of the mean square for U contains
a term in σ² and a term in σ_U². It also contains a variance term for any
interaction in which (i) all the letters in U appear, and (ii) all the other letters
in the interaction represent random effects.
Rule 12.11.2. The coefficient of the term in σ² is 1. The coefficient
of any other variance is n times the product of all letters a, b, c, ... that
do not appear in the set of capital letters A, B, C, ... specifying the
variance.
For example, consider the mean square for C in a three-way factorial.
If A and B are both random,

E(C) = σ² + nσ_ABC² + nbσ_AC² + naσ_BC² + nabσ_C²

If A is fixed but B is random, the terms in σ_ABC² and σ_AC² drop out by
Rule 12.11.1, and we have

E(C) = σ² + naσ_BC² + nabσ_C²
If A and B are both fixed, the expected value is

E(C) = σ² + nabκ_C²

For main effects and interactions in which all factors are fixed, we have
followed the practice of replacing σ² by κ². Most writers use the symbol
σ² in either case. Table 12.11.2 illustrates the rules for three factors.

TABLE 12.11.2
EXPECTED VALUES OF MEAN SQUARES IN A THREE-WAY FACTORIAL

                                  Expected Values
Mean Squares   All Effects Fixed    All Effects Random

A              σ² + nbcκ_A²         σ² + nσ_ABC² + ncσ_AB² + nbσ_AC² + nbcσ_A²
B              σ² + nacκ_B²         σ² + nσ_ABC² + ncσ_AB² + naσ_BC² + nacσ_B²
C              σ² + nabκ_C²         σ² + nσ_ABC² + nbσ_AC² + naσ_BC² + nabσ_C²
AB             σ² + ncκ_AB²         σ² + nσ_ABC² + ncσ_AB²
AC             σ² + nbκ_AC²         σ² + nσ_ABC² + nbσ_AC²
BC             σ² + naκ_BC²         σ² + nσ_ABC² + naσ_BC²
ABC            σ² + nκ_ABC²         σ² + nσ_ABC²
Error          σ²                   σ²

Mean Squares   A Fixed, B and C Random

A              σ² + nσ_ABC² + ncσ_AB² + nbσ_AC² + nbcκ_A²
B              σ² + naσ_BC² + nacσ_B²
C              σ² + naσ_BC² + nabσ_C²
AB             σ² + nσ_ABC² + ncσ_AB²
AC             σ² + nσ_ABC² + nbσ_AC²
BC             σ² + naσ_BC²
ABC            σ² + nσ_ABC²
Error          σ²
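Rules 12.11.1 and 12.11.2 are mechanical enough to be put into a few lines of code. The sketch below is our own illustration (the output notation is also ours: s for σ, k for κ, with coefficients written symbolically); it regenerates any line of tables 12.11.1 and 12.11.2.

```python
# Expected mean squares by Rules 12.11.1 and 12.11.2 (illustrative sketch).
from itertools import combinations

def expected_ms(U, fixed, random):
    """E(MS) for effect U, e.g. 'C' or 'AC'; factors are single letters."""
    factors = sorted(fixed | random)
    others = [f for f in factors if f not in set(U)]
    # Candidate components: U itself and every interaction containing U.
    cands = [''.join(sorted(set(U)))]
    for k in range(1, len(others) + 1):
        for extra in combinations(others, k):
            cands.append(''.join(sorted(set(U) | set(extra))))
    terms = ['s^2']                               # the term in sigma^2
    for V in cands:
        if not (set(V) - set(U)) <= random:       # Rule 12.11.1(ii)
            continue
        # Rule 12.11.2: n times the lower-case letters absent from V.
        coef = 'n' + ''.join(f.lower() for f in factors if f not in set(V))
        sym = 'k' if set(V) <= fixed else 's'     # kappa^2 when all fixed
        terms.append(f"{coef}*{sym}_{V}^2")
    return ' + '.join(terms)

print(expected_ms('C', fixed=set(),  random={'A', 'B', 'C'}))
print(expected_ms('C', fixed={'A'},  random={'B', 'C'}))
print(expected_ms('A', fixed={'A'},  random={'B', 'C'}))
```

The three printed lines agree, apart from the order of terms, with the E(C) expressions worked out above and with the A line of the mixed model in table 12.11.2.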

From these formulas, unbiased estimates of all the components of
variance can be obtained as linear combinations of the mean squares in
the analysis of variance. The null hypothesis that any component is 0
can be tested, though complications may arise. Consider the null hy-
pothesis σ_C² = 0. Table 12.11.2 shows that if all effects are fixed, the
appropriate denominator for testing the mean square for C is the ordi-
nary Error mean square of the experiment. If A is fixed and B is random,
the appropriate denominator is the BC mean square.
If all effects are random, no single mean square in the analysis of
variance is an appropriate denominator for testing σ_C² (check with table
12.11.2). An approximate F-test is obtained as follows (11, 12). If
σ_C² = 0, you may verify from table 12.11.2 that

E(C) = E(AC) + E(BC) - E(ABC)
while if σ_C² is large, E(C) will exceed the right-hand side. A test criterion is

F′ = {(C) + (ABC)}/{(AC) + (BC)},

where (C) denotes the mean square for C, and so on. The approximate
degrees of freedom are

n₁ = {(C) + (ABC)}²/{(C)²/f_C + (ABC)²/f_ABC}

n₂ = {(AC) + (BC)}²/{(AC)²/f_AC + (BC)²/f_BC},

where f_C, f_ABC, f_AC, and f_BC denote the degrees of freedom of the
corresponding mean squares.
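A minimal sketch of this computation follows (the mean squares and degrees of freedom below are made-up placeholders, inserted only to show the mechanics of the formulas):

```python
# Approximate F-test with Satterthwaite-type degrees of freedom
# (illustrative sketch; the numbers are placeholders, not book data).
def approx_F(ms_num, df_num, ms_den, df_den):
    """Numerator and denominator are each sums of mean squares."""
    num, den = sum(ms_num), sum(ms_den)
    n1 = num**2 / sum(m*m/f for m, f in zip(ms_num, df_num))
    n2 = den**2 / sum(m*m/f for m, f in zip(ms_den, df_den))
    return num/den, n1, n2

# (C) and (ABC) in the numerator, (AC) and (BC) in the denominator:
F, n1, n2 = approx_F([12.0, 2.5], [3, 12], [4.0, 5.5], [6, 6])
print(f"F' = {F:.2f} with about {n1:.1f} and {n2:.1f} df")
```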
12.12-The split-plot or nested design. It is often desirable to get pre-
cise information on one factor and on the interaction of this factor with a
second, but to forego such precision on the second factor. For example,
three sources of vitamin might be compared by trying them on three males
of the same litter, replicating the experiment on 20 litters. This would be a
randomized blocks design with high precision, providing 38 degrees of
freedom for error. Superimposed on this could be some experiment with
the litters as units. Four types of housing could be tried, one litter to each
type, thus allowing 5 replications with 12 degrees of freedom for error.
The main treatments (housings) would not be compared as accurately as
the sub-treatments (sources of vitamin) for two reasons: less replication
is provided, and litter differences are included in the error for evaluating
the housing effects. Nevertheless, some information about housing may
be got at little extra expense, and any interaction between housing and
vitamin will be accurately evaluated.
In experiments on varieties or fertilizers on small plots, cultural prac-
tices with large machines may be tried on whole groups of the smaller
plots, each group containing all the varieties. (Irrigation is one practice
that demands large areas per treatment.) The series of cultural practices
is usually replicated only a small number of times but the varieties are
repeated on every cultural plot. Experiments of this type are called
split-plot, the cultural main plot being split into smaller varietal sub-plots.
This design is also common in industrial research. Comparisons
among relatively large machines, or comparisons of different conditions
of temperature and humidity under which machines work, are main plot
treatments, while adjustments internal to the machines are sub-plot treat-
ments. Since the word plot is inappropriate in such applications, the
designs are often called nested, in the sense of section 10.16.
The essential feature of the split-plot experiment is that the sub-plot
treatments are not randomized over the whole large block but only over
the main plots. Randomization of the sub-treatments is newly done in
each main plot and the main treatments are randomized in the large blocks.

FIG. 12.12.1-First 2 blocks of split-plot experiment on alfalfa, illustrating random
arrangement of main and sub-plots. [Layout diagram not reproduced.]

A consequence is that the experimental error for sub-treatments is dif-
ferent from (characteristically smaller than) that for main treatments.
Figure 12.12.1 shows the field layout of a split-plot design with three
varieties of alfalfa, the sub-treatments being four dates of final cutting (13).
The first two harvests were common to all plots, the second on July 27,
1943. The third harvests were: A, none; B, September 1; C, September 20;
D, October 7. Yields in 1944 are recorded in table 12.12.1. Such an ex-
periment is, of course, not evaluated by a single season's yields; statistical
methods for perennial crops are discussed in section 12.14.
In the analysis of variance the main plot analysis is that of random-
ized blocks with three varieties replicated in six blocks. The sub-plot
analysis contains the sums of squares for dates of cutting, for the date x va-
riety interactions, and for the sub-plot error, found by subtraction as
shown at the foot of table 12.12.2.
The significant differences among dates of cutting were not unex-
pected, nor were the smaller yields following B and C. The last harvest
should be either early enough to allow renewed growth and restoration of
the consequent depletion of root reserves, or so late that no growth and
depletion will ensue. The surprising features of the experiment were two:
the yield following C being greater than that following B, since late September
is usually considered a poor time to cut alfalfa in Iowa; and the absence of inter-
action between date and variety (Ladak is slow to renew growth after
cutting and might have reacted differently from the other varieties).
In order to justify this analysis we need to study the model. In
randomized blocks, the model for the split-plot or nested experiment is

X_ijk = μ + M_i + B_j + ε_ij + T_k + (MT)_ik + δ_ijk,

i = 1 ... m, j = 1 ... b, k = 1 ... t, ε_ij = N(0, σ_ε), δ_ijk = N(0, σ_δ)

Here, M stands for main plot treatments, B for blocks, and T for sub-plot
treatments.
TABLE 12.12.1
YIELDS OF THREE VARIETIES OF ALFALFA (TONS PER ACRE) IN 1944 FOLLOWING
FOUR DATES OF FINAL CUTTING IN 1943

                                Blocks
Variety   Date      1      2      3      4      5      6

Ladak      A      2.17   1.88   1.62   2.34   1.58   1.66
           B      1.58   1.26   1.22   1.59   1.25   0.94
           C      2.29   1.60   1.67   1.91   1.39   1.12
           D      2.23   2.01   1.82   2.10   1.66   1.10
                  8.27   6.75   6.33   7.94   5.88   4.82

Cossack    A      2.33   2.01   1.70   1.78   1.42   1.35
           B      1.38   1.30   1.85   1.09   1.13   1.06
           C      1.86   1.70   1.81   1.54   1.67   0.88
           D      2.27   1.81   2.01   1.40   1.31   1.06
                  7.84   6.82   7.37   5.81   5.53   4.35

Ranger     A      1.75   1.95   2.13   1.78   1.31   1.30
           B      1.52   1.47   1.80   1.37   1.01   1.31
           C      1.55   1.61   1.82   1.56   1.23   1.13
           D      1.56   1.72   1.99   1.55   1.51   1.33
                  6.38   6.75   7.74   6.26   5.06   5.07

Total            22.49  20.32  21.44  20.01  16.47  14.24

                         Date of Cutting
Variety            A        B        C        D      Total

Ladak            11.25     7.84     9.98    10.92    39.99
Cossack          10.59     7.81     9.46     9.86    37.72
Ranger           10.22     8.48     8.90     9.66    37.26

Total            32.06    24.13    28.34    30.44   114.97

Mean (tons per acre)  1.78   1.34   1.57   1.69

The symbols i, j identify the main plot, while k identifies the sub-plot
within the main plot. The two components of error, ε_ij and δ_ijk, are needed
to make the model realistic: the sub-plots in one main plot often yield
consistently higher than those in another, and ε_ij represents this difference.
From the model, the error of the mean difference between two main
plot treatments, say M_1 and M_2, is

ε̄_1. - ε̄_2. + δ̄_1.. - δ̄_2..

The ε̄'s are averages over b values, the δ̄'s over bt values. Consequently,
the variance of the mean difference is
TABLE 12.12.2
ANALYSIS OF VARIANCE OF SPLIT-PLOT EXPERIMENT ON ALFALFA

Source of Variation      Degrees of Freedom   Sum of Squares   Mean Square

Main plots:
  Varieties                      2                0.1781          0.0890
  Blocks                         5                4.1499          0.8300
  Main plot error               10                1.3622          0.1362

Sub-plots:
  Dates of cutting               3                1.9625          0.6542**
  Date x variety                 6                0.2105          0.0351
  Sub-plot error                45                1.2586          0.0280

1. Correction: C = (114.97)²/72 = 183.5847
2. Total: (2.17)² + ... + (1.33)² - C = 9.1218
3. Main plots: {(8.27)² + ... + (5.07)²}/4 - C = 5.6902
4. Varieties: {(39.99)² + ... + (37.26)²}/24 - C = 0.1781
5. Blocks: {(22.49)² + ... + (14.24)²}/12 - C = 4.1499
6. Main plot error: 5.6902 - (0.1781 + 4.1499) = 1.3622
7. Sub-classes in variety-date table: {(11.25)² + ... + (9.66)²}/6 - C = 2.3511
8. Dates: {(32.06)² + ... + (30.44)²}/18 - C = 1.9625
9. Date x variety: 2.3511 - (0.1781 + 1.9625) = 0.2105
10. Sub-plot error: 9.1218 - (5.6902 + 1.9625 + 0.2105) = 1.2586

2(σ_ε²/b + σ_δ²/bt) = (2/bt)(σ_δ² + tσ_ε²)

In the analysis of variance, the main plot Error mean square estimates
(σ_δ² + tσ_ε²).
Consider now the difference X_ijk - X_ijk′ between two sub-plots that
are in the same main plot. According to the model,

X_ijk - X_ijk′ = T_k - T_k′ + (MT)_ik - (MT)_ik′ + δ_ijk - δ_ijk′

The error now involves only the δ's. Consequently, for any comparison
among treatments that is made entirely within main plots, the basic error
variance is σ_δ², estimated by the sub-plot Error mean square. Such com-
parisons include (i) the main effects of sub-plot treatments, (ii) interac-
tions between main-plot and sub-plot treatments, and (iii) comparisons
between sub-plot treatments for a single main-plot treatment (e.g., be-
tween dates for Ladak).
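As a check, the entire breakdown in table 12.12.2 can be reproduced from the yields of table 12.12.1. The Python sketch below is our own (the data layout and names are ours, not the book's):

```python
# Split-plot analysis of the alfalfa data of table 12.12.1 (illustrative sketch).
data = {  # data[variety][date] = yields in blocks 1..6
 'Ladak':   {'A': [2.17, 1.88, 1.62, 2.34, 1.58, 1.66],
             'B': [1.58, 1.26, 1.22, 1.59, 1.25, 0.94],
             'C': [2.29, 1.60, 1.67, 1.91, 1.39, 1.12],
             'D': [2.23, 2.01, 1.82, 2.10, 1.66, 1.10]},
 'Cossack': {'A': [2.33, 2.01, 1.70, 1.78, 1.42, 1.35],
             'B': [1.38, 1.30, 1.85, 1.09, 1.13, 1.06],
             'C': [1.86, 1.70, 1.81, 1.54, 1.67, 0.88],
             'D': [2.27, 1.81, 2.01, 1.40, 1.31, 1.06]},
 'Ranger':  {'A': [1.75, 1.95, 2.13, 1.78, 1.31, 1.30],
             'B': [1.52, 1.47, 1.80, 1.37, 1.01, 1.31],
             'C': [1.55, 1.61, 1.82, 1.56, 1.23, 1.13],
             'D': [1.56, 1.72, 1.99, 1.55, 1.51, 1.33]}}
m, t, b = 3, 4, 6
obs = [x for v in data.values() for row in v.values() for x in row]
C = sum(obs)**2 / (m*t*b)                            # 183.5847

total  = sum(x*x for x in obs) - C
main   = sum(sum(data[v][d][j] for d in 'ABCD')**2
             for v in data for j in range(b)) / t - C
vari   = sum(sum(x for row in data[v].values() for x in row)**2
             for v in data) / (t*b) - C
blocks = sum(sum(data[v][d][j] for v in data for d in 'ABCD')**2
             for j in range(b)) / (m*t) - C
cells  = sum(sum(data[v][d])**2 for v in data for d in 'ABCD') / b - C
dates  = sum(sum(data[v][d][j] for v in data for j in range(b))**2
             for d in 'ABCD') / (m*b) - C
print('Varieties      ', round(vari, 4))                       # 0.1781
print('Blocks         ', round(blocks, 4))                     # 4.1499
print('Main plot error', round(main - vari - blocks, 4))       # 1.3622
print('Dates          ', round(dates, 4))                      # 1.9625
print('Date x variety ', round(cells - vari - dates, 4))       # 0.2105
print('Sub-plot error ', round(total - main - dates
                               - (cells - vari - dates), 4))   # 1.2586
```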
In some experiments it is feasible to use either the split-plot design
or ordinary randomized blocks in which the mt treatment combinations
are randomized within each block. On the average, the two arrangements
have the same overall accuracy. Relative to randomized blocks, the split-
plot design gives reduced accuracy on the main-plot treatments and in-
creased accuracy on sub-plot treatments and interactions. In some in-
dustrial experiments conducted as split-plots, the investigator apparently
did not realize the implications of the split-plot arrangement and analyzed
the design as if it were in randomized blocks. The consequences were to
assign too low errors to main-plot treatments and too high errors to sub-
plot treatments.

TABLE 12.12.3
PRESENTATION OF TREATMENT MEANS (TONS PER ACRE) AND STANDARD ERRORS

                  Date of Cutting (±√(E_b/b) = ±0.0683)
Variety          A        B        C        D       Means

Ladak          1.875    1.307    1.664    1.820     1.667   (±√(E_a/tb)
Cossack        1.765    1.302    1.577    1.644     1.572     = ±0.0753)
Ranger         1.704    1.414    1.484    1.610     1.553

Means          1.781    1.341    1.575    1.691
                  (±√(E_b/mb) = ±0.0394)

Care is required in the use of the correct standard errors for com-
parisons among treatment means. Table 12.12.3 shows the treatment
means and s.e.'s for the alfalfa experiment, where E_a = 0.1362 and
E_b = 0.0280 denote the main- and sub-plot Error mean squares.
The s.e. ±0.0683, which is derived from E_b, is the basis for computing the
s.e. for comparisons that are part of the Variety-Date interactions and for
comparisons among dates for a single variety or a group of the varieties.
The s.e. ±0.0753 for varietal means is derived from E_a. Some compari-
sons, for example those among varieties for Date A, require a standard
error that involves both E_a and E_b, as described in (8).
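The three standard errors quoted in the table follow directly from E_a, E_b, and the numbers of observations per mean; a quick check (values taken from the text above):

```python
# Standard errors of table 12.12.3 (illustrative check).
from math import sqrt
Ea, Eb = 0.1362, 0.0280          # main- and sub-plot Error mean squares
m, t, b = 3, 4, 6                # varieties, dates, blocks
print(round(sqrt(Eb / b), 4))        # 0.0683: variety-date cell means
print(round(sqrt(Ea / (t*b)), 4))    # 0.0753: variety means
print(round(sqrt(Eb / (m*b)), 4))    # 0.0394: date means
```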
Formally, the sub-plot error S.S. (45 df.) is the combined S.S. for the
BT interactions (15 df.) and the BMT interactions (30 df.). Often, it is
more realistic to regard Blocks as a random component rather than as a
fixed component. In this case, the error for testing T is the BT mean
square, while that for testing MT is the BMT mean square, if the two mean
squares appear to differ.
Experimenters sometimes split the sub-plots and even the sub-sub-
plots. The statistical methods are a natural extension of those given here.
If T_1, T_2, T_3 denote the sets of treatments at three levels, the set T_1 are
tested against the main-plot Error mean square, T_2 and the T_1T_2 interac-
tions against the sub-plot error, and T_3, T_1T_3, T_2T_3, and T_1T_2T_3 against
the sub-sub-plot error. For missing data see (8, 14).
EXAMPLE 12.12.1-A split-split-plot experiment on corn was conducted to try 3 rates
of planting (stands) with 3 levels of fertilizer in irrigated and non-irrigated plots (21). The
design was randomized blocks with 4 replications. The main plots carried the irrigation
treatments. On each there were sub-plots with 3 stands, 10,000, 13,000, and 16,000 plants
per acre. Finally, each sub-plot was divided into 3 parts respectively fertilized with 60, 120,
and 180 pounds of nitrogen. The yields are in bushels per acre. Calculate the analysis of
variance.

                                              Blocks
                                        1     2     3     4

Not Irrigated   Stand 1   Fertilizer 1   90    83    85    86
                                     2   95    80    88    78
                                     3  107    95    88    89
                      2              1   92    98   112    79
                                     2   89    98   104    86
                                     3   92   106    91    87
                      3              1   81    74    82    85
                                     2   92    81    78    89
                                     3   93    74    94    83
Irrigated             1              1   80   102    60    73
                                     2   87   109   104   114
                                     3  100   105   114   114
                      2              1  121    99    90   109
                                     2  110    94   118   131
                                     3  119   123   113   126
                      3              1   78   136   119   116
                                     2   98   133   122   136
                                     3  122   132   136   133
Source of Variation      Degrees of Freedom    Mean Square

Main plots:
  Blocks                          3
  Irrigation, I                   1              8,277.56
  Error (a)                       3                470.59
Sub-plots:
  Stand, S                        2                879.18
  IS                              2              1,373.51*
  Error (b)                      12                232.33
Sub-sub-plots:
  Fertilizer, F                   2                988.72
  IF                              2                476.72**
  SF                              4                 76.22
  ISF                             4                 58.68
  Error (c)                      36                 86.36
EXAMPLE 12.12.2-Attention is attracted to the two significant interactions, IS and
IF. Now, ISF is less than error. This means that the IS interaction is much the same at all
levels of F; or, alternatively, that the IF interaction is similar at all levels of S. Hence, each
2-way table gives information.

                  F₁      F₂      F₃      S₁      S₂      S₃
Not Irrigated   1,041   1,058   1,099   1,064   1,134   1,006
Irrigated       1,183   1,356   1,437   1,162   1,353   1,461

Neither fertilizer nor stand affected yield materially on the non-irrigated plots. With
irrigation, the effect of each was pronounced. So it is necessary to examine separately the
split-plot experiment on the irrigated plots. Verify the following mean squares:

Stand:
  Linear            1     3,725**
  Deviations        1        96
  Error (a)         6       316

Fertilizer:
  Linear            1     2,688**
  Deviations        1       118
  SF                4        92
  Error (b)        18       137

EXAMPLE 12.12.3-Notice that the planting and fertilizer rates were well chosen for
the unirrigated plots, but on the irrigated plots they were too low to allow any evaluation
of the optima. This suggests that irrigation should not be a factor in such experiments.
But in order to compare costs and returns over a number of years, two experiments (one with
and one without irrigation) should be randomly interplanted to control fertility differences.

12.13-Series of experiments. A series of experiments may extend
over several places or over several years or both. In a number of coun-
tries in which the supply of food is deficient, such series have been under-
taken in recent years on farmers' fields in order to estimate the amount
by which the production of food grains can be increased by greater use of
fertilizers.
Every series of experiments presents a unique problem for the ex-
perimenter and the statistician, both in planning and analysis. Good
presentations of the difficulties involved are in (15, 16, 17, 18), with illus-
trations of the analysis. The methods given in this book should enable
the reader to follow the references cited. Only a brief introduction to the
analysis for experiments conducted at a number of places will be given
here.
We suppose that the experiments are all of the same size and structure,
and that the places can be regarded as a random sample of the region about
which inferences are to be made. For many reasons, a strictly random
sample of places is difficult to achieve in practice; insofar as the sample
is unrepresentative, inferences drawn from the analysis are vulnerable to
bias.
In the simplest case, the important terms in a combined analysis of
variance are:
Treatments
Treatments x Places
Pooled experimental errors
The Treatments x Places mean square is tested against the pooled error
(average of the Error mean squares in the individual experiments). If F
is materially greater than 1, indicating that treatment effects change from
place to place, the Treatments mean square is tested against the Treat-
ments x Places mean square, which becomes the basic error term for
drawing conclusions about the average effects of treatments over the
region.
Two complications occur. The experimental error variances often
differ from place to place. This can be checked by Bartlett's test for
homogeneity of variance. If variances are heterogeneous, the F-test of
the Treatments x Places interactions is not strictly valid, but an adjusted
form of the test serves as an adequate approximation (15, 17). If com-
parisons are being made over a subset of the places, as suggested later,
the pooled error for these places should be used instead of the overall
pooled error.
Secondly, the Treatments x Places interactions may not be homo-
geneous, especially in a factorial experiment. Some factors may give
stable responses from place to place, while others are more erratic in their
performance. If the Treatments mean square has been subdivided into
sets of comparisons, the Interactions mean square for each set should be
computed and tested separately.
The preceding approach is appropriate where the objective is to
reach a single set of conclusions that apply to the whole region. Some-
times there is reason to expect that the relative performances of the treat-
ments will vary with the soil type, with climatic conditions within the
region, or with other characteristics of the places. The series may have
been planned so as to examine such differences, leading perhaps to dif-
ferent recommendations for different parts of the region. In the analysis,
the places then subdivide into a number of sets. The Treatments x Places
interactions are separated into
Treatments x Sets
Treatments x Places within sets
If the Treatments x Sets mean square is substantially larger than Treat-
ments x Places within sets, it is usually advisable to examine the results
separately for each set.
The following examples illustrate the preliminary steps in the analy-
sis of one series of experiments.
EXAMPLE 12.13.1-The following data illustrate a series of experiments over five
places (21). Four treated lots of 100 Mukden soybean seeds, together with one lot untreated,
were planted in 5 randomized blocks at each participating station. The total numbers of
emerging plants (from 500 seeds) are shown for the 5 locations. Also shown are the analyses
of variance at the several stations.
NUMBER OF EMERGING PLANTS (500 SEEDS) IN FIVE PLOTS. COOPERATIVE SEED
TREATMENT TRIALS WITH MUKDEN SOYBEANS, 1943

Location       Untreated   Arasan   Spergon   Semesan, Jr.   Fermate    Total

Michigan           360        356      362         350          373     1,801
Minnesota          302        354      349         332          332     1,669
Wisconsin          408        407      391         391          409     2,006
Virginia           244        267      293         235          278     1,317
Rhode Island       373        387      406         394          375     1,935

Total            1,687      1,771    1,801       1,702        1,767     8,728

Mean Squares From Original Analyses of Variance

                                              Location
Source of    Degrees of
Variation    Freedom     Michigan   Minnesota   Wisconsin   Virginia   Rhode Island

Treatments       4         14.44      82.84*      17.44      114.26*      37.50
Blocks           4        185.14      54.64        5.64       70.76        4.80
Error           16         42.29      26.67       30.64       26.34       13.05
Test the hypothesis of homogeneity of error variance. Ans. Corrected χ² = 5.22, df. = 4.
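Bartlett's test (mentioned in section 12.13) is straightforward to compute for these five Error mean squares, each with 16 df. A sketch using the standard corrected form of the statistic (our own code; small rounding differences from the stated answer are to be expected):

```python
# Bartlett's test of homogeneity for the five error variances (sketch).
from math import log
s2 = [42.29, 26.67, 30.64, 26.34, 13.05]    # error mean squares
f = [16] * 5                                # their degrees of freedom
k, ftot = len(s2), sum(f)
pooled = sum(fi * si for fi, si in zip(f, s2)) / ftot
M = ftot * log(pooled) - sum(fi * log(si) for fi, si in zip(f, s2))
corr = 1 + (sum(1/fi for fi in f) - 1/ftot) / (3 * (k - 1))
print(f"corrected chi-square = {M/corr:.2f}, df = {k-1}")   # about 5.2
```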
EXAMPLE 12.13.2-For the entire soybean data, analyze the variance as follows:

Source of Variation        Degrees of Freedom   Sum of Squares   Mean Square

Treatments                         4                 380.29          95.07
Locations                          4              11,852.61       2,963.15
Interaction                       16                 685.63          42.85
Blocks in Locations               20               1,283.92
Experimental Error                80               2,223.68          27.80
Blocks and Experimental Error are pooled values from the analyses of the five places.
EXAMPLE 12.13.3-Isolate the sum of squares for the planned comparison, Un-
treated vs. Average of the four Treatments. Ans. 171.70, F = 4.01, F.05 = 4.49.

12.14-Experiments with perennial crops. When a perennial crop is
investigated over a number of years, the yields from the same plot in suc-
cessive years are usually correlated. The experimental error in one season
is not independent of that in another season.
In comparing the overall yields of the treatments, this difficulty is
overcome by first finding for each plot the total yield over all years. These
totals are analyzed by the method appropriate to the design that was used.
This method provides a valid error for testing the overall treatment effects.
For illustration, the data in table 12.14.1 are taken from an experi-
ment by Haber (19) to compare the effects of various cutting treatments on
asparagus. Planting was in 1927 and cutting began in 1929. One plot
in each block was cut until June 1 in each year, others to June 15, July 1,
and July 15. The yields are for the four succeeding years, 1930, 1931,
1932, and 1933. The yields are the weights cut to June 1 in every plot,
irrespective of later cuttings in some of them. This weight is a measure
of vigor, and the objective is to compare the relative effectiveness of the
different harvesting plans.
A glance at the four-year totals (5,706; 5,166; 4,653; 3,075) leaves
little doubt that prolonged cutting decreased the vigor. The cutting totals
were separated into linear, quadratic, and cubic components of the regres-
TABLE 12.14.1
WEIGHT (OUNCES) OF ASPARAGUS CUT BEFORE JUNE 1 FROM PLOTS WITH
VARIOUS CUTTING TREATMENTS

                              Cutting Ceased
Blocks   Year      June 1   June 15   July 1   July 15     Total

1        1930        230       212      183       148        773
         1931        324       415      320       246      1,305
         1932        512       584      456       304      1,856
         1933        399       386      255       144      1,184
                   1,465     1,597    1,214       842      5,118

2        1930        216       190      186       126        718
         1931        317       296      295       201      1,109
         1932        448       471      387       289      1,595
         1933        361       280      187        83        911
                   1,342     1,237    1,055       699      4,333

3        1930        219       151      177       107        654
         1931        357       278      298       192      1,125
         1932        496       399      427       271      1,593
         1933        344       254      239        90        927
                   1,416     1,082    1,141       660      4,299

4        1930        200       150      209       168        727
         1931        362       336      328       226      1,252
         1932        540       485      462       312      1,799
         1933        381       279      244       168      1,072
                   1,483     1,250    1,243       874      4,850

Total              5,706     5,166    4,653     3,075     18,600

                 Degrees of Freedom   Sum of Squares   Mean Square

Blocks                   3                30,170
Cuttings:               (3)             (241,377)
  Linear                 1               220,815**
  Quadratic              1                16,835*
  Cubic                  1                 3,727
Error                    9                                   2,429
379
sion on duration of cutting. The significant quadratic component indi-
cates that the yields fall off more and more rapidly as the severity of
cutting increases.
Such experiments also contain information about the constancy of
treatment differences from year to year, as indicated by the Treatments x
Years interactions. Often it is useful to compute on each plot the linear
regression of yield on years, multiplying the yields in the four years by
-3, -1, +1, +3 and adding. These linear regressions (with an appro-
priate divisor) measure the average rate of improvement of yield from
year to year. An analysis of the linear regressions for the asparagus data
appears in table 12.14.2. From the totals for each treatment it is evident
that the improvement in yield per year is greatest for the June 1 cutting,
and declines steadily with increased severity of cutting, the July 15 cutting
showing only a modest total, 119.

TABLE 12.14.2
ANALYSIS OF THE LINEAR REGRESSION OF YIELD ON YEARS

                        Cutting Ceased
Blocks      June 1    June 15    July 1    July 15      Total

1             695*       691       352        46        1,784
2             566        445        95       -41        1,065
3             514        430       315        28        1,287
4             721        536       239        86        1,582

Total       2,496      2,102     1,001       119        5,718

             Degrees of Freedom   Sum of Squares   Mean Square

Blocks               3                 3,776
Cuttings:           (3)               43,633         14,544**
  Linear             1                42,354**
  Quadratic          1                   744
  Cubic              1                   536
Error                9                 2,236            248

* 695 = 3(399) + 512 - 324 - 3(230), from table 12.14.1.

In the analysis of variance of these linear regression terms, the sum
of squares between cuttings has been subdivided into its linear, quadratic,
and cubic regression on duration. Only the linear term was strongly
significant. Evidently, each additional two weeks of cutting produced
about the same decrease in the annual rate of improvement of yield.
In this analysis of variance an extra divisor 20 = 3² + 1² + 1² + 3²
was applied to each sum of squares, in order that the mean squares refer
to a single observation. Can you explain why the Error mean square, 248,
is so much smaller than the Error mean square for the four-year totals,
2,429? Features of this experiment have been discussed by Snedecor
and Haber (19, 20).
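The per-plot regressions and the Cuttings breakdown are easy to reproduce. A Python sketch (data keyed in from table 12.14.1; the layout and names are ours):

```python
# Linear regressions of yield on years for the asparagus data (sketch).
yields = {  # yields[block][cutting ceased] = weights in 1930..1933
 1: {'June 1': [230, 324, 512, 399], 'June 15': [212, 415, 584, 386],
     'July 1': [183, 320, 456, 255], 'July 15': [148, 246, 304, 144]},
 2: {'June 1': [216, 317, 448, 361], 'June 15': [190, 296, 471, 280],
     'July 1': [186, 295, 387, 187], 'July 15': [126, 201, 289,  83]},
 3: {'June 1': [219, 357, 496, 344], 'June 15': [151, 278, 399, 254],
     'July 1': [177, 298, 427, 239], 'July 15': [107, 192, 271,  90]},
 4: {'June 1': [200, 362, 540, 381], 'June 15': [150, 336, 485, 279],
     'July 1': [209, 328, 462, 244], 'July 15': [168, 226, 312, 168]}}
coef = [-3, -1, +1, +3]
cuttings = ['June 1', 'June 15', 'July 1', 'July 15']

# Linear regression value for each plot, then totals per cutting.
lin = {b: {c: sum(k*y for k, y in zip(coef, ys)) for c, ys in row.items()}
       for b, row in yields.items()}
totals = [sum(lin[b][c] for b in lin) for c in cuttings]
print(totals)                  # [2496, 2102, 1001, 119], as in table 12.14.2

# Cuttings S.S., with the extra divisor 20 = sum of squared coefficients:
G = sum(totals)
ss_cuttings = (sum(t*t for t in totals)/4 - G*G/16) / 20
print(round(ss_cuttings, 1))   # 43633.8, the 43,633 of the table
```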
REFERENCES
1. R. A. FISHER. The Design of Experiments. Oliver and Boyd, Edinburgh (1935-1951).
2. F. YATES. J. R. Statist. Soc. Supp., 2:210 (1935).
3. Southern Cooperative Series Bulletin No. 10, p. 114 (1951).
4. Rothamsted Experimental Station Report: 218 (1937).
5. A. R. SAUNDERS. Union of South Africa Dept. of Agriculture and Forestry Sci. Bul.
No. 200 (1939).
6. G. W. SNEDECOR. Proc. Int. Statist. Conferences, 3:440 (1947).
7. O. L. DAVIES (ed.). Design and Analysis of Industrial Experiments, 2nd ed. Oliver and
Boyd, Edinburgh (1956).
8. W. G. COCHRAN and G. M. COX. Experimental Designs. Wiley, New York (1957).
9. Iowa Agricultural Experiment Station, Animal Husbandry Swine Nutrition Experi-
ment No. 577 (1952).
10. F. YATES. "The Design and Analysis of Factorial Experiments," Commonwealth
Bureau of Soil Science Tech. Comm. 35 (1937).
11. W. G. COCHRAN. Biometrics, 7:17 (1951).
12. F. E. SATTERTHWAITE. Biometrics Bul., 2:110 (1946).
13. C. P. WILSIE. Iowa State College Agricultural Experiment Station (1944).
14. R. L. ANDERSON. Biometrics Bul., 2:41 (1946).
15. F. YATES and W. G. COCHRAN. J. Agric. Sci., 28:556 (1938).
16. O. KEMPTHORNE. The Design and Analysis of Experiments. Wiley, New York (1952).
17. W. G. COCHRAN. Biometrics, 10:101 (1954).
18. F. YATES, S. LIPTON, P. SINHA, and K. P. DASGUPTA. Emp. J. Exp. Agric., 27:263
(1959).
19. E. S. HABER and G. W. SNEDECOR. Amer. Soc. Hort. Sci., 48:481 (1946).
20. G. W. SNEDECOR and E. S. HABER. Biometrics Bul., 2:61 (1946).
21. R. H. PORTER. Cooperative Soybean Seed Treatment Trials, Iowa State University
Seed Laboratory (1943).
* CHAPTER THIRTEEN

Multiple regression

13.1-Introduction. The regression of Y on a single independent vari-
able (chapter 6) is often inadequate. Two or more X's may be available
to give additional information about Y by means of a multiple regression
on the X's. Among the principal uses of multiple regression are:
(1) Constructing an equation in the X's that gives the best prediction
of the values of Y.
(2) When there are many X's, finding the subset that gives the best
linear prediction equation. In predicting future weather conditions at an
airport, there may be as many as 50 available X-variables, which measure
different aspects of the present weather pattern at neighboring weather
stations. A prediction equation with 50 variables is unwieldy, and is un-
wise if many of the X-variables contribute nothing to improved accuracy
in the prediction. An equation based on the best three or four variables
might be a wise choice.
(3) In some studies the objective is not prediction, but instead to
discover which variables are related to Y, and, if possible, to rate the
variables in order of their importance.
Multiple regression is a complex subject. The calculations become
lengthy when there are numerous X-variables, and it is hard to avoid mis-
takes in computation. Standard electronic computer programs, now be-
coming more readily available, are a major help. Equally important is
an understanding of what a multiple regression equation means and what
it does not mean. Fortunately, much can be learned about the basis of
the computations and the pitfalls in interpretation by study of a regression
on two X-variables, which will be considered in succeeding sections before
proceeding to three or more X-variables.
13.2-Two independent variables. With only one X-variable, the
sample values of Y and X could be plotted as in figures 6.2.1 and 6.4.1,
which show both the regression line and the distributions of the individual
values of Y about the line. But if Y depends partly on X₁ and partly on
X₂ for its value, solid geometry instead of plane is required. Any observa-
tion now involves three numbers: the values of Y, X₁, and X₂. The pair
(X₁, X₂) can be represented by a point on graph paper. The values of Y
corresponding to this point are on a vertical axis perpendicular to the
graph paper. In the population these values of Y form a frequency dis-
tribution, so we must try to envisage a frequency distribution of Y on
each vertical axis. Each frequency distribution has a mean, the mean
value of Y for specified X₁, X₂. The surface determined by these means is
the regression surface. In this chapter the surface is a plane, since only
linear regressions on X₁ and X₂ are being studied.
The population regression plane is written

Y_R = α + β₁X₁ + β₂X₂,

where Y_R denotes the mean value of the frequency distribution of Y for
specified X₁, X₂. In mathematical notation, Y_R = E(Y|X₁, X₂).
What does β₁ measure? Suppose that the value of X₁ increases by 1
unit, while the value of X₂ remains unchanged. Y_R becomes

Y_R′ = α + β₁X₁ + β₁ + β₂X₂ = Y_R + β₁

Thus, β₁ measures the average or expected change in Y when X₁ increases
by 1 unit, X₂ remaining unchanged. For this reason β₁ is called the partial
regression coefficient of Y on X₁. Some writers use a more explanatory
symbol β_Y1·2 for β₁, the subscript ·2 being a reminder that X₂ also ap-
pears in the regression equation.
For given X₁, X₂, the individual values of Y vary about the regression
plane in a normal distribution with mean 0 and variance σ², sometimes
denoted by σ_Y·12². Hence, the model is

Y = α + β₁X₁ + β₂X₂ + ε,   ε = N(0, σ)    (13.2.1)
Given a sample of n values of (Y, X₁, X₂), the sample regression, the pre-
diction equation, is

Ŷ = a + b₁X₁ + b₂X₂    (13.2.2)

The values of a, b₁, and b₂ are chosen so as to minimize Σ(Y - Ŷ)², the
sum of squares of the n differences between the actual and the predicted
Y values. With our model, theory shows that the resulting estimates a,
b₁, b₂, and Ŷ are unbiased and have the smallest standard errors of any
unbiased estimates that are linear expressions in the Y's. The value of
a is given by the equation

a = Ȳ - b₁X̄₁ - b₂X̄₂    (13.2.3)

By substitution for a in (13.2.2) the fitted regression can be written

Ŷ = Ȳ + b₁x₁ + b₂x₂,    (13.2.4)

where x₁ = X₁ - X̄₁, as usual.
The b's satisfy the normal equations:

b₁Σx₁² + b₂Σx₁x₂ = Σx₁y    (13.2.5)
b₁Σx₁x₂ + b₂Σx₂² = Σx₂y    (13.2.6)

Solution of these equations by standard algebraic methods leads to the
formulas:

b₁ = {(Σx₂²)(Σx₁y) - (Σx₁x₂)(Σx₂y)}/D    (13.2.7)

b₂ = {(Σx₁²)(Σx₂y) - (Σx₁x₂)(Σx₁y)}/D,    (13.2.8)

where

D = (Σx₁²)(Σx₂²) - (Σx₁x₂)²    (13.2.9)
The illustration (table 13.2.1) is taken from an investigation (1) of
the source from which corn plants in various Iowa soils obtain their
phosphorus. The concentrations of inorganic (X₁) and organic (X₂)
phosphorus in the soils were determined chemically. The phosphorus
content Y of corn grown in the soils was also measured.
The familiar calculations under the table give the sample means and
the sums of squares and products of deviations from the means. Substi-
tution in (13.2.7) to (13.2.9) gives

D = (1,752.96)(3,155.78) - (1,085.61)² = 4,353,400

b₁ = {(3,155.78)(3,231.48) - (1,085.61)(2,216.44)}/4,353,400 = 1.7898

b₂ = {(1,752.96)(2,216.44) - (1,085.61)(3,231.48)}/4,353,400 = 0.0866

From (13.2.3), a is given by

a = 81.28 - (1.7898)(11.94) - (0.0866)(42.11) = 56.26

The multiple regression equation becomes

Ŷ = 56.26 + 1.7898X₁ + 0.0866X₂    (13.2.10)

The meaning is this: For each additional part per million of inorganic
phosphorus in the soil at the beginning of the growing season, the phos-
phorus in the corn increased by 1.7898 ppm, as against 0.0866 ppm for
each additional ppm of organic phosphorus. The suggestion is that the
inorganic phosphorus in the soil was the chief source of plant-available
phosphorus. This deduction needs further consideration (sections 13.3
and 13.5).
TABLE 13.2.1
INORGANIC PHOSPHORUS X₁, ORGANIC PHOSPHORUS X₂, AND ESTIMATED PLANT-AVAILABLE
PHOSPHORUS Y IN 18 IOWA SOILS AT 20°C. (PARTS PER MILLION)

Soil Sample      X₁      X₂       Y        Ŷ       Y - Ŷ

 1              0.4      53       64      61.6*      2.4*
 2              0.4      23       60      59.0       1.0
 3              3.1      19       71      63.4       7.6
 4              0.6      34       61      60.3       0.7
 5              4.7      24       54      66.7     -12.7
 6              1.7      65       77      64.9      12.1
 7              9.4      44       81      76.9       4.1
 8             10.1      31       93      77.0      16.0
 9             11.6      29       93      79.6      13.4
10             12.6      58       51      83.8     -32.8
11             10.9      37       76      79.0      -3.0
12             23.1      46       96     101.6      -5.6
13             23.1      50       77     101.9     -24.9
14             21.6      44       93      98.7      -5.7
15             23.1      56       95     102.4      -7.4
16              1.9      36       54      62.8      -8.8
17             26.8      58      168     109.2      58.8
18             29.9      51       99     114.2     -15.2

Sum           215.0     758    1,463   1,463.0       0.0

Mean           11.94    42.11    81.28

ΣX₁² =  4,321.02      ΣX₁X₂ = 10,139.50      ΣX₁Y = 20,706.20
  C  =  2,568.06        C   =  9,053.89        C  = 17,474.72
Σx₁² =  1,752.96      Σx₁x₂ =  1,085.61      Σx₁y =  3,231.48

ΣX₂² = 35,076.00      ΣX₂Y  = 63,825.00      ΣY²  = 131,299.00
  C  = 31,920.22        C   = 61,608.56        C  = 118,909.39
Σx₂² =  3,155.78      Σx₂y  =  2,216.44      Σy²  =  12,389.61

* The number of significant digits retained in the preceding calculations will affect these
columns by ±0.1 or ±0.2.


From the fitted regression (equation 13.2.10), the predicted value Ŷ
can be estimated for each soil sample in table 13.2.1. For example, for
soil 1,

Ŷ = 56.26 + 1.7898(0.4) + 0.0866(53) = 61.6 ppm

The observed value Y = 64 ppm deviates by 64 - 61.6 = +2.4 ppm from
the estimated regression value. The 18 values of Ŷ are recorded in table
13.2.1. The deviations Y - Ŷ are in the final column; they measure the
failure of the X's to predict Y.
The investigator now has the opportunity to examine the deviations
from regression. In part they might be associated with other variables not
included in the study. Or some explanation might be found for certain
deviations, especially the larger ones. Such explanation might be a valu-
able finding of the analysis, providing clues for further experimentation,
or it might lead to the rejection of one or more observations and to a
recalculation of the regression. In the present example the results for soil
17 immediately strike the eye. This soil has much the highest value of Y,
168. Before the regression was calculated, this value might not seem neces-
sarily out of line (though it should be verified from the records), because
soil 17 has the second highest value of both types of soil phosphorus,
which could account for the high plant phosphorus. But this soil also has
the highest deviation Y - Ŷ = +58.8. A test of this deviation will be
presented in section 13.5.
A check on the linearity of the regression is made by plotting two
scatter diagrams. First, plot the deviations Y - Ŷ against X₁, then plot
the same deviations against X₂. If the regression is markedly non-linear
in one of the X's, a curve instead of a horizontal straight line should be
detectable in the corresponding graph. For curved multiple regression,
see example 15.5.1.
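All of the above arithmetic can be verified with a few lines of code. The Python sketch below is our own (data keyed in from table 13.2.1); it applies formulas (13.2.3) through (13.2.9):

```python
# Two-variable regression for the phosphorus data of table 13.2.1 (sketch).
X1 = [0.4, 0.4, 3.1, 0.6, 4.7, 1.7, 9.4, 10.1, 11.6,
      12.6, 10.9, 23.1, 23.1, 21.6, 23.1, 1.9, 26.8, 29.9]
X2 = [53, 23, 19, 34, 24, 65, 44, 31, 29, 58, 37, 46, 50, 44, 56, 36, 58, 51]
Y  = [64, 60, 71, 61, 54, 77, 81, 93, 93, 51, 76, 96, 77, 93, 95, 54, 168, 99]

n = len(Y)
m1, m2, my = sum(X1)/n, sum(X2)/n, sum(Y)/n
S11 = sum((x - m1)**2 for x in X1)                      # 1,752.96
S22 = sum((x - m2)**2 for x in X2)                      # 3,155.78
S12 = sum((a - m1)*(b - m2) for a, b in zip(X1, X2))    # 1,085.61
S1y = sum((a - m1)*(y - my) for a, y in zip(X1, Y))     # 3,231.48
S2y = sum((b - m2)*(y - my) for b, y in zip(X2, Y))     # 2,216.44

D  = S11*S22 - S12**2                                   # (13.2.9)
b1 = (S22*S1y - S12*S2y) / D                            # (13.2.7), 1.7898
b2 = (S11*S2y - S12*S1y) / D                            # (13.2.8), 0.0866
a  = my - b1*m1 - b2*m2                                 # (13.2.3), 56.26
print(f"Y-hat = {a:.2f} + {b1:.4f} X1 + {b2:.4f} X2")

dev = [y - (a + b1*x1 + b2*x2) for x1, x2, y in zip(X1, X2, Y)]
print(round(dev[16], 1))      # soil 17: about +58.8, the largest deviation
```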
13.3-The deviations mean square and the F-test. In the multiple
regression model, the deviations of the Y's from the population regression
plane have mean 0 and variance σ². An unbiased estimate of σ² is
s² = Σ(Y - Ŷ)²/(n - k), where n is the size of sample and k is the number
of parameters that have been estimated in fitting the regression. In the
example n = 18 with 3 parameters α, β₁, β₂, giving n - k = 15.
The deviations sum of squares Σ(Y - Ŷ)² can be computed in two
ways. If the individual deviations have been tabulated as in the last
column of table 13.2.1, their sum of squares is run up directly, giving
Σ(Y - Ŷ)² = 6,414.5.
In practice a quicker method, based on an algebraic identity, is used.
From equation (13.2.4) we had

Ŷ = Ȳ + b₁x₁ + b₂x₂

Since the sample means of x₁ and x₂ are both zero, the sample mean of
the fitted values Ŷ is Ȳ. Write ŷ = Ŷ - Ȳ and d = Y - Ŷ, so that d
represents the observed deviation of Y from the fitted regression at this
point. It follows that

y = Y - Ȳ = (Ŷ - Ȳ) + (Y - Ŷ) = ŷ + d.    (13.3.1)

Two important results, proved later in this section, are, first,

Σy² = Σŷ² + Σd²    (13.3.2)

This result states that the sum of squares of deviations of the Y's from their
mean splits into two parts: (i) the sum of squares of deviations of the
fitted values from their mean, and (ii) the sum of squares of deviations
from the fitted values. The sum of squares Σŷ² is appropriately called
"the sum of squares due to regression." In geometrical treatments of
multiple regression, the relation (equation 13.3.2) may be shown to be
an extension of Pythagoras' theorem to more than two dimensions.
The second result, of more immediate interest, is:

S.S. due to regression = Σŷ² = b₁Σx₁y + b₂Σx₂y    (13.3.3)

Hence, the sum of squares of deviations from the regression may be ob-
tained by subtracting from Σy² the sum of products of the b's with the
right sides of the corresponding normal equations. For the example we
have,

Σŷ² = (1.7898)(3,231.48) + (0.0866)(2,216.44) = 5,975.6

The value of Σd² is then

Σd² = Σy² - Σŷ² = 12,389.6 - 5,975.6 = 6,414.0

Besides being quicker, this method is less subject to rounding errors than
the direct method. Agreement of the two methods is an excellent check
on the regression computations.
The mean square of the deviations is 6,414.0/15 = 427.6, with 15 df.
The corresponding standard error, √427.6 = 20.7, provides a measure
of how closely the regression fits the data. If the purpose is to find a more
accurate method of predicting Y, the size of this standard error is of
primary importance. For instance, if current methods of predicting some
critical temperature can do this with a standard error of 3.2 degrees, while
a multiple regression gives a standard error of 4.7 degrees, it is obvious
that the regression is no improvement on the current methods, though it
might, after further study, be useful in conjunction with the current
methods.
Sometimes the object of the regression analysis is to understand why
Y varies, where the X's measure variables that are thought to influence Y
through some causal mechanism. For instance, Y might represent the
yields of a crop grown on the same field for a number of years under uni-
form husbandry, while the X's measure aspects of weather or insect in-
festation that influence crop yields (2). In such cases, it is useful to com-
pare the Deviations mean square, Σd²/(n - k), with the original mean
square of Y, namely Σy²/(n - 1). In our example the Deviations mean
square is 427.6, while the original mean square is 12,389.61/17 = 728.8.
The ratio, 427.6/728.8 = 0.59, estimates the fraction of the variance
of Y that is not attributable to the multiple regression, while its comple-
ment, 0.41, estimates the fraction that is "explained" by the X-variables.
Even if the regression coefficients are clearly statistically significant, it is
not uncommon to find that the fraction of the variance of Y attributable
to the regression is much less than 1/2. This indicates that most of the
variation in Y must be due to variables not included in the regression.
In some studies the investigator is not at all confident initially that
any of the X's are related to Y. In this event an F-test of the null hypothesis
β₁ = β₂ = 0 is helpful. The test is made from the analysis of variance in
TABLE 13.3.1
ANALYSIS OF VARIANCE OF PHOSPHORUS DATA

Source of Variation   Degrees of Freedom   Sum of Squares     Mean Square     F

Regression                    2            Σŷ² =  5,975.6       2,987.8      6.99**
Deviations                   15            Σd² =  6,414.0         427.6

Total                        17            Σy² = 12,389.6         728.8

table 13.3.1. F is the ratio of the mean square due to regression to the
Deviations mean square.
The F-value, 6.99, with 2 and 15 df, is significant at the 1% level.
By an extension of this analysis, tests of significance of the individual
b's can be made. We have (from table 13.2.1)

Σx₁y = 3,231.48;  Σx₁² = 1,752.96;  Σx₂y = 2,216.44;  Σx₂² = 3,155.78

If we had fitted a regression of Y on X₁ alone, the regression coefficient
would be b_Y1 = 3,231.48/1,752.96 = 1.8434. The reduction in sum of
squares due to this regression would be (Σx₁y)²/Σx₁² = (3,231.48)²/
(1,752.96) = 5,957.0, with 1 df. When both X₁ and X₂ were included in
the regression, the reduction in sum of squares was 5,975.6, with 2 df
(table 13.3.1). The difference, 5,975.6 - 5,957.0 = 18.6, with 1 df, mea-
sures the additional reduction due to the inclusion of X₂, given that X₁
is already present, or in other words the unique contribution of X₂ to
the regression. The null hypothesis β₂ = 0 is tested by computing
F = 18.6/427.6 = 0.04, with 1 and 15 df, where 427.6 is the deviations
mean square. The test is shown in table 13.3.2. Since F is small, the null
hypothesis is not rejected.
Similarly, the null hypothesis β₁ = 0 is tested by finding the addi-
tional reduction in sum of squares due to the inclusion of X₁ in the regres-
TABLE 13.3.2
TEST OF EACH X AFTER THE EFFECT OF THE OTHER HAS BEEN REMOVED

                              Degrees of                                Mean
Source of Variation           Freedom     Sum of Squares               Square      F

Regression on X₁ and X₂           2       Σŷ² = 5,975.6
Regression on X₁ alone            1       (Σx₁y)²/Σx₁² = 5,957.0
X₂ after X₁                       1                18.6                  18.6     0.04

Regression on X₁ and X₂           2       Σŷ² = 5,975.6
Regression on X₂ alone            1       (Σx₂y)²/Σx₂² = 1,556.7
X₁ after X₂                       1             4,418.9               4,418.9    10.30**

Deviations                       15             6,414.0                 427.6
3" Chapter 13: Multiple 1I_...ion
In this case F = 10.33 is significant at the 1% level.
This method of testing a partial regression coefficient may appear
strange at first, but is very general. If βₖ = 0 when there are k X-variables,
this means that the true model contains only X₁ … Xₖ₋₁. We fit a regression
on X₁ … Xₖ₋₁, obtaining the reduction in sum of squares, Rₖ₋₁.
Then we fit a regression on X₁ … Xₖ, obtaining the reduction Rₖ. If
βₖ = 0, it can be proved that (Rₖ − Rₖ₋₁) is simply an estimate of σ², so
that F = (Rₖ − Rₖ₋₁)/s² should be about 1. If, however, βₖ is not zero,
the inclusion of Xₖ improves the fit and (Rₖ − Rₖ₋₁) tends to become
large, so that F tends to become large. Later, we shall see that the same
test can be made as a t-test of bₖ.
Incidentally, it is worth comparing b_Y1 = 1.8434 with the value
b₁ = b_Y1·2 = 1.7898 obtained when X₂ is included in the regression. Two
points are important. The value of the regression coefficient has changed.
In multiple regression, the value of any regression coefficient depends on
the other variables included in the regression. Statements made about
the size of a regression coefficient are not unique, being conditional on
these other variables. Secondly, in this case the change is small; this
gives some assurance that this regression coefficient is stable. With X₂,
we have b_Y2 = 2,216.44/3,155.78 = 0.7023, much larger than b₂ = b_Y2·1
= 0.0866.
The remainder of this section is devoted to proofs of the basic results
(13.3.2) and (13.3.3). Recall that
ŷ = Ŷ − Ȳ = b₁x₁ + b₂x₂ ;  y = ŷ + d ;  d = y − b₁x₁ − b₂x₂
Start with the normal equations:
b₁Σx₁² + b₂Σx₁x₂ = Σx₁y
b₁Σx₁x₂ + b₂Σx₂² = Σx₂y
These may be rewritten in the form
Σx₁(y − b₁x₁ − b₂x₂) = Σx₁d = 0    (13.3.4)
Σx₂(y − b₁x₁ − b₂x₂) = Σx₂d = 0    (13.3.5)
These results show that the deviations d have zero sample correlations
with any X-variable. This is not surprising, since d represents the
part of Y that is not linearly related either to X₁ or to X₂.
Multiply (13.3.4) by b₁ and (13.3.5) by b₂ and add. Then
Σŷd = b₁Σx₁d + b₂Σx₂d = 0    (13.3.6)
Now
Σy² = Σ(ŷ + d)² = Σŷ² + 2Σŷd + Σd²
    = Σŷ² + Σd²
using (13.3.6). This proves the first result (13.3.2). To obtain the second
result, we have
Σŷ² = Σ(b₁x₁ + b₂x₂)² = b₁²Σx₁² + 2b₁b₂Σx₁x₂ + b₂²Σx₂²
Reverting to the normal equations, multiply the first one by b₁, the second
by b₂, and add. This gives
b₁²Σx₁² + 2b₁b₂Σx₁x₂ + b₂²Σx₂² = b₁Σx₁y + b₂Σx₂y
This establishes (13.3.3), the shortcut method of computing the reduction
Σŷ² in S.S. due to regression.
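These identities are easy to confirm numerically. The sketch below, assuming numpy is available, fits a two-variable regression to arbitrary illustrative data and checks (13.3.2) through (13.3.6).

import numpy as np

rng = np.random.default_rng(1)
x1 = rng.normal(size=12); x2 = rng.normal(size=12)
y = 2.0 * x1 + 0.5 * x2 + rng.normal(size=12)

# work with deviations from the sample means, as in the text
x1, x2, y = x1 - x1.mean(), x2 - x2.mean(), y - y.mean()

# solve the normal equations for b1, b2
S = np.array([[x1 @ x1, x1 @ x2], [x1 @ x2, x2 @ x2]])
rhs = np.array([x1 @ y, x2 @ y])
b = np.linalg.solve(S, rhs)

yhat = b[0] * x1 + b[1] * x2
d = y - yhat
print(np.allclose(x1 @ d, 0), np.allclose(x2 @ d, 0))  # (13.3.4), (13.3.5)
print(np.allclose(y @ y, yhat @ yhat + d @ d))          # (13.3.2)
print(np.allclose(yhat @ yhat, b @ rhs))                # (13.3.3)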
EXAMPLE 13.3.1-Here is a set of ten triplets for easy computation.

 X₁    X₂     Y         X₁    X₂     Y
 29     2    22         16     1    12
  1     4    26         26     1    13
  5     3    23         15     4    30
 27     1    11          6     2    12
 25     3    25         10     3    26

Sums:  ΣX₁ = 160,  ΣX₂ = 24,  ΣY = 200

(i) Calculate the regression. Ans. Ŷ = 0.241X₁ + 6.829X₂ − 0.239.
(ii) Predict the value of Y for the fourth member of the sample (X₁ = 27, X₂ = 1).
Ans. 13.07.
EXAMPLE 13.3.2-In the preceding example, compute the total S.S. of Y and the
S.S. due to regression. Hence, find the sum of squares of deviations. Ans. 35.0.
EXAMPLE 13.3.3-Show that after allowing for the effects of the other variable, both
X₁ and X₂ have a significant relation with Y.
EXAMPLE 13.3.4-Note that when X₁ is fitted alone, the regression coefficient is
negative; i.e., Y tends to decrease as X₁ increases. When X₂ is included, the coefficient b₁
becomes significantly positive. From the normal equations the following relation may be
proved:
b_Y1·2 = b_Y1 − b_Y2·1 b₂₁
where b₂₁ = Σx₁x₂/Σx₁² is the regression of X₂ on X₁. If b_Y2·1 is positive and b₂₁ is negative,
as in this example, the term −b_Y2·1 b₂₁ is positive. If this term is large enough it can
change a negative b_Y1 into a positive b_Y1·2.
13.4-Alternative method of calculation. The inverse matrix. For
many purposes, including the construction of confidence intervals for the
β's and the making of comparisons among the b's, some additional quantities
must be computed. If it is known that these will be needed, the
calculations given in preceding sections are usually altered slightly, as will
be described.
On the left side of the normal equations, the quantities Σx₁², Σx₁x₂,
and Σx₂² appear. The array

( Σx₁²    Σx₁x₂ )
( Σx₁x₂   Σx₂²  )
is called a matrix with 2 rows and 2 columns, the matrix of sums of
squares and products. Mathematicians have defined the inverse of this
matrix, this being an extension to two dimensions of the concept of the
reciprocal of a number. The inverse is also a 2 × 2 matrix:

( c₁₁   c₁₂ )
( c₂₁   c₂₂ )

The elements cᵢⱼ, called also the Gauss multipliers, are found by solving
two sets of equations:

First Set                            Second Set
c₁₁Σx₁² + c₁₂Σx₁x₂ = 1               c₂₁Σx₁² + c₂₂Σx₁x₂ = 0
c₁₁Σx₁x₂ + c₁₂Σx₂² = 0               c₂₁Σx₁x₂ + c₂₂Σx₂² = 1

The left side of each set is the same as that of the normal equations. The
right sides are 1, 0 or 0, 1, respectively. The first set gives c₁₁ and c₁₂,
the second set c₂₁ and c₂₂. It is easy to show that c₁₂ = c₂₁.
In the 2 × 2 case the solutions are:
c₁₁ = Σx₂²/D ;  c₁₂ = c₂₁ = −Σx₁x₂/D ;  c₂₂ = Σx₁²/D,
where, as before,
D = (Σx₁²)(Σx₂²) − (Σx₁x₂)²
Note that the numerator of c₁₁ is Σx₂², not Σx₁². Note also the negative
sign in c₁₂.
In the example, the matrix of sums of squares and products was

( 1,752.96   1,085.61 )
( 1,085.61   3,155.78 )
with D = 4,353,400. This gives
c₁₁ =  3,155.78/4,353,400 =  0.0007249
c₁₂ = −1,085.61/4,353,400 = −0.0002494
c₂₂ =  1,752.96/4,353,400 =  0.0004027
From the c's, the b's are obtained as the sums of products of the c's
with the right sides of the normal equations, as follows:
b₁ = c₁₁Σx₁y + c₁₂Σx₂y
   = (0.0007249)(3,231.48) + (−0.0002494)(2,216.44) = 1.7897    (13.4.1)
b₂ = c₂₁Σx₁y + c₂₂Σx₂y
   = (−0.0002494)(3,231.48) + (0.0004027)(2,216.44) = 0.0866    (13.4.2)
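For readers with access to a computer, the same arithmetic can be expressed as a matrix inversion. This sketch assumes numpy; it reproduces the c's and the b's quoted above.

import numpy as np

S = np.array([[1752.96, 1085.61],
              [1085.61, 3155.78]])   # matrix of sums of squares and products
rhs = np.array([3231.48, 2216.44])   # sums of products with y

C = np.linalg.inv(S)                 # the Gauss multipliers c_ij
b = C @ rhs                          # b's as products of c's and right sides

print(C)  # c11 = 0.0007249, c12 = c21 = -0.0002494, c22 = 0.0004027
print(b)  # b1 = 1.7897, b2 = 0.0866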
The main reason for finding the c's is that they provide the variances
and the covariance of the b's. The formulas are:
V(b₁) = c₁₁σ² ;  V(b₂) = c₂₂σ² ;  Cov(b₁, b₂) = c₁₂σ²
where σ² is the variance of the residuals of Y from the regression plane.


To summarize, if the c's are wanted, they are computed first from the
normal equations; then the b's are computed from the c's as above. The
deviations sum of squares and the analysis of variance follow as in section
13.3. Some uses of the c's are presented in the next section.
EXAMPLE 13.4.1-To prove the relations b₁ = c₁₁Σx₁y + c₁₂Σx₂y ; b₂ = c₂₁Σx₁y
+ c₂₂Σx₂y, use these relations to substitute for b₁ and b₂ in terms of the c's in the left side
of the first normal equation. Then show, by the first equation satisfied by the c's in each set,
that this left side equals Σx₁y. Similarly, you can show that the left side of the second
normal equation equals Σx₂y. This proves that the b's computed as above are solutions of
the normal equations.
EXAMPLE 13.4.2-Show (i) that b₁ and b₂ have zero correlation only if Σx₁x₂ = 0;
(ii) that, in this event, the regression coefficient of Y on X₁ is the same whether X₂ is included
in the regression or not. This is the condition that holds for the main effects of each factor
in a factorial experiment.

13.5-Standard errors of estimates in multiple regression. In section
13.3 we found that the deviations mean square s² was 427.6 with 15 d.f.,
giving s = 20.7. The standard errors of b₁ and b₂ are therefore
s_b1 = s√c₁₁ = 20.7√0.0007249 = (20.7)(0.0269) = 0.557    (13.5.1)
s_b2 = s√c₂₂ = 20.7√0.0004027 = (20.7)(0.0201) = 0.416    (13.5.2)
It can be proved that the quantity (bᵢ − βᵢ)/s_bi is distributed as t with
(n − k − 1) or 15 d.f. The null hypothesis βᵢ = 0 can be tested as usual:
t₁ = b₁/s_b1 = 1.7898/0.557 = 3.21**
t₂ = b₂/s_b2 = 0.0866/0.416 = 0.21
These t-tests are identical to the F-tests of the same hypotheses made in
table 13.3.2. Note that (3.21)² = 10.3 and (0.21)² = 0.04, these being,
apart from rounding, the two values of F found in table 13.3.2.
Evidently in the population of soils that were sampled, the fraction
of inorganic phosphorus is the better predictor of the plant-available
phosphorus. The experiment indicates "that soil organic phosphorus per
se is not available to plants. Presumably, the organic phosphorus is of
appreciable availability to plants only upon mineralization, and in the
experiments the rate of mineralization at 20°C. was too low to be of measurable
importance."
Confidence limits for any βᵢ are found as usual. For β₁, 95% limits are
b₁ ± t₀.₀₅ s_b1 = 1.790 ± (2.131)(0.557) = 0.60 and 2.98
Sometimes, comparisons among the bᵢ are of interest. The standard
error of any comparison ΣLᵢbᵢ is
s√(ΣLᵢ²cᵢᵢ + 2Σ LᵢLⱼcᵢⱼ)    (13.5.3)
where the second sum runs over all pairs i < j. For example, the standard
error of (b₁ − b₂) is
s√(c₁₁ + c₂₂ − 2c₁₂) = (20.7)√[0.0007249 + 0.0004027 − 2(−0.0002494)]
= (20.7)√0.0016264 = 0.835
When the regression is constructed for purposes of prediction, we
wish to know how accurately Ŷ predicts the population mean of Y for
specified values of X₁ and X₂. Call this mean μ. For instance, we might
predict the average weight of 11-year-old boys of specified height X₁ and
chest girth X₂. The formula for the estimated standard error of Ŷ = μ̂ is
s_Ŷ = s√(1/n + c₁₁x₁² + c₂₂x₂² + 2c₁₂x₁x₂)    (13.5.4)
Example: For the value of Ŷ at the point X₁ = 4.7, X₂ = 24 (soil sample
5 in table 13.2.1): x₁ = 4.7 − 11.9 = −7.2, x₂ = 24 − 42.1 = −18.1;
the standard error of the estimate is
(20.7)√[1/18 + (0.0007249)(−7.2)² + (0.0004027)(−18.1)²
+ 2(−0.0002494)(−7.2)(−18.1)] = ±8.25 ppm
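A short check of formulas 13.5.4 and 13.5.5 at soil sample 5, assuming the c's, means, and s quoted above (plain Python, math module only):

from math import sqrt

n, s = 18, 20.7
c11, c22, c12 = 0.0007249, 0.0004027, -0.0002494
x1, x2 = 4.7 - 11.9, 24 - 42.1          # deviations from the sample means

var_factor = 1/n + c11*x1**2 + c22*x2**2 + 2*c12*x1*x2
print(s * sqrt(var_factor))             # about 8.3 ppm (8.25 after rounding)

# for an individual new member (formula 13.5.5 below), add 1 inside the root
print(s * sqrt(1 + var_factor))         # about 22 ppm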
Alternatively, Ŷ may be used to predict the value of Y for an individual
new member Y′ of the population (that is, one not included in the
regression calculations). In this case,
s_Ŷ′ = s√(1 + 1/n + c₁₁x₁² + c₂₂x₂² + 2c₁₂x₁x₂)    (13.5.5)
This result is subject to the assumption that the new member comes
from the same population as the original data. Unless the predictions
satisfy this condition, the standard error should be regarded as tentative.
It will be too low if the passage of time or changes in the environment
have changed the values of the β's. If numerous predictions are being
made, a direct check on their accuracy should be made whenever possible.
Finally, the standard error of (Yᵢ − Ŷ), where Yᵢ is one of the observations
from which the regression was computed, is s√g, where
g = 1 − 1/n − c₁₁x₁² − c₂₂x₂² − 2c₁₂x₁x₂    (13.5.6)
However, if the deviation (Yᵢ − Ŷ) has aroused attention because it
looks suspiciously large, we cannot apply a t-test of the form
t = (Yᵢ − Ŷ)/s√g, for two reasons. The quantities (Yᵢ − Ŷ) and s are not
independent, since (Yᵢ − Ŷ)² is a part of the deviations S.S. Secondly, we
must allow for the fact that (Yᵢ − Ŷ) was picked out because it looks large.
A test can be made as follows. The quantity
s′² = [Σ(Y − Ŷ)² − (Yᵢ − Ŷ)²/g]/(n − k − 2)
can be shown to be the mean square of the deviations obtained if the
suspect Yᵢ is omitted when fitting the regression. If Yᵢ were a randomly
chosen observation, the quantity t′ = (Yᵢ − Ŷ)/s′√g would follow the
t-distribution with (n − k − 2) d.f. To make approximate allowance for
the fact that we selected the largest absolute deviation, we regard the
deviation as significant at the 5% level if t′ is significant at the level 0.05/n.
(This may require reference to detailed tables (3) of t.)
To illustrate, it was noted (section 13.2) that the deviation +58.8
for soil 17 is outstanding. The value of g is found to be 0.80047, while
Σ(Y − Ŷ)² is 6,414 (section 13.3) with 15 d.f. Hence,
s′² = [6,414 − (58.8)²/0.80047]/14 = (6,414 − 4,319)/14 = 150
t′ = (58.8)/√[(150)(0.80047)] = 5.36    (14 d.f.)


Since 0.05/18 = 0.0028, the question is whether a value of 5.36 exceeds
the 0.0028 level of t with 14 d.f. Appendix table A 4 shows that the 0.001
level of t is 4.140. The deviation is clearly significant after allowance for
the fact that it is the largest. If the regression is recomputed with soil 17
excluded, the main conclusion is not altered. The value of b₁ drops to
1.290 but remains significant, while b₂ becomes −0.111 (non-significant).
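The outlier computation for soil 17 can be re-traced as follows in plain Python; g, the deviations S.S., and the suspect residual are the values quoted above.

from math import sqrt

n, k = 18, 2
g = 0.80047                 # from formula 13.5.6 at soil 17
dev_ss = 6414.0             # deviations S.S. with 15 d.f.
resid = 58.8                # the suspect deviation

s2_prime = (dev_ss - resid**2 / g) / (n - k - 2)   # refit without soil 17
t_prime = resid / sqrt(s2_prime * g)

print(round(s2_prime))      # about 150
print(round(t_prime, 2))    # 5.37; the text's 5.36 uses s'^2 rounded to 150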
EXAMPLE 13.5.1-In the phosphorus data, set 95% confidence limits for β₂. Ans.
−0.79 to 0.97 ppm.
EXAMPLE 13.5.2-For a new soil having X₁ = 14.6, X₂ = 51, predict the value of Y′
and give the standard error of your prediction. Ans. Ŷ = 61.86, s.e. = ±21.5 ppm, using
formula 13.5.5.
EXAMPLE 13.5.3-If Yᵢ is one of the observations from which the regression was computed,
the variance of Yᵢ − Ŷ is (formula 13.5.6)
σ²g = σ²(1 − 1/n − c₁₁x₁² − c₂₂x₂² − 2c₁₂x₁x₂)
If this expression is added over all the n sample values, we get
σ²(n − 1 − c₁₁Σx₁² − c₂₂Σx₂² − 2c₁₂Σx₁x₂)
From the equations for the c's, show that the above equals σ²(n − 3). This is one way of
seeing that Σ(Y − Ŷ)² has (n − 3) d.f.

EXAMPLE 13.5.4-With soil 17 omitted, we have
ΣX₁ = 188.2 ;  ΣX₂ = 700 ;  ΣY = 1,295
Σx₁² = 1,519.30 ;  Σx₁x₂ = 835.69 ;  Σx₂² = 2,888.47
Σx₁y = 1,867.39 ;  Σx₂y = 757.47 ;  Σy² = 4,426.48
Solve the normal equations and verify that b₁ = 1.290, b₂ = −0.111, deviations S.S.
= 2,101.

13.6-The interpretation of regression coefficients. In the many areas
of research in which controlled experiments are not practicable, multiple
regression analyses are extensively used in attempts to disentangle and
measure the effects of different X-variables on some response Y. There
are, however, important limitations on what can be learned from this
technique in observational studies. While the discussion will be given
for a regression on two X-variables, the conclusions apply also when there
are more than two. The multiple linear regression model on which the
analysis is based is
Y = α + β₁X₁ + β₂X₂ + ε    (13.6.1)
where the residuals ε are assumed to be distributed, independently of the
X's, with zero mean and variance σ². (The assumption of normality of
the ε's is required for tests of significance, but not for the other standard
properties of regression estimates.) We assume that the X's remain fixed
in repeated sampling.
In an observational study the investigator looks for some suitable
source in which he can measure or record a sample of the triplets
(X₁, X₂, Y). He may try to select the pairs (X₁, X₂) according to some
plan, for instance so as to ensure that both X's vary over a substantial
range and that their correlation is not too high, though he is limited in this
respect by what the available source can provide.
Difficulty arises because he can never be sure that there are not other
X-variables related to Y in the population sampled. These may be variables
that he thinks are unimportant, variables that are not feasible to
measure or record, or variables unknown to him. Consequently, instead
of (13.6.1) the correct regression model is likely to be of the form
Y = α + β₁X₁ + β₂X₂ + β₃X₃ + … + βₖXₖ + ε
where X₃ … Xₖ represent these additional variables, and k may be fairly
large. To keep the algebra simple we replace the additional terms in the
model, β₃X₃ + … + βₖXₖ, by a single term β₀X₀, which stands for the
joint effect of all the terms omitted from the two-variable model. Thus
the correct model is
Y = α + β₁X₁ + β₂X₂ + β₀X₀ + ε′    (13.6.2)
where ε′ represents that part of Y that is distributed independently of
X₁, X₂, and X₀.
The investigator computes the sample regression of Y on X₁ and X₂
as in preceding sections, obtaining the regression coefficients b₁ and b₂.
Under the correct model (13.6.2), it will be proved later that b₁ is an unbiased
estimate, not of β₁, but of
β₁ + β₀b₀₁·₂    (13.6.3)
where b₀₁·₂ is the sample regression coefficient of X₀ on X₁, after allowing
for the effects of X₂. Clearly, b₁ may be either an overestimate or an
underestimate of β₁. Since the bias in b₁ depends on variables that have
not been measured, it is hard to form a judgment about the amount of bias.
For example, an investigator might try to estimate the effects of
nitrogen and phosphorus fertilizers on the yield of a common farm
crop by taking a sample of farms. On each field he records the crop
yield Y at the most recent harvest and the amounts X₁, X₂ of N and P per
acre applied in that field. If, however, substantial amounts of fertilizer
are used mainly by the more competent farmers, the fields on which X₁
and X₂ have high values will, in general, have better soil, more potash
fertilizer, superior drainage and tillage, more protection against insect
and crop damage, and so on. If β₀X₀ denotes the combined effect of these
variables on yield, X₀ will be positively correlated with X₁ and X₂, so that
b₀₁·₂ will be positive. Further, β₀ will be positive if these practices increase
yields. Thus the regression coefficients b₁ and b₂ will overestimate
the increase in yield caused by additional amounts of N and P. This type
of overestimation is likely to occur whenever the beneficial effects of an
innovation in some process are being estimated by regression analysis, if
the more capable operators are the ones who try out the innovation.
When the purpose is to find a regression formula that predicts Y
accurately rather than to interpret individual regression coefficients, the
bias in b₁ may actually be advantageous. Insofar as the unknown variables
in X₀ are good predictors of Y and are stably related to X₁, the regression
value of b₁ is in effect trying to improve the prediction by capitalizing
on these relationships. This can be seen from an artificial example (in
which X₂ is omitted for simplicity). Suppose that the correct model is
Y = 1 + 3X₀. This implies that in the correct model (i) X₁ is useless as a
predictor, since β₁ = 0; (ii) if X₀ could be measured, it would give perfect
predictions, since the model has no residual term ε. In the data (table
13.6.1), we have constructed an X₁ that is highly correlated with X₀.
You may check that the prediction equation based on the regression of
Y on X₁,
Ŷ₁ = 2.5 + 3.5X₁
gives good, although not perfect, predictions. Since β₀ = 3, b₀₁ = 7/6,
b₁ = 7/2, the relation b₁ = β₀b₀₁ is also verified.
TABLE 13.6.1
ARTIFICIAL EXAMPLE TO ILLUSTRATE PREDICTION FROM AN INCOMPLETE REGRESSION MODEL

Observation    X₀    Y = 1 + 3X₀    X₁     Ŷ₁      Y − Ŷ₁
     1          1         4          0     2.5     +1.5
     2          2         7          2     9.5     −2.5
     3          4        13          3    13.0      0.0
     4          6        19          5    20.0     −1.0
     5          7        22          5    20.0     +2.0
Sum            20        65         15    65.0      0.0
Mean            4        13          3    13.0      0.0

Σx₁² = 18,  Σx₁y = 63,  Σx₀x₁ = 21,  b₁ = 63/18 = 3.5,  b₀₁ = 21/18 = 7/6
To return to studies in which the sizes of the regression coefficients
are of primary interest, a useful precaution is to include in the regression
any X-variable that seems likely to have a material effect on Y, even
though this variable is not of direct interest. Note from formula 13.6.3
that no contribution to the bias in b₁ comes from β₂, since X₂ was included
in the regression. Another strategy is to find, if possible, a source
population in which X-variables not of direct interest have only narrow
ranges of variation. The effect is to decrease b₀₁·₂ (see example 13.6.1)
and hence lessen the bias in b₁. It also helps if the study is repeated in
diverse populations that are subject to different X₀ variables. The finding
of stable values for b₁ and b₂ gives reassurance that the biases are not
major.
In many problems the variables X₁ and X₂ are thought to have causal
effects on Y. We would like to learn by how much Y will be increased
(if beneficial) or decreased (if harmful) by a given change ΔX₁ in X₁.
The estimate of this amount suggested by the multiple regression equation
is b₁ΔX₁. As we have seen, this quantity is actually an estimate of
(β₁ + β₀b₀₁·₂)ΔX₁. Further, while we may be able to impose a change of
amount ΔX₁ in X₁, we may be unable to control other consequences of
this change. These consequences may include changes ΔX₂ in X₂ and
ΔX₀ in X₀. Thus the real effect of a change ΔX₁ may be, from model 13.6.2,
β₁ΔX₁ + β₂ΔX₂ + β₀ΔX₀    (13.6.4)
whereas our estimate of this amount, which assumes that X₁ can be
changed without producing a change in X₂ and ignores the unknown variables,
approximates (β₁ + β₀b₀₁·₂)ΔX₁. If enough is known about the
situation, a more realistic mathematical model can be constructed, perhaps
involving a system of equations or path analysis (26, 27). In this
way a better estimate of 13.6.4 might be made, but estimates of this type
are always subject to hazard. As Box (4) has remarked, in an excellent
discussion of this problem in industrial work, "To find out what happens
to a system when you interfere with it you have to interfere with it (not
just passively observe it)."
To sum up, when it is important to find some way of increasing or decreasing
Y, multiple regression analyses provide indications as to which
X-variables might be changed to accomplish this end. Our advance estimates
of the effects of such changes on Y, however, may be wrong by
substantial amounts. If these changes are to be imposed, we should plan,
whenever feasible, a direct study of the effects of the changes on Y so
that false starts can be corrected quickly.
In controlled experiments these difficulties can be largely overcome.
The investigator is able to impose the changes (treatments) whose effects
he wishes to measure and to obtain direct measurements of their effects.
The extraneous and unknown variables represented by X₀ are present just
as in observational studies. But the device of randomization (5, 6) makes
X₀ in effect independent of X₁ and X₂ in the probability sense. Thus X₀
acts like the residual term ε in the standard regression model and the
assumptions of this model are more nearly satisfied. If the effects of X₀
are large, the Deviations mean square, which is used as the estimate of
error, will be large, and the experiment may be too imprecise to be useful.
A large error variance should lead the investigator to study the uncontrolled
sources of variation in order to find a way of doing more accurate
experimentation.
We conclude this section with a proof of the result (equation 13.6.3);
namely, that if a regression of Y on X₁ and X₂ is computed under the model
Y = α + β₁X₁ + β₂X₂ + β₀X₀ + ε′,  E(ε′) = 0,
then
E(b₁) = β₁ + β₀b₀₁·₂    (13.6.3)
The result is important in showing that a regression coefficient is free from
any bias due to other X's like X₂ that are included in the fitted regression,
but is subject to bias from X's that were omitted. Since it is convenient
to work with deviations from the sample means, note that from the model,
we have
y = β₁x₁ + β₂x₂ + β₀x₀ + (ε′ − ε̄′)    (13.6.5)
Now,
b₁ = c₁₁Σx₁y + c₁₂Σx₂y
Substitute for y from 13.6.5:
b₁ = c₁₁Σx₁(β₁x₁ + β₂x₂ + β₀x₀ + ε′ − ε̄′)
   + c₁₂Σx₂(β₁x₁ + β₂x₂ + β₀x₀ + ε′ − ε̄′)
When we average over repeated samples, all terms in ε′, like c₁₁Σx₁ε′,
vanish because ε′ has mean zero independently of X₁, X₂, and X₀. Collect
terms in β₁, β₂, and β₀:
E(b₁) = β₁(c₁₁Σx₁² + c₁₂Σx₁x₂) + β₂(c₁₁Σx₁x₂ + c₁₂Σx₂²)
      + β₀(c₁₁Σx₁x₀ + c₁₂Σx₂x₀)
From the first set of equations satisfied by c₁₁ and c₁₂ (section 13.4), the
coefficient of β₁ is 1 and that of β₂ is 0.
What about the coefficient of β₀? Notice that it resembles
c₁₁Σx₁y + c₁₂Σx₂y = b₁,
except that x₀ has replaced y. Hence, the coefficient of β₀ is the regression
coefficient b₀₁·₂ of X₀ on X₁ that would be obtained by computing the
sample regression of X₀ on X₁ and X₂. This completes the proof.
EXAMPLE 13.6.1-This illustrates the result that when there are omitted variables
denoted by X₀, the bias that they create in b₁ depends both on the size β₀ of their effect on
Y and on the extent to which X₀ varies. Let Y = X₁ + X₀, so that β₁ = β₀ = 1. In sample 1,
X₁ and X₀ have the same distribution. Verify that b₁ = 2. In sample 2, X₁ and X₀ still have
a perfect correlation but the variance of X₀ is greatly reduced. Verify that b₁ is now 1.33,
giving a much smaller bias. Of course, steps that reduce the correlation between X₁ and X₀
are also helpful.

      Sample 1                  Sample 2
  X₁    X₀     Y            X₁    X₀     Y
  −6    −6   −12            −6    −2    −8
  −3    −3   − 6            −3    −1    −4
   0     0     0             0     0     0
   0     0     0             0     0     0
   9     9    18             9     3    12
Sum 0    0     0         Sum 0     0     0

In each sample Σx₁² = 126; Σx₁y = 252 in sample 1 and 168 in sample 2.

13.7-Relative importance of different X-variables. In a multiple-regression
analysis the question may be asked: Which X-variables are
most important in determining Y? Usually, no unique or fully satisfactory
answer can be given, but several approaches have been tried. Consider
first the situation in which the objective is to predict Y or to "explain"
the variation in Y. The problem would be fairly straightforward if the
X-variables were independent. From the model
Y = α + β₁X₁ + β₂X₂ + … + βₖXₖ + ε
we have, in the population,
σ_Y² = β₁²σ₁² + β₂²σ₂² + … + βₖ²σₖ² + σ²
where σᵢ² denotes the variance of Xᵢ. The quantity βᵢ²σᵢ²/σ_Y² measures
the fraction of the variance of Y attributable to its linear regression on Xᵢ.
This fraction can be reasonably regarded as a measure of the relative
importance of Xᵢ. With a random sample from this population, the
quantities bᵢ²Σxᵢ²/Σy² are sample estimates of these fractions. (In small
samples a correction for bias might be advisable, since bᵢ²Σxᵢ²/Σy² is not
an unbiased estimate of βᵢ²σᵢ²/σ_Y².)
The square roots of these quantities, bᵢ√(Σxᵢ²/Σy²), called the standard
partial regression coefficients, have sometimes been used as measures
of relative importance, the X's being ranked in order of the sizes of these
coefficients (ignoring sign). The quantity √(Σxᵢ²/Σy²) is regarded as a
correction for scale. The coefficient estimates βᵢσᵢ/σ_Y, the change in Y,
as a fraction of σ_Y, produced by one S.D. change in Xᵢ.
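As an illustration, the standard partial regression coefficients for the two-variable phosphorus regression of the earlier sections can be computed directly; the sketch below, in plain Python, assumes the sums of squares quoted in section 13.3.

from math import sqrt

b1, b2 = 1.7898, 0.0866
sx1x1, sx2x2, syy = 1752.96, 3155.78, 12389.6

print(b1 * sqrt(sx1x1 / syy))   # about 0.67 for X1
print(b2 * sqrt(sx2x2 / syy))   # about 0.04 for X2
# ranked by size (ignoring sign), X1 is by far the more important predictor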
In practice, correlations between the X's make the answer more
difficult. In many applications, X₁ and X₂ are positively correlated with
each other and with Y. For instance, X₁ and X₂ may be examination
scores that predict a student's ability to do well in a course, and Y his final
score in that course. To illustrate this case, table 13.7.1 shows the normal
equations, the b's, and the analysis of variance. As the example is constructed,
X₁ is a slightly better predictor than X₂, the two together accounting
for about 70% of the variation in Y (reduction due to regression
26.53 out of a total S.S. of 38.00).
As is typical in such applications, each variable's contribution to
Σy² is much greater when the variable is used alone than when it follows
the other variable. For X₁ the two sums of squares are 22.50 and 9.63,
respectively, while for X₂ they are 16.90 and 4.03. If the sums of squares
when X₁ and X₂ appear alone are taken to measure the contributions of
X₁ and X₂ to the variation in Y, the two contributions add to 39.40,
which is more than Σy² (38.00). On the other hand the sums of squares
9.63 and 4.03 greatly underestimate the joint contribution of X₁ and X₂.
Neither method of measuring the relative contribution is satisfactory.

TABLE 13.7.1
A COMMON SITUATION IN TWO-VARIABLE REGRESSION. ARTIFICIAL DATA

Normal equations:
10b₁ + 5b₂ = 15
5b₁ + 10b₂ = 13
c₁₁ = c₂₂ = 2/15 ;  c₁₂ = −1/15
b₁ = 17/15 ;  b₂ = 11/15

Source of Variation           Degrees of Freedom    Sum of Squares
Total                                52                  38.00
Regression on X₁ alone                1             (Σx₁y)²/Σx₁² = 15²/10 = 22.50
Regression on X₂ after X₁             1             b₂²/c₂₂ = 11²/30 = 4.03
Regression on X₂ alone                1             (Σx₂y)²/Σx₂² = 13²/10 = 16.90
Regression on X₁ after X₂             1             b₁²/c₁₁ = 17²/30 = 9.63
Deviations                           50                  11.47

Sometimes the investigator's question is: Is X₁ when used alone a
better predictor of Y than X₂ when used alone? In this case, comparison
of the numbers 22.50 and 16.90 is appropriate. An answer to the question
has been given by Hotelling (7) for two X-variables and extended by Williams
(8) to more than two.
In other applications there may be a rational way of deciding the
order in which the X's should be brought into the regression, so that their
contributions to Σy² add up to the correct combined contribution.
In his studies of the variation in the yields of wheat grown continuously
on the same plots for many years at Rothamsted, Fisher (2) postulated
the sources of variation in the following order: (1) a steady increase or
decrease in level of yield, measured by a linear regression on time; (2)
other slow changes in yields through time, represented by a polynomial in
time with terms in T², T³, T⁴, T⁵; (3) the effect of total annual rainfall on
the deviations of yields from the temporal trend; (4) the effect of the distribution
of rainfall throughout the growing season on the deviations
from the preceding regression.
Finally, if the purpose is to learn how to change Y in some population
by changing some X-variable, the investigator might estimate the sizes
ΔX₁, ΔX₂, etc., of the changes that he can impose on X₁ and X₂ in this
population by a given expenditure of resources. He might then rate the
variables in the order of the sizes of bᵢΔXᵢ, in absolute terms, these being
the estimated amounts of change that will be produced in Y. As we have
seen in the preceding section, this approach has numerous pitfalls.
13.8-Partial and multiple correlation. In a sample of 18-year-old
college freshmen, the variables measured might be height, weight, blood
pressure, basal metabolism, economic status, aptitude, etc. One purpose
might be to examine whether aptitude (Y) was linearly related to the
physiological measurements. If so, the regression methods of the preceding
sections would apply. But the objective might be to study the correlations
among such variables as height, weight, blood pressure, basal
metabolism, etc., among which no variables can be specified as independent
or dependent. In that case, partial correlation methods are appropriate.
You may recall that the ordinary correlation coefficient was closely
related to the bivariate normal distribution. With more than two variables,
an extension of this distribution called the multivariate normal
distribution (9) forms the basic model in correlation studies. A property
of the multivariate normal model is that any variable has a linear regression
on the other variables (or on any subset of the other variables), with
deviations that are normally distributed. Thus, the assumptions made
in multiple regression studies hold for a multivariate normal population.
If there are three variables, there are three simple correlations among
them, ρ₁₂, ρ₁₃, ρ₂₃. The partial correlation coefficient, ρ₁₂·₃, is the correlation
between variables 1 and 2 in a cross section of individuals all
having the same value of variable 3; the third variable is held constant so
that only 1 and 2 are involved in the correlation. In the multivariate
normal model, ρ₁₂·₃ is the same for every value of variable 3.
A sample estimate r₁₂·₃ of ρ₁₂·₃ can be obtained by calculating the
deviations d₁₃ of variable 1 from its sample regression on variable 3.
Similarly, find d₂₃. Then r₁₂·₃ is the simple correlation coefficient between
d₁₃ and d₂₃. The idea is to measure that part of the correlation between
variables 1 and 2 that is not simply a reflection of their relations with
variable 3. It may be shown that r₁₂·₃ satisfies the following formula:
r₁₂·₃ = (r₁₂ − r₁₃r₂₃)/√[(1 − r₁₃²)(1 − r₂₃²)]
Table A 11 is used to test the significance of r₁₂·₃. Enter it with (n − 3)
degrees of freedom, instead of (n − 2) as for a simple correlation coefficient.

In Iowa and Nebraska, a random sample of 142 older women was
drawn for a study of nutritional status (12). Three of the variables were
Age, Blood pressure, and the Cholesterol concentration in the blood. The
three simple correlations were
r_AB = 0.3332 ;  r_AC = 0.5029 ;  r_BC = 0.2495
Since high blood pressure might be associated with above-average
amounts of cholesterol in the walls of blood vessels, it is interesting to
examine r_BC. But it is evident that both B and C increase with age. Are
they correlated merely because of their common association with age or
is there a real relation at every age? The effect of age is eliminated by
calculating
r_BC·A = [0.2495 − (0.3332)(0.5029)]/√[(1 − 0.3332²)(1 − 0.5029²)] = 0.1233
With f = 142 − 3 = 139, this correlation is not significant. It may be that
within the several age groups blood pressure and blood cholesterol are
uncorrelated. At least, the sample is not large enough to detect the correlation
if it is present.
As another illustration, consider the consumption of protein and fat
among the 54 older women who came from Iowa. The simple correlations
were
r_AP = −0.4865 ;  r_AF = −0.5296 ;  r_PF = 0.5784
The third correlation shows that protein and fat occur together in diets,
while the first two correlations indicate the decreasing quantities of both
as age advances; both P and F depend on A. How closely do they depend
on each other at any one age?
r_PF·A = [0.5784 − (−0.4865)(−0.5296)]/√[(1 − 0.4865²)(1 − 0.5296²)] = 0.4328
Part of the relationship depends on age but part of it is inherent in the
ordinary composition of foods eaten.
To get a clearer notion of the way in which r_PF·A is independent of
age, consider the six women near 70 years of age. Their protein and fat
intakes were
P: 56, 47, 33, 39, 42, 38
F: 56, 83, 49, 52, 65, 52        r = 0.4194
The correlation is close to the average, r_PF·A = 0.4328. Similar correlations
would be found at other ages.
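The partial correlation formula is a one-line computation. A sketch in plain Python, checked against the protein-fat example above:

from math import sqrt

def partial_r(r12, r13, r23):
    """Correlation of variables 1 and 2 with variable 3 held constant."""
    return (r12 - r13 * r23) / sqrt((1 - r13**2) * (1 - r23**2))

# protein (P) and fat (F), eliminating age (A), from the Iowa sample
print(partial_r(0.5784, -0.4865, -0.5296))   # r_PF.A = 0.4328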
With four variables, the partial correlation coefficient between variables
1 and 2 can be computed after eliminating the effects of the other
variables, 3 and 4. The formula is
r₁₂·₃₄ = (r₁₂·₄ − r₁₃·₄r₂₃·₄)/√[(1 − r₁₃·₄²)(1 − r₂₃·₄²)]
or, alternatively,
r₁₂·₃₄ = (r₁₂·₃ − r₁₄·₃r₂₄·₃)/√[(1 − r₁₄·₃²)(1 − r₂₄·₃²)]
the two formulas giving identical results.
To test this quantity in table A 11, use (n − 4) degrees of freedom.
As we have stated, partial correlation does not involve the notion of
independent and dependent variables; it is a measure of interdependence.
On the other hand, the multiple correlation coefficient applies to the situation
in which one variable, say Y, has been singled out to examine its joint
relation with the other variables. In the population, the multiple correlation
coefficient between Y and X₁, X₂, …, Xₖ is defined as the simple
correlation coefficient between Y and its linear regression, β₁X₁ + …
+ βₖXₖ, on X₁ … Xₖ. Since it is hard to attach a useful meaning to the
sign of this correlation, most applications deal with its square. The sample
estimate R of a multiple correlation coefficient is, as would be expected,
the simple correlation between y and ŷ = b₁x₁ + … + bₖxₖ. This gives
R² = (Σyŷ)²/[(Σy²)(Σŷ²)]
In formula 13.3.6 (p. 388) it was shown that Σdŷ = 0, where d = y − ŷ.
It follows that Σyŷ = Σŷ². Hence,
R² = Σŷ²/Σy² ;  1 − R² = Σd²/Σy²
Thus, in the analysis of variance of a multiple regression, R² is the fraction
of the sum of squares of deviations of Y from its mean that is attributable
to the regression, while (1 − R²) is the fraction not associated
with the regression. This result is a natural extension of the corresponding
result (section 7.3) for a simple correlation coefficient. The test of the
null hypothesis that the multiple correlation in the population is zero is
identical to the F-test of the null hypothesis that β₁ = β₂ = … = βₖ = 0.
The relation is
F = (n − k − 1)R²/[k(1 − R²)],  with k and (n − k − 1) d.f.
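A short check of the relation between R² and F, in plain Python, assuming the phosphorus analysis of variance of table 13.3.1:

n, k = 18, 2
red_ss, total_ss = 5975.6, 12389.6

R2 = red_ss / total_ss
F = (n - k - 1) * R2 / (k * (1 - R2))

print(round(R2, 3))   # 0.482
print(round(F, 2))    # 6.99, with 2 and 15 d.f., as in table 13.3.1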
EXAMPLE 13.8.1-Brunson and Wilber (13) examined the correlations among ear
circumference E, cob circumference C, and number of rows of kernels K, calculated from
measurements of 900 ears of corn:
r_EC = 0.799 ;  r_EK = 0.570 ;  r_CK = 0.501
Among the ears having the same kernel number, what is the correlation between E and C?
Ans. r_EC·K = 0.720.
EXAMPLE 13.8.2-Among ears of corn having the same circumference, is there any
correlation between C and K? Ans. r_CK·E = 0.105.
EXAMPLE 13.8.3-In a random sample of 54 Iowa women (12), the intake of two
nutrients was determined together with age and the concentration of cholesterol in the blood.
If P symbolizes protein, F fat, A age, and C cholesterol, the correlations are as follows:

        A           P           F
P    −0.4865
F    −0.5296     0.5784
C     0.4737    −0.4249     −0.3135

What is the correlation between age and cholesterol independent of the intake of protein
and fat? Ans.
[0.3820 − (−0.2604)(−0.3145)]/√[(1 − 0.2604²)(1 − 0.3145²)] = 0.3274
EXAMPLE 13.8.4-Show that the sample estimate of the fraction of the variance of Y
that is attributable to its linear regression on X₁ … Xₖ is
1 − (1 − R²)(n − 1)/(n − k − 1)

13.9-Three or more independent variables. Computations. The
formulas already described for two X-variables extend naturally to three
or more X-variables. The computations inevitably become lengthier;
they are ideally suited to an electronic computer. We shall describe one of
the standard methods for a desk calculating machine, the Abbreviated
Doolittle method (10), except that for clarity more steps are given than
an experienced operator needs. For more extensive discussion of computing
methods, see (11).
With three independent variables, the normal equations are:
b₁Σx₁² + b₂Σx₁x₂ + b₃Σx₁x₃ = Σx₁y
b₁Σx₁x₂ + b₂Σx₂² + b₃Σx₂x₃ = Σx₂y
b₁Σx₁x₃ + b₂Σx₂x₃ + b₃Σx₃² = Σx₃y
If the c's are needed, as in most applications, the right sides become
1, 0, 0 for c₁₁, c₁₂, c₁₃; 0, 1, 0 for c₂₁, c₂₂, c₂₃; and 0, 0, 1 for c₃₁, c₃₂, c₃₃.
Since the same calculating routine can be used for b's and c's, only
the right sides being different, we denote the unknowns by z₁, z₂, z₃, and
let sᵢⱼ = Σxᵢxⱼ. The equations to be solved are:
(1) s₁₁z₁ + s₁₂z₂ + s₁₃z₃ =
(2) s₁₂z₁ + s₂₂z₂ + s₂₃z₃ =
(3) s₁₃z₁ + s₂₃z₂ + s₃₃z₃ =
The right side is not specified, since it depends on whether the b's or c's are
being computed.
The Doolittle method eliminates z₁, then z₁ and z₂, solving for z₃.
Intermediate steps provide convenient equations for finding z₂ from z₃,
and finally z₁ from z₂ and z₃. The computing routine can be carried out
404 Chapter 13: Mubiple Regression
without any thought as to why it works. The explanation is given in this
section.
The first step, line (4), is to recopy line (I).
(4)
Now divide through by,,,. It is quicker to find the reciprocal, 1/5", and
multiply through by l/s". This gives

(5)

The coefficients of", and z, have been bracketed, since they playa key
role. MUltiply (4) by '12/'''' obtaining
SI/ 5 12 S 13
(6) SI2':1' + _- =2 + .~~ ZJ =
S)1 Sli

In steps ($) and (6) and in all subsequent steps, the right side of the
equation i, always multiplied by the same factor as the left side. Now sub-
trad (6) from (2) t<>gel rid of z ,.

(7)

The next operations resemble those in.lines (4) to (6). Find the reciprocal
of(522 - S12' !s,,) and multiply (7) by this reciprocal.

(8) _-l-{~L-
-, .
SI2'~LJ/Sll)t
. 'I'
'~2:! - ,"1 12 5'1 I~,
=

The coefficient of z, in (8) receives a -curly bracket. like that of z, in (5).


Reverting to (4) and (5), mUltiply (4) by the bracketed s"!s,, in (5).

(9)

Similarly, multiply (7) by ihe bracketed coefficient of z, in (8)

(10)

Now iake (3) - (9) - (10). Note that the coetnelents of z, and z, both
disappear, leaving an equation (II) with only z, on the left. Solve this for
z,. (If there are four X-variables. continue through another cycle of these
operations, ending with an equation in which Z4 alone appears.)
Having z" find z, from (8), and finally z I from (5). With familiarity,
the operator will find that.lines (6). (9), and (10) need not be written
down when he is 'using a modern desk machine.
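On a modern computer the same elimination can be written in a few lines. The sketch below, assuming numpy, is our translation of steps (4) through (11) and the back-solution; it is not the worksheet layout itself, and the function name is ours.

import numpy as np

def doolittle_solve(s, rhs):
    """Forward elimination on the symmetric matrix s, then back-substitution."""
    a = np.column_stack([np.asarray(s, float), np.asarray(rhs, float)])
    n = len(a)
    for i in range(n):                # lines (5) and (8): divide by the pivot
        a[i] = a[i] / a[i, i]
        for j in range(i + 1, n):     # lines (6)-(7) and (9)-(11): subtract
            a[j] = a[j] - a[j, i] * a[i]
    z = np.zeros(n)
    for i in reversed(range(n)):      # z3 first, then z2, then z1
        z[i] = a[i, -1] - a[i, i + 1:n] @ z[i + 1:]
    return z

# with the right side set to the sums of products with y, the z's are the b's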
The next two sections give numerical examples of the calculation of
the b's and c's. The numbering of the lines and all computing instructions
in these examples are exactly as in this section.
13.10-Numerical example. Computing the b's. In table 13.10.1 an
additional independent variable X₃ is taken from the original data in the
plant-available phosphorus investigation. Like X₂, the variable X₃ measures
organic phosphorus, but of a different type. As before, Y is the
estimated plant-available phosphorus in corn grown at Soil Temperature
20°C. The data for Soil Temperature 35°C. are considered later.

TABLE 13.10.1
PHOSPHORUS FRACTIONS IN VARIOUS CALCAREOUS SOILS, AND ESTIMATED PLANT-AVAILABLE
PHOSPHORUS AT TWO SOIL TEMPERATURES

Soil Sample     Phosphorus Fractions        Estimated Plant-available
    No.            in Soil, ppm*            Phosphorus in Soil, ppm
                                          Soil Temp.       Soil Temp.
                                            20°C.            35°C.
              X₁      X₂      X₃              Y                Y′
    1         0.4     53     158             64               93
    2         0.4     23     163             60               73
    3         3.1     19      37             71               38
    4         0.6     34     157             61              109
    5         4.7     24      59             54               54
    6         1.7     65     123             77              107
    7         9.4     44      46             81               99
    8        10.1     31     117             93               94
    9        11.6     29     173             93               66
   10        12.6     58     112             51              126
   11        10.9     37     111             76               75
   12        23.1     46     114             96              108
   13        23.1     50     134             77               90
   14        21.6     44      73             93               72
   15        23.1     56     168             95               90
   16         1.9     36     143             54               82
   17        26.8     58     202            168              128
   18        29.9     51     124             99              120

* X₁ = inorganic phosphorus by Bray and Kurtz method
  X₂ = organic phosphorus soluble in K₂CO₃ and hydrolyzed by hypobromite
  X₃ = organic phosphorus soluble in K₂CO₃ and not hydrolyzed by hypobromite

In general, regression problems in which the b's but not the c's are
wanted are encountered only when the investigator is certain that all the
X's must be present in the regression equation and does not want to test
individual bᵢ or compute confidence limits for any βᵢ. The present example
is a borderline case. A primary objective was to determine whether
there exists an independent effect of soil organic phosphorus on the
phosphorus nutrition of plants. That is, the investigators wished to
know if X₂ and X₃ are related to Y after allowing for the relation between
Y and X₁ (soil inorganic phosphorus). As a first step, we can work out the
regression of Y on all three variables, obtaining the reduction in sum of
squares of Y. The reduction due to a regression on X₁ alone is (Σx₁y)²/
Σx₁². By subtraction, the additional reduction due to a regression on
X₂ and X₃ is obtained. It can be tested against the Deviations mean square
by an F-test. If F is near 1, this probably settles the issue and the c's are
not needed. But if F is close to its significance level, we will want to examine
b₂ and b₃ individually, since one type of organic phosphorus might
show an independent relation with Y but not the other.
TABLE 13.10.2
SOLUTION OF THREE NORMAL EQUATIONS. ABBREVIATED DOOLITTLE METHOD

Line   Reciprocal      Instructions             X₁          X₂          X₃           Y
(1)                                         1,752.96    1,085.61    1,200.00    3,231.48
(2)                                         1,085.61    3,155.78    3,364.00    2,216.44
(3)                                         1,200.00    3,364.00   35,572.00    7,593.00
(4)                                         1,752.96    1,085.61    1,200.00    3,231.48
(5)    0.000570464   (4) × 0.000570464          1        {0.61930}   {0.68456}    1.84344
(6)                  (4) × 0.61930                        672.32      743.16     2,001.26
(7)                  (2) − (6)                          2,483.46    2,620.84       215.18
(8)    0.000402664   (7) × 0.000402664                       1       {1.05532}     0.08665
(9)                  (4) × 0.68456                                    821.47     2,212.14
(10)                 (7) × 1.05532                                  2,765.82       227.08
(11)                 (3) − (9) − (10)                              31,984.71     5,153.78

÷ by 31,984.71:  b₃ = 0.16113
Line (8):  b₂ = 0.08665 − (1.05532)b₃ = −0.08339
Line (5):  b₁ = 1.84344 − (0.61930)b₂ − (0.68456)b₃ = 1.78478

Reduction in S.S. = Σbᵢ(Σxᵢy) = (1.78478)(3,231.48) + … + (0.16113)(7,593.00)
= 6,806

The normal equations and computation of the b's are in table 13.10.2.
Before starting, consider whether some coding of the normal equations is
advisable. If the sizes of the Σxᵢ² differ greatly, it is more difficult to
keep track of the decimal places. Division or multiplication of some X's
by a power of 10 will help. If Xᵢ is divided by 10^p, Σxᵢ² is divided by 10^2p,
and Σxᵢxⱼ or Σxᵢy by 10^p. Note that bᵢ is multiplied by 10^p and therefore
must be divided by 10^p in a final decoding. For practice, see example
13.10.6. In this example no coding seems necessary.
It is hoped that the calculations can be easily followed from the
column of Instructions. In the equations like (5) in which the coefficient
of the leading b is 1, we carried five decimal places, usually enough with
three or four X-variables. Don't forget that the b's are found in reverse
order: b₃, then b₂, then b₁. Since mistakes in calculation are hard to
avoid, always substitute the b's in the original equations as a check, apart
from rounding errors. At the end, the reduction in sum of squares of Y
is computed.
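As a check on table 13.10.2, the normal equations can be handed to a library routine; this sketch assumes numpy.

import numpy as np

S = np.array([[1752.96, 1085.61,  1200.00],
              [1085.61, 3155.78,  3364.00],
              [1200.00, 3364.00, 35572.00]])
rhs = np.array([3231.48, 2216.44, 7593.00])

b = np.linalg.solve(S, rhs)
print(b)        # 1.7848, -0.0834, 0.1611, as in the worksheet
print(b @ rhs)  # reduction in S.S. due to regression, about 6,806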
Table 13.10.3 gives the analysis of variance and the combined test of
X₂ and X₃. Since F = 1.06, it seems clear that neither form of organic
phosphorus is related to Y in these data.
TABLE 13.10.3
ANALYSIS OF VARIANCE AND TEST OF X₂, X₃

Source of Variation              Degrees of Freedom    Sum of Squares    Mean Square    F
Total                                   17                 12,390
Regression on X₁, X₂, X₃                 3                  6,806
Regression on X₁                         1                  5,957
Regression on X₂, X₃ after X₁            2                    849             424      1.06
Deviations                              14                  5,584             399

Some general features of multiple regression may now be observed:
1. As noted before, the regression coefficients change with each new
grouping of the X's. With X₂ alone, b_Y2 = 2,216.44/3,155.78 = 0.7023.
Adding X₁, b_Y2·1 = 0.0866. With three of the X's, b_Y2·13 = −0.0834. In
any one multiple regression, the coefficients are intercorrelated; either
increasing or decreasing the number of X's changes all the b's.
2. The value of Σŷ² never decreases with the addition of new X's;
ordinarily it increases. Take X₁ alone; Σŷ₁² = (3,231.48)²/1,752.96
= 5,957. X₁ and X₂ make Σŷ₁₂² = 5,976. For all three, Σŷ₁₂₃² = 6,806.
The increase may be small and nonsignificant, but it estimates the contribution
of the added X.
3. For checking calculations it is worth noting that Σŷ² cannot be
greater than Σy²; nearly always it is less. Only if the X's predict Y perfectly
can Σŷ² = Σy². In that limiting case, Σd² = 0.
4. High correlation between two of the X's can upset calculations.
If rᵢⱼ is above 0.95, even 6 or 8 significant digits may not be sufficient to
control rounding errors. Consider eliminating one of the two X's.
5. If Σŷ² is only a small fraction of Σy², that is, if R² is small,
remember that most of the variation in Y is unexplained. It may be
random variation or it may be due to other independent variables not
considered in the regression. If these other variables were found and
brought in, the relations among the X's already included might change
completely.
EXAMPLE 13.10.1-Compute the regression of plant-available phosphorus on the
3 fractions. Ans. Ŷ = 1.7848X₁ − 0.0834X₂ + 0.1611X₃ + 43.67.
EXAMPLE 13.10.2-Estimate the plant-available phosphorus in soil sample 17 and
compare it with the observed value. Ans. 119 ppm; Y − Ŷ = 49 ppm.
EXAMPLE 13.10.3-The experimenter might have information which would lead him
to retain X₃ along with X₁ in his predicting equation, dropping X₂. Calculate the new regression.
Ans. Ŷ = 1.737X₁ + 0.155X₃ + 41.5.
EXAMPLE 13.10.4-Calculate the sum of squares due to X₂ after X₁ and X₃. Ans. 16.
EXAMPLE 13.10.5-Calculate R² = Σŷ²/Σy² with X₁ alone, with X₁ and X₂, and with
X₁, X₂, X₃. Ans. R_Y·1² = 0.4808, R_Y·12² = 0.4823, R_Y·123² = 0.5493. Notice that R² never
decreases with the addition of a new X; ordinarily it increases. Associate this with the corresponding
theorem about Σŷ².
EXAMPLE 13.10.6-In a multiple regression the original normal equations were as
follows:

    X₁          X₂           X₃           Y
   1.28       17.20        85.20        2.84
  17.20    2,430.00     7,160.00      183.00
  85.20    7,160.00    67,200.00    8,800.00

It was decided to divide X₂ by 10 and X₃ by 100 before starting the solution. What happens to
Σx₁x₃, Σx₂y, Σx₂x₃, Σx₃², Σx₂², Σx₃y? Ans. They become 0.852, 18.30, 7.16, 6.72, 24.30,
88.00.
EXAMPLE 13.10.7-In studies of the fertilization of red clover by honey bees (28), it
was desired to learn the effects of various lengths of the insects' probosces. The measurement
is difficult, so a pilot experiment was performed to determine a more convenient one
that might be highly correlated with proboscis length. Three measurements were tried on
44 bees with the results indicated:

             Dry Weight,    Length of Wing,    Width of Wing,    Length of Proboscis,
 n = 44       X₁ (mg.)         X₂ (mm.)          X₃ (mm.)             Y (mm.)
Mean           13.10             9.61              3.28                6.59

Sums of Squares and Products
         X₁          X₂          X₃          Y
X₁     16.6840     1.9279      0.8240     1.5057
X₂                 0.9924      0.3351     0.5989
X₃                             0.2248     0.1848
Y                                         0.6831

Coding is scarcely necessary. Carrying 5 decimal places, calculate the regression coefficients.
Ans. 0.0292, 0.6151, −0.2022.
EXAMPLE 13.10.8-Test the significance of the overall regression and compute the
value of R². Ans. F = 16.2, f = 3 and 40, P very small. R² = 0.55, a disappointing value
when the objective is high accuracy in predicting Y.
EXAMPLE 13.10.9-Test the significance of the joint effect of X₁ and X₃ after fitting
X₂. Ans. F = 0.87. Can you conclude anything about the relative usefulness of the three
predictors?

13.11-Numerical example. Computing the inverse matrix. Table
13.11.1 gives the worksheet in which the c's are computed. The computing
instructions (column 2) are the same as in sections 13.9 and 13.10.
The following points are worth noting:
1. In many problems the c's are small numbers with numerous zeros
after the decimal place. For those who have difficulty in keeping track
of the zeros, the following pre-coding is recommended. Code each Xᵢ, if
necessary, so that every Σxᵢ² lies between 0.1 and 10. This can always
be done by dividing Xᵢ by a power of 10. If X₁ is divided by 10^p and X₂
by 10^q, then Σx₁² is divided by 10^2p, Σx₂² by 10^2q, and Σx₁x₂ by 10^(p+q).
In this example we had initially (table 13.10.2), Σx₁² = 1,752.96, Σx₂²
= 3,155.78, Σx₃² = 35,572.00. Division of every Xᵢ by 10² makes the
first two sums of squares lie between 0.1 and 1, while Σx₃² lies between 1
and 10 as shown in table 13.11.1. Every Σxᵢxⱼ is also divided by 10⁴.
The advantage is that the coded c's are usually not far from 1. Five
decimal places will be carried throughout the calculations.
2. The three sets of c's are found simultaneously. The computations
in column 6 give c₁₁, c₁₂, c₁₃; those in column 7 give c₁₂, c₂₂, c₂₃; and those
in column 8 give c₁₃, c₂₃, c₃₃. Because of the symmetry, quantities like
c₁₂ are found only once.
3. Column 9, the sum of columns 3 to 8, is a check sum. Since mistakes
creep in, check in each line indicated by a √ that column 9 is the
sum of columns 3 to 8. In some lines, e.g. (6), this check does not apply
because of abbreviations in the method.
4. The first three numbers found in line (12) are c₁₃, c₂₃, c₃₃ in coded
form. Then we return to line (8). With column 7 as the right side, line
(8) reads
c₂₂ + 1.05533c₂₃ = 4.02658
With column 6 as the right side, line (8) reads
c₁₂ + 1.05533c₁₃ = −2.49358
These give c₂₂ and c₁₂. Finally, c₁₁ comes from line (5).
5. To decode, cᵢⱼ is divided by the same factor by which Σxᵢxⱼ
was divided in coding.
6. By copying the Σxᵢy next to the cᵢⱼ, the bᵢ are easily computed.
Then the reduction in S.S. due to regression and the Deviations mean
square are obtained. These enable the standard error of each bᵢ to be
placed next to bᵢ. As anticipated, neither b₂ nor b₃ approaches the significance
level.
Occasionally there are several Y-variables whose sample regressions
on the same set of X-variables are to be worked out. In the phosphorus
-,. -"-"1 I ....-,. -,.
9
J;';~ J~
...g ~~~=
c;I) l': 00 ....

"'] i$- i§ =i
! c..;
Zl", i~-
(,)
"'0
...:..:'" "':00 c cc..; '"c HCCC
~

ooi .....'"
"'-
00

00- 00 0 00 00- "'''''


c

-.
~ ~:::!
... ,: - 0-0 00 0
00
~ ~~
.0--
'" '"
'" '"
'"
8:
......
.......
"c
~
1 ....
...$
I .0 ::
'" -00
-'"
~ ~
~
c
00 00
~~
~~
C,..j
1 1
;';;';8
;:t .... -
'" ....
"!"!~
000
1 1
:1
"'c
1
~~~
t-=
II
...
.:;

3
< ~i~
>
..,"' ... I ....
"':,..jC
~~~ ~~ .., ~~ ,..
1 ;::;
~ "'~ -"'I .
-: < -cc..;
.... '" -i
cc s
N '"
~~
... 00
~s:;-
- ... - -
N U

-- cc..; ...
ct c
f"'i . •
!S
0::.
)a!.,l(.)a!.
~:::!~
'" '"
III
ccc l
w
...I
III ~
«
5
..,:.; ~~~
!~
- ........
~i ....
!~
N
..... ....'"
~
, '"
"'~-
0000
66e
1 1 1
1 I ..,
i.,.,
l-
I
~
ccc -00
CS!- 0
"!
0- GO GO-
"'~'"
S~~ ..'"...
II
!
~
~~~
~~
---
~
...'"-
"':,..j",
1
Q II II
"', ...
11
' " CI' .,..
..~~!;~
........
- ...- lii~ l
.... :0;
... ...
ccc c-
§
... ...
...
000
>0
1 I
II
<
(,) --s
~~~ ;;
! GO 00 '"
~
..j in
s: on
!~I
.. -.
~

iill
;::;
::;-~
00
~~ !:!
.., c -.,.....,...
.. ,
"! ...... .c.-
x ...... _._.
~.,.. 0-'" GO 00.,..
0 ~-~ w
- B~
x I x
~ ............
N'~ ........
x x I
-.---.
_. ............
-.
-
~
8 ~
::l::J:::l
8

~i .
II
c
0
.::1
g
000
I I 11
-I·9 ;;;
~
.,..
00
>0

.~
VI

.-~,
~

~
III! '" 0

'"-;. ....;. ...;.....


... ... ... ...
- .. ..
I

". -..,...
"""w
I
~\,6'"
c
:::l
='N;:;-
...... ...., ...... .~
...... ._ ~ ............ ~~=
..... GO
...... ~
N
...... - ..-...-
... ... ...

I
experiment, corn was grown in every soil sample at 35°C. as well as at
20°C. The amounts of phosphorus in the plants, Y′, are shown in the last
column of table 13.10.1. Since the inverse matrix is the same for Y′ as
for Y, it is necessary to calculate only the new sums of products,
Σx₁y′ = 1,720.42 ;  Σx₂y′ = 4,337.56 ;  Σx₃y′ = 8,324.00
Combining these with the c's already calculated, the regression coefficients
for Y′ are
b₁′ = 0.1619 ;  b₂′ = 1.1957 ;  b₃′ = 0.1155
In the new data, Σŷ′² = 6,426, Σd² = 12,390 − 6,426 = 5,964, s²
= 426.0. The standard errors of the three regression coefficients are 0.556,
0.431, and 0.115. These lead to the three values of t: 0.29, 2.77, and 1.00.
At 35°C., b₂ is the only significant regression coefficient. The interpretation
made was that at 35°C. there was some mineralization of the organic
phosphorus which would make it available to the plants.
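The inverse-matrix route for the second Y-variable can be sketched as follows, assuming numpy; the sums of products are those quoted above, and the c's are obtained by library inversion rather than from the worksheet.

import numpy as np

S = np.array([[1752.96, 1085.61,  1200.00],
              [1085.61, 3155.78,  3364.00],
              [1200.00, 3364.00, 35572.00]])
C = np.linalg.inv(S)                       # the c_ij

rhs_yprime = np.array([1720.42, 4337.56, 8324.00])
b_prime = C @ rhs_yprime
print(b_prime)                             # 0.1619, 1.1957, 0.1155

s2 = 426.0                                 # deviations mean square, 14 d.f.
print(np.sqrt(s2 * np.diag(C)))            # s.e.'s close to 0.556, 0.431, 0.115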
The formulas for the standard errors of the estimates in multiple
regression studies are illustrated in examples 13.11.1 to 13.11.3.
EXAMPLE 13.11.1-For soil sample 17, the predicted Ŷ was 119 ppm, and the xᵢ
were: x₁ = 14.9, x₂ = 15.9, x₃ = 79. Find 95% limits for the population mean μ of Y. Ans.
The variance of Ŷ as an estimate of μ is
s_Ŷ² = s²(1/n + ΣΣcᵢⱼxᵢxⱼ)
The expression in the c's is conveniently computed as follows:

                                                        x
 0.0007249    −0.0002483    −0.0000010      14.9     0.006774
−0.0002483     0.0004375    −0.0000330      15.9     0.000650
−0.0000010    −0.0000330     0.0000313      79.0     0.001933
    14.9          15.9          79.0                 ΣΣcᵢⱼxᵢxⱼ = 0.2640

Border the cᵢⱼ matrix with a row and a column of the x's. Multiply each row of the cᵢⱼ in
turn by the xⱼ, giving the sums of products 0.006774, etc. Then multiply this column by the
xᵢ, giving the sum of products 0.2640. Since n = 18 and s² = 399, this gives
s_Ŷ² = (399)(0.0556 + 0.2640) = 127.5 ;  s_Ŷ = 11.3
With t₀.₀₅ = 2.145, the limits are 119 ± (2.145)(11.3); 95 to 143 ppm.

EXAMPLE 13.11.2.->-lf we are estimating Y for an. individual new observation, the
standard error of the estimate f is

Verify that for a soil with the X -values of soil_17, the S.f. would be ± 22.9 ppm.
EXAMPLE 13.11.3-The following data, kindly provided by Dr. Gene M. Smith,
come from a class of 66 students of nursing. Y represents the students' score in an examination
on theory, X₁ the rank in high school (a high value being good), X₂ the score on a verbal
aptitude test, and X₃ a measure of strength of character. The sums of squares and products
(65 d.f.) are as follows:

          Σxᵢxⱼ                      Σxᵢy        Σy²
 24,633    2,212     5,865          925.3      6,703
            7,760     2,695          745.9
                     28,432        1,537.8

(i) Show that the regression coefficients and their standard errors are as follows:
b₁ = 0.0206 ± 0.0192 ;  b₂ = 0.0752 ± 0.0340 ;  b₃ = 0.0427 ± 0.0180
Which X-variables are related to performance in theory?
(ii) Show that the F value for the three-variable regression is F = 5.50. What is the P
value?
(iii) Verify that R² = 0.210.

13.12-Deletion of an independent variable. After a regression is
computed, the utility of a variable may be questioned and its omission
proposed. Instead of carrying out the calculations anew, the regression
coefficients and the inverse matrix in the reduced regression can be
obtained more quickly by the following formulas (14). We suppose
that Xₖ is the variable to be omitted from a regression containing
X₁ … Xₖ. Before omission, the Deviations mean square s² has
(n − k − 1) d.f.
When Xₖ is omitted, the sum of squares of deviations from the fitted
regression, Σd², is increased by bₖ²/cₖₖ. The mean square of the deviations
then becomes
s′² = (Σd² + bₖ²/cₖₖ)/(n − k)
Further, the regression coefficients and the inverse multipliers become
bᵢ′ = bᵢ − cᵢₖbₖ/cₖₖ ;  cᵢⱼ′ = cᵢⱼ − cᵢₖcⱼₖ/cₖₖ
13.13-Selection of variates for prediction. A related but more difficult problem arises when a regression is being constructed for purposes of prediction and it is thought that several of the X-variables, perhaps most of them, may contribute little or nothing to the accuracy of the prediction. For instance, we may start with 11 X-variables, but a suitable choice of three of them might give the best predictions. The problem is to decide how many variables to retain, and which ones.
The most thorough approach is to work out the regression of Y on every subset of the k X-variables, that is, on each variable singly, on every pair of variables, on every triplet, and so on. The subset that gives the smallest Deviations mean square s² could be chosen, though if this subset involved 9 variables and another subset with 3 variables looked almost as good, the latter might be preferred for simplicity. The drawback of this method is the amount of computation. The number of regressions to be computed is 2^k - 1, or 2,047 for 11 X-variables. Even with an electronic computer, this approach is scarcely feasible if k is large.
Two alternative approaches are the step up method and the step down method. In the step down method, the regression of Y on all k X-variables is calculated. The contribution of Xi to the reduction in sum of squares of Y, after fitting the other variables, is bi²/cii. The variable Xk for which this quantity is smallest is selected, and some rule is followed in deciding whether to omit Xk. One such rule is to omit Xk if bk²/(s²ckk) < 1; others omit Xk if bk is not significant at some chosen level. If Xk is omitted, the regression of Y on the remaining (k - 1) variables is computed, and the same rule is applied. The process continues until no variable qualifies for omission.
In the step up method we start with the regressions of Y on X1, ..., Xk taken singly. The variable giving the greatest reduction in sum of squares of Y is selected. Call this X1. Then the bivariate regressions in which X1 appears are worked out. The variate which gives the greatest additional reduction in sum of squares after fitting X1 is selected. Call this X2. All trivariate regressions that include both X1 and X2 are computed, and the variate that makes the greatest additional contribution to them is selected, and so on until this additional contribution bi²/cii is too small to satisfy some rule for inclusion.
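A loop embodying the step down rule quoted above might look as follows. This is a sketch only: fit is a hypothetical routine, not given in the text, that returns the coefficients b, the inverse multipliers c, and the deviations mean square s2 for the variables currently retained.

  def step_down(data, names, fit):
      while len(names) > 1:
          b, c, s2 = fit(data, names)
          # contribution of each retained variable, after fitting the others
          contrib = [b[i] ** 2 / c[i][i] for i in range(len(names))]
          k = contrib.index(min(contrib))
          if contrib[k] / s2 >= 1.0:    # no variable qualifies for omission
              break
          del names[k]                  # omit X_k and refit on the remainder
      return names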
It is known that the step up and the step down methods will not neces-
sarily select the same X-variables, and that neither method guarantees to
find the same variables as the exhaustive method of investigating every
subset. Striking differences appear mainly when the X-variables are highly
correlated. The differences are not necessarily alarming, because when
intercorrelations are. high, different subsets can give almost equally good
predictions. Fuller accounts of these methods, with illustrations, appear
in (15, 16).
Two aspects of this problem require further research. For a given approach, e.g., the step down method, the best rule to use in deciding whether to omit an X-variate is not clear. Naturally, all simple rules reject Xi if at some stage bi²/cii is small enough. Suppose that βi = +1. Then Xi may be rejected because this sample gave an unusually low estimate of bi, say 0.3. Nevertheless, with βi = +1 a prediction formula that includes a term 0.3Xi may give better predictions in the population than one which has no term in Xi. For this reason some writers recommend retaining the term in Xi if the investigator is confident from his knowledge of the mechanism involved that βi must be positive and if bi is also positive.
Secondly, these methods tend to select variables that happen to do unusually well in the sample. When applied to new material, a prediction formula selected in this way will not predict as accurately as the value of s² suggests, especially if the sample is small and many X's have been rejected. More information is needed on the extent of this loss of accuracy.

13.14-The discriminant function. This is a multivariate technique for studying the extent to which different populations overlap one another or diverge from one another. It has three principal types of use.
1. Classification and diagnosis. The doctor's records of a person's symptoms and of his physical and laboratory measurements are taken to guide the doctor as to the particular disease from which the person is suffering. With two diseases that are often confused, it is helpful to learn what measurements are most effective in distinguishing between the conditions, how best to combine the measurements, and how successfully the distinction can be made.
2. In the study of the relations between populations. For example, to what extent do the aptitudes and attitudes of a competent architect differ from those of a competent engineer or a competent banker? Do non-smokers, cigarette smokers, pipe smokers, and cigar smokers differ markedly or only negligibly in their psychological traits?
3. As a multivariate generalization of the t-test. Given a number of related measurements made on each of two groups, the investigator may want a single test of the null hypothesis that the two populations have the same means with respect to all the measurements.
Historically, it is interesting that the discriminant function was developed independently by Fisher (17), whose primary interest was in classification, by Mahalanobis (18), in connection with a large study of the relations between Indian castes and tribes, and by Hotelling (19), who produced the multivariate t-test.
This introduction is confined to the case of two populations. Consider first a single variate X, normally distributed, with known means μ₁, μ₂ in the two populations and known standard deviation σ, assumed the same in both populations. The value of X is measured for a new specimen that belongs to one of the two populations. Our task is to classify the specimen into the correct population. If μ₁ < μ₂, a natural classification rule is to assign the specimen to population I if X < (μ₁ + μ₂)/2 and to population II if X > (μ₁ + μ₂)/2. The mean of the two populations serves as the boundary point.
How often will we make a mistake? If the specimen actually comes from population I, our verdict is wrong whenever X > (μ₁ + μ₂)/2; that is, whenever

  (X - μ₁)/σ > δ/2σ

where δ = (μ₂ - μ₁) is the distance between the two means.



Since (X - μ₁)/σ follows the standard normal distribution, the probability of misclassification is the area of the normal tail from δ/2σ to ∞. It is easily seen that the same probability of misclassification holds for a specimen from population II. Some values of this probability for given δ/σ are as follows:

  δ/σ              0.5   1.0   1.5   2.0   2.5   3.0   3.5   4.0
  Probability (%)  40.1  30.8  22.7  15.9  10.6   6.7   4.0   2.3

For a high degree of accuracy in classification, δ/σ must exceed 3. The same quantity δ/σ can be used as an index of the degree of overlap between the two populations: it is sometimes called the distance between the populations.
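The tabulated probabilities are simply normal tail areas, and may be checked as follows (a sketch in Python; the function name is ours):

  from math import erf, sqrt

  def misclass_prob(d_over_s):
      z = d_over_s / 2.0                          # boundary in standard measure
      return 0.5 * (1.0 - erf(z / sqrt(2.0)))     # upper tail area of N(0, 1)

  for r in (0.5, 1.0, 2.0, 3.0, 4.0):
      print(r, round(100 * misclass_prob(r), 1))
  # 40.1, 30.9, 15.9, 6.7, 2.3 per cent, matching the table to rounding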
In some classification problems it is known from experience that specimens come more frequently from one population than from the other. Further, misclassifying a specimen that actually comes from population I may have more serious consequences than misclassifying a specimen from population II. If these relative frequencies and relative costs of mistakes are known, the boundary point is shifted to a value at which the average cost of mistakes is minimized (20).
We come now to the multivariate case. The variates X1 ... Xk are assumed to follow a multivariate normal distribution. The variance σii of Xi and the covariance σij of Xi and Xj are assumed to be the same in both populations. Of course, σii is not assumed to be the same from one variate to another, nor σij from one pair of variates to another. The symbol δi denotes the difference between the means of the two populations for Xi.
The linear discriminant function ΣLiXi may be defined as the linear function of the Xi that gives the smallest probability of misclassification. The Li are coefficients that will be determined in order to satisfy this requirement. Since the Xi follow a multivariate normal, it is known from theory that ΣLiXi is normally distributed. The difference between its means in the two populations is δ = ΣLiδi and its variance is

  σ² = ΣΣ LiLjσij
From the earlier discussion for a single variate, it is clear that we must maximize the absolute value of δ/σ in order to minimize the probability of misclassification. To avoid the question of signs, the Li are chosen so as to maximize δ²/σ²; that is,

  δ²/σ² = (ΣLiδi)²/ΣΣLiLjσij      (13.14.1)

The quantity Δ² is called the generalized squared distance. By calculus the Li are found to be the solutions of the set of k equations

  σ11L1 + σ12L2 + ... + σ1kLk = δ1
  . . . . . . . . . . . . . . . .      (13.14.2)
  σk1L1 + σk2L2 + ... + σkkLk = δk
An interesting consequence of the solution is that Δ² = ΣLiδi when the optimum Li from (13.14.2) are inserted.
The estimation of the linear discriminant function from sample data is illustrated in section 13.15 below.
13.15-Numerical example of the discriminant function. This example, due to Cox and Martin (22), uses data from a study of the distribution of Azotobacter in Iowa soils. The question is: how well can the presence or absence of Azotobacter be predicted from three chemical measurements on the soil? The measurements are:
  X1 = soil pH
  X2 = amount of readily available phosphate
  X3 = total nitrogen content
The data consist of 100 soils containing no Azotobacter and 186 soils containing Azotobacter. For ease in calculation, the data were coded by dividing X1, X2, X3 by 10, 1,000, and 100, respectively. The original data will not be given here.
It is always advisable to look first at the discriminating powers of the individual variates. The Within Sample mean squares si² (284 df.) and the di (differences between the sample means) were computed for each variate. The ratios di/si were 2.37 for X1, 1.36 for X2, and 0.81 for X3. Evidently X1 is the best single variate, giving a probability of misclassification of 11.8%, while X3 is poor by itself. A result worth noting is that if the variates were independent, the value of d/s given by the discriminant function would be simply √{Σ(di/si)²}, or in this example √8.12 = 2.85, with an error rate of about 7.7%. In practical applications, correlations between the X's usually have the effect of making the discriminant function less accurate (21). If a computed discriminant appears to give a d/s much greater than the value obtained by assuming independence, the computations should be checked.
To compute the discriminant, find the pooled Within Sample sums of squares Sii and sums of products Sij. If the sample sizes are n₁, n₂, the degrees of freedom are (n₁ + n₂ - 2). In line with equations (13.14.2) the normal equations to be solved are as follows:

  S11L1 + S12L2 + ... + S1kLk = d1
  . . . . . . . . . . . . . . . .      (13.15.1)
  Sk1L1 + Sk2L2 + ... + SkkLk = dk

(If we were to copy (13.14.2) as closely as possible, the mean squares and products sij would be used in (13.15.1) instead of the Sij, but the Sij give the same results in the end and are easier to use.)
Equations (13.15.1) obviously resemble the normal equations for the regression coefficients in multiple regression. The Li take the place of the bi, and the di of the Σxy. The resemblance can be increased by constructing a dummy variable Y, which has the value +1/n₂ for every member of sample 2 and -1/n₁ for every member of sample 1. It follows that
Σxiy = ΣXiY = di. Thus, formally, the discriminant function can be regarded as the multiple regression of this dummy Y on X1 ... Xk. If we knew Y for any specimen we would know the population to which the specimen belongs. Consequently, it is reasonable that the discriminant function should try to predict Y as accurately as possible.
For the two sets of soils the normal equations are:

  1.111L1 + 0.229L2 + 0.198L3 = 0.1408
  0.229L1 + 1.043L2 + 0.051L3 = 0.0821
  0.198L1 + 0.051L2 + 2.942L3 = 0.0826

The Li, computed by the method of section 13.10, are:

  L1 = 0.11229,  L2 = 0.05310,  L3 = 0.01960

The value of d/s for the discriminant is given by the formula:

  √{(n₁ + n₂ - 2)ΣLidi} = √{(284)(0.02179)} = √6.188 = 2.49

This gives an estimated probability of misclassification of 10.6%. In these data the combined discriminant is not much better than pH alone.
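Any linear-equation solver reproduces these figures. A sketch (Python with numpy; this is not the authors' computation, which used the Doolittle method of section 13.10):

  import numpy as np

  S = np.array([[1.111, 0.229, 0.198],
                [0.229, 1.043, 0.051],
                [0.198, 0.051, 2.942]])
  d = np.array([0.1408, 0.0821, 0.0826])

  L = np.linalg.solve(S, d)                    # 0.11229, 0.05310, 0.01960
  ratio = ((100 + 186 - 2) * (L @ d)) ** 0.5   # d/s for the discriminant
  print(np.round(L, 5), round(ratio, 2))       # ... 2.49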

TABLE 13.15.1
ANALYSIS OF VARIANCE OF THE DISCRIMINANT FUNCTION. HOTELLING'S T²-TEST

                        Degrees of
  Source of Variation   Freedom      Sum of Squares                           Mean Square
  Between samples           3        n₁n₂(ΣLidi)²/(n₁ + n₂) = 0.03088         0.01029
  Within samples          282        ΣLidi = 0.02179                          0.0000773

  F = 0.01029/0.0000773 = 133.1

The multivariate t-test, Hotelling's T² test, is made in table 13.15.1 from an analysis of variance of the variate ΣLiXi into "Between Samples" and "Within Samples." On multiplying equations (13.15.1) by L1, L2, ... Lk and adding, we have the result:

  Within Samples sum of squares = ΣΣLiLjSij = ΣLidi

The "Between Samples" sum of squares = n₁n₂(ΣLidi)²/(n₁ + n₂). Note the df.: k for Between Samples and (n₁ + n₂ - k - 1) for Within Samples. The allocation of k df. to Between Samples allows for the fact that the L's were chosen to maximize the ratio of the Between Samples S.S. to the Within Samples S.S. The value of F, 133.1, with 3 and 282 df., is very large, as it must be if the discriminant is to be effective in classification.
The assumption that the covariance matrix is the same in both populations is rather sweeping. If there appear to be moderate differences between the matrices in the two populations and if n₁ and n₂ are unequal, it is better when computing the coefficients Li to replace the sums of squares and products Sij by the unweighted averages sij of the variances or covariances in the two samples. If this is done, note that the value of d/s for the discriminant becomes √(ΣLidi), while in table 13.15.1, ΣLidi becomes the Within Samples mean square. The expression for the Between Samples sum of squares remains as in table 13.15.1. When the covariance matrices differ substantially, the best discriminant is a quadratic expression in the X's. Smith (23) presents an example of this case.
For classification studies involving more than two populations, see Rao (20). Examples are given in (24, 25) for qualitative data, in which the assumption of a multivariate normal population does not apply.
REFERENCES
 1. M. T. EID, C. A. BLACK, O. KEMPTHORNE, and J. A. ZOELLNER. Iowa Agric. Exp. Sta. Res. Bul. 406 (1954).
 2. R. A. FISHER. Philos. Trans., B 213:89 (1924).
 3. N. V. SMIRNOV. Tables for the Distribution and Density Functions of t-distribution. Pergamon, New York (1961).
 4. G. E. P. BOX. Technometrics, 8:625 (1966).
 5. O. KEMPTHORNE. Design and Analysis of Experiments. Wiley, New York (1952).
 6. R. A. FISHER. The Design of Experiments. Oliver and Boyd, Edinburgh (1936).
 7. H. HOTELLING. Ann. Math. Statist., 11:271 (1940).
 8. E. J. WILLIAMS. Regression Analysis. Wiley, New York (1959).
 9. A. M. MOOD and F. A. GRAYBILL. Introduction to the Theory of Statistics. 2nd ed., McGraw-Hill, New York (1963).
10. M. H. DOOLITTLE. U.S. Coast and Geodetic Survey Report: 115 (1878).
11. P. S. DWYER. Linear Computations. Wiley, New York (1951).
12. P. P. SWANSON, R. LEVERTON, M. R. GRAM, H. ROBERTS, and I. PESEK. J. Gerontology, 10:41 (1955).
13. A. M. BRUNSON and J. G. WILLIER. J. Amer. Soc. Agron., 21:912 (1929).
14. W. G. COCHRAN. J. R. Statist. Soc. Supp., 5:171 (1938).
15. N. DRAPER and H. SMITH. Applied Regression Analysis. Wiley, New York, Chap. 6 (1966).
16. H. C. HAMAKER. Statist. Neerlandica, 16:31 (1962).
17. R. A. FISHER. Ann. Eugenics, 7:179 (1936).
18. P. C. MAHALANOBIS. J. Asiatic Soc. Bengal, 26:541 (1930).
19. H. HOTELLING. Ann. Math. Statist., 2:360 (1931).
20. C. R. RAO. Advanced Statistical Methods in Biometric Research. Wiley, New York, Chap. 8 (1952).
21. W. G. COCHRAN. Technometrics, 6:119 (1964).
22. G. M. COX and W. P. MARTIN. Iowa State College Jour. Sci., 11:323 (1937).
23. C. A. B. SMITH. Biomathematics. Charles Griffin, London (1954).
24. W. G. COCHRAN and C. E. HOPKINS. Biometrics, 17:10 (1961).
25. A. E. MAXWELL. Analysing Qualitative Data. Methuen, London, Chap. 10 (1961).
26. S. WRIGHT. Biometrics, 16:189 (1960).
27. O. D. DUNCAN. Amer. J. Sociol., 72:1 (1966).
28. R. A. GROUT. Iowa Agric. Exp. Sta. Res. Bul., 218 (1937).
* CHAPTER FOURTEEN

Analysis of covariance

14.1-Introduction. The analysis of covariance is a technique that combines the features of analysis of variance and regression. In a one-way classification, the typical analysis of variance model for the value Yij of the jth observation in the ith class is

  Yij = μi + eij

where the μi represent the population means of the classes and the eij are the residuals. But suppose that on each unit we have also measured another variable Xij that is linearly related to Yij. It is natural to set up the model

  Yij = μi + β(Xij - X̄..) + εij

where β is the regression coefficient of Y on X. This is a typical model for the analysis of covariance. If X and Y are closely related, we may expect this model to fit the Yij values better than the original analysis of variance model. That is, the residuals εij should be in general smaller than the eij.
The model extends easily to more complex situations. With a two-way classification, as in a randomized blocks experiment, the model is

  Yij = μ + αi + ρj + β(Xij - X̄..) + εij

With a one-way classification and two auxiliary variables X1ij and X2ij, both linearly related to Yij, we have

  Yij = μi + β₁(X1ij - X̄1..) + β₂(X2ij - X̄2..) + εij
The analysis of covariance has numerous uses.
1. To increase precision in randomized experiments. In such applications the covariate X is a measurement, taken on each experimental unit before the treatments are applied, that predicts to some degree the final response Y on the unit. In the earliest application suggested by Fisher (1), the Yij were the yields of tea bushes in an experiment. An important source of error is that by the luck of the draw, some treatments will have been allotted to a more productive set of bushes than others. The Xij were the previous yields of the bushes in a period before treatments were applied. Since the relative yields of tea bushes show a good deal of stability from year to year, the Xij serve as predictors of the inherent yielding abilities of the bushes. By adjusting the treatment mean yields so as to remove these differences in yielding ability, we obtain a lower experimental error and more precise comparisons among the treatments. This is probably the commonest use of covariance.
2. To adjust for sources of bias in observational studies. An investigator is studying the relation between obesity in workers and the physical activity required in their occupations. He has measures of obesity Yij in samples of workers from each of a number of occupations. He has also recorded the age Xij of each worker, and notices that there are differences between the mean ages of the workers in different occupations. If obesity is linearly related to age, differences found in obesity among different occupations may be due in part to these age differences. Consequently he introduces the term β(Xij - X̄..) into his model in order to adjust for a possible source of bias in his comparison among occupations.
3. To throw light on the nature of treatment effects in randomized experiments. In an experiment on the effects of soil fumigants on nematodes, which attack some farm crops, significant differences between fumigants were found both in the numbers of nematode cysts Xij and in the yields Yij of the crop. This raises the question: Can the differences in yields be ascribed to the differences in numbers of nematodes? One way of examining this question is to see whether treatment differences in yields remain, or whether they shrink to insignificance, after adjusting for the regression of yields on nematode numbers.
4. To study regressions in multiple classifications. For example, an
investigator is studying the relation between expenditure per student in
schools (Y) and per capita income (X) in large cities. If he has data for
a large number of cities for each of four years, he may want to examine
whether the relation is the same in different sections of the country, or
whether it remains the same from year to year. Sometimes the question
is whether the relation is straight or curved.
14.2-Covariance in a completely randomized experiment. We begin with a simple example of the use of covariance in increasing precision in randomized experiments. With a completely randomized design, the data form a one-way classification, the treatments being the classes. In the model

  Yij = μi + β(Xij - X̄..) + εij,

the μi represent the effects of the treatments. The observed mean for the ith treatment is

  Ȳi. = μi + β(X̄i. - X̄..) + ε̄i.

Thus Ȳi. is an unbiased estimate of

  μi + β(X̄i. - X̄..)

It follows that as an estimate of μi we use

  μ̂i = Ȳi. - b(X̄i. - X̄..),
the second term on the right being the adjustment introduced by the covariance analysis. The adjustment accords with common sense. For instance, suppose we were told that in the previous year the tea bushes receiving Treatment 1 yielded 20 pounds more than the average over the experiment. If the regression coefficient of Y on X was 0.4, meaning that each pound of increase in X corresponds to 0.4 pound of increase in Y, we would decrease the observed Y mean by (0.4)(20) = 8 pounds in order to make Treatment 1 more comparable to the other treatments. In this illustration the figure 0.4 is b and the figure 20 is (X̄i. - X̄..).
There remains the problem of estimating β from the results of the experiment. In a single sample you may recall that the regression coefficient is estimated by b = Σxy/Σx², and that the reduction in sum of squares of Y due to the regression is (Σxy)²/Σx². These results continue to hold in multiple classifications (completely randomized, randomized blocks, and Latin square designs) except that β is estimated from the Error line in the analysis of variance. We may write b = Exy/Exx. The Error sum of squares of X in the analysis of variance, Exx, is familiar, but the quantity Exy is new. It is the Error sum of products of X and Y. A numerical example will clarify it.
The data in table 14.2.1 were selected from a larger experiment on the use of drugs in the treatment of leprosy at the Eversley Childs Sanitarium in the Philippines. On each patient six sites on the body at which leprosy bacilli tend to congregate were selected. The variate X, based on laboratory tests, is a score representing the abundance of leprosy bacilli at these sites before the experiment began. The variate Y is a similar score after several months of treatment. Drugs A and D are antibiotics while drug F is an inert drug included as a control. Ten patients were selected for each treatment for this example.
The first step is to compute the analysis of sums of squares and products, shown under the table. In the columns headed Σx² and Σy², we analyze X and Y in the usual way into "Between drugs" and "Within drugs." For the Σxy column, make the corresponding analysis of the products of X and Y, as follows:

  Total: (11)(6) + (8)(0) + ... + (12)(20) - (322)(237)/30 = 731.2

  Between drugs: [(93)(53) + (100)(61) + (129)(123)]/10 - (322)(237)/30 = 145.8

TABLE 14.2.1
SCORES FOR LEPROSY BACILLI BEFORE (X) AND AFTER (Y) TREATMENT

                          Drugs
              A            D            F
            X    Y       X    Y       X    Y
           11    6       6    0      16   13
            8    0       6    2      13   10
            5    2       7    3      11   18
           14    8       8    1       9    5
           19   11      18   18      21   23
            6    4       8    4      16   12
           10   13      19   14      12    5
            6    1       8    9      12   16
           11    8       5    1       7    1
            3    0      15    9      12   20
                                                  Overall
                                                  X       Y
  Totals   93   53     100   61     129  123     322     237
  Means   9.3  5.3    10.0  6.1    12.9 12.3    10.73    7.90

Analysis of Sums of Squares and Products

  Source                        df.     Σx²      Σxy        Σy²
  Total                         29     665.9    731.2     1,288.7
  Between drugs                  2      73.0    145.8       293.6
  Within drugs (Error)          27     592.9    585.4       995.1
  Reduction due to regression    1     (585.4)²/592.9 = 578.0
  Deviations from regression    26     417.1

  Deviations mean square = 417.1/26 = 16.04
The Within drugs sum of products, 585.4, is found by subtraction. Note that any of these sums of products may be either positive or negative. The Within drugs (Error) sum of products 585.4 is the quantity we call Exy, while the Error sum of squares of X, 592.9, is Exx.
The reduction in the Error sum of squares of Y due to the regression is Exy²/Exx with 1 df. The Deviations mean square, 16.04 with 26 df., provides the estimate of error. The original Error mean square of Y is 995.1/27 = 36.86. The regression has produced a substantial reduction in the Error mean square.
The next step is to compute b and the adjusted means. We have b = Exy/Exx = 585.4/592.9 = 0.988. The adjusted means are as follows:

  A: Ȳ₁. - b(X̄₁. - X̄..) =  5.3 - (0.988)( 9.3 - 10.73) =  6.71
  D: Ȳ₂. - b(X̄₂. - X̄..) =  6.1 - (0.988)(10.0 - 10.73) =  6.82
  F: Ȳ₃. - b(X̄₃. - X̄..) = 12.3 - (0.988)(12.9 - 10.73) = 10.16

The adjustments have improved the status of F, which happened to receive initially a set of patients with somewhat high scores.
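The adjusted means can be verified in a few lines from the quantities in table 14.2.1 (a sketch in Python; the data values are those quoted above):

  Exy, Exx = 585.4, 592.9
  b = Exy / Exx                               # 0.988
  xbar = {'A': 9.3, 'D': 10.0, 'F': 12.9}     # treatment means of X
  ybar = {'A': 5.3, 'D': 6.1, 'F': 12.3}      # treatment means of Y
  grand_x = 10.73
  for drug in 'ADF':
      print(drug, round(ybar[drug] - b * (xbar[drug] - grand_x), 2))
  # A 6.71, D 6.82, F 10.16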
For tests of significance or confidence limits relating to the adjusted means, the error variance is derived from the mean square sy·x² = 16.04 with 26 df. Algebraically, the difference between the adjusted means of the ith and the jth treatments is

  D = Ȳi. - Ȳj. - b(X̄i. - X̄j.)

The formula for the estimated variance of D is

  sD² = sy·x²{2/n + (X̄i. - X̄j.)²/Exx}      (14.2.1)
where n is the sample size per treatment. The second term on the right is an allowance for the sampling error of b.
This formula has the disadvantage that sD is different for every pair of treatments that are being compared. In practice, these differences are small if (i) there are at least 20 df. in the Error line of the analysis of variance, and (ii) the Treatments mean square for X is non-significant, as it should be since the X's were measured before treatments were assigned. In such cases an average value of sD² may be used. By an algebraic identity (2) the average value of sD², taken over every pair of treatments, is

  sD² = (2/n)sy·x²[1 + txx/Exx]      (14.2.2)

where txx is the Treatments mean square for X. More generally, we may regard

  s′² = sy·x²[1 + txx/Exx]      (14.2.3)

as the effective Error mean square per observation when computing the error variance for any comparison among the treatment means.
In this experiment txx = 73.0/2 = 36.5 (from table 14.2.1), Exx = 592.9, giving txx/Exx = 0.0616. Hence,

  s′² = (16.04)(1.0616) = 17.03 ;  s′ = 4.127
With 10 replicates this gives sD = 4.127√(0.2) = 1.846. The adjusted means for A and D, 6.71 and 6.82, show no sign of a real difference. The largest contrast, F - A, is 3.45, giving a t-value of 3.45/1.846 = 1.87, with 26 df., which is not significant at the 5% level.
After completing a covariance analysis, the experimenter is sure to ask: Is it worthwhile? The efficiency of the adjusted means relative to the unadjusted means is estimated by the ratio of the corresponding effective Error mean squares:

  sy²/s′² = sy²/{sy·x²[1 + txx/Exx]} = 36.86/17.03 = 2.16
Covariance with 10 replicates per treatment gives nearly as precise
estimates as the unadjusted means with 21 replicates.
In experiments like this, in which X measures the same quantity as Y (score for leprosy bacilli), an alternative to covariance is to use (Y - X), the change in the score, as the measure of treatment effect. The Error mean square for (Y - X) is obtained from table 14.2.1 as

  (Eyy - 2Exy + Exx)/27 = [995.1 - 2(585.4) + 592.9]/27 = 15.45
This compares with 17.03 for covariance. In this experiment, use of (Y - X) is slightly more efficient than covariance as well as quicker computationally. This was the recommended variable for analysis in the larger experiment from which these data were selected. In many experiments, (Y - X) is inferior to covariance, and may also be inferior to Y if the correlation between X and Y is low.

14.3-The F-test of the adjusted means. Section 14.2 has shown how to make comparisons among the adjusted means. It is also possible to perform an F-test of the null hypothesis that all the μi are equal, that is, that there are no differences among the adjusted means. Since the way in which this test is computed often looks mystifying, we first explain its rationale.
First we indicate why b is always estimated from the Error line of the analysis of variance. Suppose that the value of b has not yet been chosen. As we have seen, the analysis of covariance is essentially an analysis of variance of the quantity (Y - bX). The Error sum of squares of this quantity may be written

  Eyy - 2bExy + b²Exx

Completing the square on b, the Error S.S. is

  Exx(b - Exy/Exx)² + Eyy - Exy²/Exx      (14.3.1)

By the method of least squares, the value of b is selected so as to minimize the Error S.S. From (14.3.1), it is obvious that this happens when b = Exy/Exx, the minimum Error S.S. being Eyy - Exy²/Exx.
Now to the F-test. If the null hypothesis is true, a covariance model in which μi = μ should fit the data as well as the original covariance model. Consequently, we fit this H₀ model to find how large an Error S.S. it gives. In the analysis of sums of squares and products for the H₀ model, the "Error" line is the sum of the Error and Treatments lines in the original model, because the H₀ model contains no treatment effects. Hence, the Deviations S.S. from the H₀ model is

  Eyy + Tyy - (Exy + Txy)²/(Exx + Txx)      (14.3.2)
If H₀ holds, the difference between the Deviations S.S. for the H₀ model and the original model, when divided by the difference in degrees of freedom, may be shown to be an estimate of σy·x² in the original model. If H₀ is false, this mean square difference becomes large because the H₀ model fits poorly. This mean square difference forms the numerator of the F-test. The denominator is the Deviations mean square from the original model.
In table 14.3.1 the test is made for the leprosy example. The first step is to form a Treatments + Error line. (In a completely randomized design this line is, of course, the same as the Total line, but this is not so in randomized blocks or a Latin square.) Following formula (14.3.2) we subtract (731.2)²/665.9 = 802.9 from 1,288.7 to give the Deviations S.S., 485.8, for the H₀ model. From this we subtract 417.1, the Deviations S.S. for the original model, and divide by the difference in df., 2. The F-ratio, 34.35/16.04 = 2.14, with 2 and 26 df., lies between the 25% and the 10% levels.
TABLE 14.3.1
THE COVARIANCE F-TEST IN A ONE-WAY CLASSIFICATION. LEPROSY DATA

                                                     Deviations From Regression
               Degrees of
               Freedom     Σx²     Σxy      Σy²       df.     S.S.     M.S.
  Treatments       2       73.0    145.8    293.6
  Error           27      592.9    585.4    995.1     26     417.1    16.04
  T + E           29      665.9    731.2  1,288.7     28     485.8

  For testing adjusted means:                          2      68.7    34.35
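The whole test reduces to a few lines once the sums of squares and products are available. A sketch in Python (the function name is ours):

  def deviations_ss(sxx, sxy, syy):
      return syy - sxy ** 2 / sxx        # S.S. of deviations from regression

  E = (592.9, 585.4, 995.1)              # Error line: Sxx, Sxy, Syy
  T = (73.0, 145.8, 293.6)               # Treatments line
  TE = tuple(e + t for e, t in zip(E, T))

  ss_e, ss_te = deviations_ss(*E), deviations_ss(*TE)   # 417.1 and 485.8
  F = ((ss_te - ss_e) / 2) / (ss_e / 26)                # 34.35/16.04 = 2.14
  print(round(ss_e, 1), round(ss_te, 1), round(F, 2))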
14.4-Covariance in a two-way classification. The computations involve nothing new. The regression coefficient is estimated from the Error (Treatments × Blocks) line in the analysis of sums of squares and products, and the F-test of the adjusted treatment means is made by recomputing the regression from the Treatments plus Error lines, following the procedure in section 14.3. To put it more generally for applications in which the words "Treatments" and "Blocks" are inappropriate, the regression coefficient is estimated from the Rows × Columns line, and either the adjusted row means or the adjusted column means may be tested. Two examples from experiments will be presented to illustrate points that arise in applications.
The data in table 14.4.1 are from an experiment on the effects of two drugs on mental activity (13). The mental activity score was the sum of the scores on seven items in a questionnaire given to each of 24 volunteer subjects. The treatments were morphine, heroin, and placebo (an inert substance), given in subcutaneous injections. On different occasions, each
TABLE 14.4.1
MENTAL ACTIVITY SCORES BEFORE (X) AND TWO HOURS AFTER (Y) A DRUG

             Morphine      Heroin       Placebo        Total
  Subject     X    Y       X    Y       X    Y        X     Y
     1        7    4       0    2       0    7        7    13
     2        2    2       4    0       2    1        8     3
     3       14   14      14   13      14   10       42    37
     4       14    0      10    0       5   10       29    10
     5        1    2       4    0       5    6       10     8
     6        2    0       5    0       4    2       11     2
     7        5    6       6    1       8    7       19    14
     8        6    0       6    2       6    5       18     7
     9        5    1       4    0       6    6       15     7
    10        6    6      10    0       8    6       24    12
    11        7    5       7    2       6    3       20    10
    12        1    3       4    1       3    8        8    12
    13        0    0       1    0       1    0        2     0
    14        8   10       9    1      10   11       27    22
    15        8    0       4   13      10   10       22    23
    16        0    0       0    0       0    0        0     0
    17       11    1      11    0      10    8       32     9
    18        6    2       6    4       6    6       18    12
    19        7    9       0    0       8    7       15    16
    20        5    0       6    1       5    1       16     2
    21        4    2      11    5      10    8       25    15
    22        7    7       7    7       6    5       20    19
    23        0    2       0    0       0    1        0     3
    24       12   12      12    0      11    5       35    17

  Total     138   88     141   52     144  133      423   273

                      Degrees of
                      Freedom      Σx²      Σxy     Σy²
  Between subjects       23         910     519      558
  Between drugs           2           1       5      137
  Error                  46         199     -16      422

  Total                  71       1,110     508    1,117

subject received each drug in turn. The mental activity was measured before taking the drug (X) and at 1/2, 2, 3, and 4 hours after. The response data (Y) in table 14.4.1 are those at two hours after. As a common precaution in these experiments, eight subjects took morphine first, eight took heroin first, and eight took the placebo first, and similarly on the second and third occasions. In these data there was no apparent effect of the order in which drugs were given, and the order is ignored in the analysis of variance presented here.
In planning this experiment two sources of variation were recognized. First, there are consistent differences in level of mental activity between subjects. This source was removed from the experimental error by the device of having each subject test all three drugs, so that comparisons between drugs are made within subjects. Secondly, a subject's level changes from time to time: he feels sluggish on some occasions and unusually alert on others. Insofar as these differences are measured by the pretest mental activity score on each occasion, the covariance analysis should remove this source of error.
As it turned out, the covariance was ineffective in this experiment. The error regression coefficient is actually slightly negative, b = -16/199, and showed no sign of statistical significance. Consequently, comparison of the drugs is best made from the 2-hour readings alone in this case. Incidentally, covariance would have been quite effective in removing differences in mental activity between subjects, since the Between subjects b, 519/910, is positive and strongly significant.
Unlike the previous leprosy example, the use of the change in score, 2 hours - pretest, would have been unwise as a measure of the effects of the drugs. From table 14.4.1 the Error sum of squares for (Y - X) is

  422 + 199 - 2(-16) = 653

This is substantially larger than the sum of squares, 422, for Y alone.
The second example, table 14.4.2, illustrates another issue (3). The experiment compared the yields Y of six varieties of corn. There was some variation from plot to plot in number of plants (stand). If this variation is caused by differences in fertility in different plots and if higher plant numbers result in higher yields per plot, increased precision will be obtained by adjusting for the covariance of yield on plant number. The plant numbers in this event serve as an index of the fertility levels of the plots. But if some varieties characteristically have higher plant numbers than others through a greater ability to germinate or to survive when the plants are young, the adjustment for stand distorts the yields because it is trying to compare the varieties at some average plant number level that the varieties do not attain in practice.
With this in mind, look first at the F-ratio for Varieties in X (stand). From table 14.4.2 the mean squares are: Varieties 9.17, Error 7.59, giving F = 1.21. The low value of F gives assurance that the variations in stand are mostly random and that adjustment for stand will not introduce bias.
In the analysis, note the use of the Variety plus Error line in computing the F-test of the adjusted means. The value of F is 645.38/97.22 = 6.64, highly significant with 5 and 14 df. The adjustment produced a striking decrease in the Error mean square, from 583.5 to 97.2, and an increase in F from 3.25 to 6.64.
The adjusted means will be found to be:

  A, 191.8;  B, 191.0;  C, 193.1;  D, 219.3;  E, 189.6;  F, 213.6

The standard error of the difference between two adjusted means is 7.25, with 14 df. By either the LSD method or the sequential Newman-Keuls

TABLE 14.4.2
STAND (X) AND YIELD (Y) (POUNDS FIELD WEIGHT OF EAR CORN) OF SIX VARIETIES OF
CORN. COVARIANCE IN RANDOMIZED BLOCKS

                               Blocks
               1          2          3          4         Total
  Varieties   X   Y      X   Y      X   Y      X   Y      X     Y
     A       28  202    22  165    27  191    19  134     96    692
     B       23  145    26  201    28  203    24  180    101    729
     C       27  188    24  185    27  185    28  220    106    778
     D       24  201    28  231    30  238    30  261    112    931
     E       30  202    26  178    26  198    29  226    111    804
     F       30  228    25  221    27  207    24  204    106    860

  Total     162 1,166  151 1,181  165 1,222  154 1,225   632  4,794

                                                        Deviations From Regression
  Source of                                                    Sum of    Mean
  Variation           df.   Σx²      Σxy       Σy²       df.   Squares   Square
  Total               23   181.33  1,485.00  18,678.50
  Blocks               3    21.67      8.50     436.17
  Varieties            5    45.83    559.25   9,490.00
  Error               15   113.83    917.25   8,752.33   14   1,361.07    97.22
  Variety plus error  20   159.66  1,476.50  18,242.33   19   4,587.99

  For testing adjusted means                              5   3,226.92   645.38**

method, the two highest yielding varieties, D and F, are not significantly
different, but they are significantly superior to all the others, which do not
differ significantly among themselves.
In some cases, plant numbers might be influenced partly by fertility
variations and partly by basic differences between varieties. The possi-
bility of a partial adjustment has been considered by H. F. Smith (4).
EXAMPLE 14.4.1-Verify the adjusted means in the corn experiment and carry through the tests of all the differences.
EXAMPLE 14.4.2-Estimate the efficiency of the covariance adjustments. Ans. 5.55.
EXAMPLE 14.4.3-As an alternative to covariance, could we analyze the yield per plant, Y/X, as a means of removing differences in plant numbers? Ans. This is satisfactory if the relation between Y and X is a straight line going through the origin. But b is often substantially less than the mean yield per plant, because when plant numbers are high, competition between plants reduces the yield per plant. If this happens, the use of Y/X overcorrects for stand. In the corn example b = 8.1 and the overall yield per plant is 4,794/632 = 7.6, in good agreement; yield per plant would give results similar to covariance. Of course, yield per plant should be analyzed if there is direct interest in this quantity.
EXAMPLE 14.4.4-The following data are the yields (Y) in bushels per acre and the per cents of stem canker infection (X) in a randomized blocks experiment comparing four lines of soybeans (5).

                                    Lines
              A            B            C            D           Totals
  Blocks    X     Y      X     Y      X     Y      X     Y      X      Y
     1     19.3  21.3   10.1  28.3    4.3  26.7   14.0  25.1    47.7  101.4
     2     29.2  19.7   34.7  20.7   48.2  14.7   30.2  20.1   142.3   75.2
     3      1.0  28.7   14.0  26.0    6.3  29.0    7.2  24.9    28.5  108.6
     4      6.4  27.3    5.6  34.1    6.7  29.0    8.9  29.8    27.6  120.2

  Totals   55.9  97.0   64.4 109.1   65.5  99.4   60.3  99.9   246.1  405.4

By looking at some plots with unusually high and unusually low X, note that there seems a definite negative relation between Y and X. Before removing this source of error by covariance, check that the lines do not differ in the amounts of infection. The analysis of sums of squares and products is as follows:

                df.     Σx²      Σxy      Σy²
  Blocks         3    2,239.3   -748.0   272.9
  Treatments     3       14.1     10.2    21.2
  Error          9      427.0   -145.7    66.0

  T + E         12      441.1   -135.5    87.2

(i) Perform the F-test of the adjusted means.
(ii) Find the adjusted means and test the differences among them.
(iii) Estimate the efficiency of the adjustments. Ans. (i) F = 4.79*; df. = 3, 8. (ii) A, 23.77; B, 27.52; C, 25.19; D, 24.87. By the LSD test, B significantly exceeds A and D. (iii) 3.56. Strictly, a slight correction to this figure should be made for the reduction in df. from 9 to 8.

14.5-Interpretation of adjusted means in covariance. The most straightforward use of covariance has been illustrated by the preceding examples. In these, the covariate X is a measure of the responsiveness of the experimental unit, either directly (as with the leprosy bacilli) or indirectly (as with number of plants). The adjusted means are regarded as better estimates of the treatment effects than the unadjusted means because one of the sources of experimental error has been removed by the adjustments.
Interpretation of adjusted means is usually more difficult when both Y and X show differences between treatments, or between groups in an observational study. As mentioned in section 14.1, adjusted means are sometimes calculated in this situation either in order to throw light on the way in which the treatments produce their effects or to remove a source of bias in the comparison of Y between groups. The computations remain unchanged, except that the use of the effective Error mean square (14.2.3) is not recommended for finding an approximation to the variance of the difference between two adjusted means. Instead, use the correct formula:

  sD² = sy·x²{2/n + (X̄i. - X̄j.)²/Exx}

The reason is that when the X's differ from treatment to treatment, the term (X̄i. - X̄j.)² can be large and can vary materially from one pair of means to another, so that sD² is no longer approximately constant.
As regards interpretation, the following points should be kept in mind. If the X's vary widely between treatments or groups, the adjustment involves an element of extrapolation. To cite an extreme instance, suppose that one group of men have ages (X) in the forties, with mean about 45, while a second group are in their fifties with mean about 55. In the adjusted means, the two groups are being compared at mean age 50, although neither group may have any men at this specific age. In using the adjustment, we are assuming that the linear relation between Y and X holds somewhat beyond the limits of each sample. In this situation the value of sD² becomes large, because the term (X̄i. - X̄j.)² is large. The formula is warning us that the adjustments have a high element of uncertainty. It follows that the comparison of adjusted means has low precision. Finding that F- or t-tests of the adjusted means show no significance, we may reach the conclusion that "The differences in Y can be explained as a consequence of the differences in X," when a sounder interpretation is that the adjusted differences are so imprecise that only very large effects could have been detected. A safeguard is to compute confidence limits for some of the adjusted differences: if the F-test alone is made, this point can easily be overlooked.
Secondly, if X is subject to substantial errors of measurement, the adjustment removes only part of any difference between the Y means that is due to differences in the X means. Under the simplest mathematical model, the fraction removed may be shown to be σx²/(σx² + σd²), where σd² is the variance of the errors of measurement of X. This point could arise in an example mentioned in section 14.1, in which covariance was suggested for examining whether differences produced by soil fumigants on spring oats (Y) could be explained as a reflection of the effects of these treatments on the numbers of nematode cysts (X). The nematode cysts are counted by taking a number of small soil samples from each plot and sifting each sample carefully by some process. The estimate of X on each plot is therefore subject to a sampling error and perhaps also to an error caused by failure to detect some of the cysts. Because of these errors, some differences might remain among the adjusted Y means, leading to an erroneous inference that the differences in yield could not be fully explained by the effects of the treatments on the nematodes. Similarly, in observational studies the adjustment removes only a fraction σx²/(σx² + σd²) of a bias due to a linear relation between Y and X. Incidentally, the errors of measurement d do not vitiate the use of covariance in increasing the precision of the Y comparisons in randomized experiments, provided that Y has a linear regression on the measurement X′ = X + d. However, as might be expected, they make the adjustments less effective, because the correlation ρ′ between Y and X′ = X + d is less than the correlation ρ between Y and X, so that the residual error variance σy²(1 - ρ′²) is larger.
Finally, the meaning of the adjusted values is often hard to grasp, especially if the reasons for the relation between Y and X are not well known. As an illustration, table 14.5.1 shows the average 1964 expenditures Y per attending pupil for schools in the states in each of five regions of the U.S. (6). These are simple averages of the values for the individual states in the region. Also shown are corresponding averages of 1963 per capita incomes X in each region. In an analysis of variance into Between Regions and Between States Within Regions, the differences between regions are significant both for the expenditure figures and the per capita incomes. Further, the regions fall in the same order for expenditures as for incomes.
TABLE 14.5.1
1964 SCHOOL EXPENDITURES PER ATTENDING PUPIL (Y) AND 1963 PER CAPITA
INCOMES (X) IN FIVE REGIONS OF THE U.S.

                            Mountain      North      South      South
                    East    and Pacific   Central    Atlantic   Central
  Number of states    8         11          12          9          8

                                    (dollars)
  Expenditures       542        500         479        399        335
  Per capita incomes 2,600     2,410       2,370      2,310      1,780
It seems natural to ask: Would the differences in expenditures disappear after allowing for the relation between expenditure and income? The within-region regression appears to be linear, and the values of b do not differ significantly from region to region. The average b is 0.140 ($14 in expenditure for each additional $100 of income). The adjusted means for expenditure, adjusted to the overall average income of $2,306, are as follows:

           E.     M.P.    N.C.    S.A.    S.C.
                       (dollars)
          501     485     470     398     409
The differences between regions have now shrunk considerably, although still significant, and the regions remain in the same order except that the South Central region is no longer lowest. On reflection, however, these adjusted figures seem hypothetical rather than concrete. The figure of $409 for the South Central region cannot be considered an estimate of the amount that this region would spend per pupil if its per capita income were to increase rapidly, perhaps through greater industrialization, from $1,780 to $2,306. In fact, if we were trying to estimate this amount, a study of the Between Years regression of expenditure on income for individual states would be more relevant. Similarly, a conclusion that "the differences in expenditures cannot be ascribed to differences in per capita income" is likely to be misunderstood by a non-technical reader. For a good discussion of other complications in interpretation, see (4).

14.6-Comparison of regression lines. Frequently, the relation between Y and X is studied in samples obtained by different investigators, or in different environments, or at different times. In summarizing these results, the question naturally arises: can the regression lines be regarded as the same? If not, in what respects do they differ? A numerical example provides an introduction to the handling of these questions. The example has only two samples, but the techniques extend naturally to more than two samples.
In a survey to examine relationships between the nutrition and the health of women in the Middle West (7), the concentration of cholesterol in the blood serum was determined on 56 randomly selected subjects in Iowa and 130 in Nebraska. In table 14.6.1 are subsamples from the survey data. Figure 14.6.1 shows graphs of the data from each state. The figure gives an impression of linearity of the regression of cholesterol concentration on age, which will be assumed in this discussion.
The purpose is to examine whether the linear regressions of cholesterol on age are the same in Iowa and Nebraska. They may differ in slope, in elevation, or in the residual variances σy·x². The most convenient approach is to compare the residual variances first, then the slopes, and lastly the elevations. In terms of the model, we have

  Yij = αi + βiXij + εij

where i = 1, 2 denotes the two states. We first compare the residual variances σ₁² and σ₂², next β₁ and β₂, and finally the elevations of the lines, α₁ and α₂.
The computations begin by recording separately the Within sum of squares and products for each state, as shown in table 14.6.2 on lines 1 and 2. The next step is to find the residual S.S. from regression for each state, as on the right in lines 1 and 2. The Residual mean squares, 2,392 and 1,581, are compared by the two-tailed F-test (section 2.9) or, with more than two samples, by Bartlett's test (section 10.21). If heterogeneous variances were evident, this might be pertinent information in itself. In
TABLE 14.6.1
AGE AND CONCENTRATION OF CHOLESTEROL (MG./100 ML.) IN THE BLOOD SERUM OF
IOWA AND NEBRASKA WOMEN

     Iowa, n = 11                    Nebraska, n = 19
   Age   Cholesterol       Age   Cholesterol     Age   Cholesterol
    X        Y              X        Y            X        Y
    46      181             18      137           30      140
    52      228             44      173           47      196
    39      182             33      177           58      262
    65      249             78      241           70      261
    54      259             51      225           67      356
    33      201             43      223           31      159
    49      121             44      190           21      191
    76      339             58      257           56      197
    71      224             63      337
    41      112             19      189
    58      189             42      214

  Sum  584   2,285         Sum  873   4,125

  X̄₁ = 53.1   Ȳ₁ = 207.7          X̄₂ = 45.9   Ȳ₂ = 217.1

Iowa
  ΣX² = 32,834     ΣXY = 127,235     ΣY² = 515,355
  C:    31,005           121,313           474,657
  Σx² =  1,829     Σxy =   5,922     Σy² =  40,698

Nebraska
  ΣX² = 45,677     ΣXY = 203,559     ΣY² = 957,785
  C:    40,112           189,533           895,559
  Σx² =  5,565     Σxy =  14,026     Σy² =  62,226

Total, n = 30
  ΣX = 1,457, X̄_T = 48.6    ΣX² = 78,511    ΣXY = 330,794    ΣY² = 1,473,140
  ΣY = 6,410, Ȳ_T = 213.7    C:    70,762          311,312          1,369,603
                             Σx² =  7,749    Σxy =  19,482    Σy² =   103,537

this example, F = 1.51, with 9 and 17 df., giving a P value greater than 0.40 in a two-tailed test. The mean squares show no sign of a real difference.
Assuming homogeneity of residual variances, we now compare the two slopes or regression coefficients, 3.24 for Iowa and 2.52 for Nebraska. A look at the scatters of the points about the individual regression lines in figure 14.6.1 suggests that the differences in slope may be attributable to sampling variation. To make the test (table 14.6.2), add the df. and

[Figure 14.6.1: scatter diagram of the Iowa and Nebraska points, with age (years) on the horizontal axis, concentration of cholesterol on the vertical axis, and the fitted regression line for each state.]

FIG. 14.6.1-Graph of 11 pairs of Iowa data and 19 pairs from Nebraska. Age is X and concentration of cholesterol, Y.

S.S. for the deviations from the individual regressions, recording these sums in line 3. The mean square, 1,862, is the residual mean square obtained when separate regression lines are fitted in each state. Secondly, in line 4 we add the sums of squares and products, obtaining the pooled slope, 2.70, and the S.S., 49,107, representing deviations from a model in which a single pooled slope is fitted. The difference, 49,107 - 48,399 = 708 (line 5), with 1 df., measures the contribution of the difference between the two regression coefficients to the sum of squares of deviations. If there were k coefficients, this difference would have (k - 1) df. The corresponding mean square is compared with the Within States mean square
TABLE 14.6.2
COMPARISON OF REGRESSION LINES. CHOLESTEROL DATA

                                                      Reg.    Deviations From Regression
                       df.   Σx²     Σxy       Σy²    Coef.    df.     S.S.     M.S.
  Within
  1 Iowa               10    1,829    5,922   40,698   3.24      9    21,524   2,392
  2 Nebraska           18    5,565   14,026   62,226   2.52     17    26,875   1,581

  3                                                             26    48,399   1,862

  4 Pooled, W          28    7,394   19,948  102,924   2.70     27    49,107   1,819

  5 Difference between slopes                                    1       708     708

  6 Between, B          1      355     -466      613

  7 W + B              29    7,749   19,482  103,537            28    54,557

  8 Between adjusted means                                       1     5,450   5,450

  Comparison of slopes:     F = 708/1,862 = 0.38 (df. = 1, 26)  N.S.
  Comparison of elevations: F = 5,450/1,819 = 3.00 (df. = 1, 27)  N.S.

1,862, by the F-test. In these data, F = 708/1,862 = 0.38, df. = 1, 26, supporting the assumption that the slopes do not differ.
Algebraically, the difference 708 in the sum of squares may be shown to be L₁L₂(b₁ - b₂)²/(L₁ + L₂), where L₁, L₂ are the values of Σx² for the two states. With more than two states, the difference is Σwi(bi - b̄)², where wi = Li and b̄ is the pooled slope, Σwibi/Σwi. The sum of squares of deviations of the b's is a weighted sum, because the variances of the bi, namely σy·x²/Li, depend on the values of Σx².
If the sample regressions were found to differ significantly, this might end the investigation. Interpretation would involve the question: Why? The final question about the elevations of the population regression lines usually has little meaning unless the lines are parallel.
Assuming parallel lines and homogeneous variance, we write the model as

  Yij = αi + βXij + εij

where i = 1, 2 denotes the state. It remains to test the null hypothesis α₁ = α₂. The least squares estimates of α₁ and α₂ are α̂₁ = Ȳ₁ - bX̄₁ and α̂₂ = Ȳ₂ - bX̄₂. Hence, the test of this H₀ is identical to the test of the H₀ that the adjusted means of the Y's are the same in the two states. This is, of course, the F-test of the difference between adjusted means that was made in section 14.3. It is made in the usual way in lines 4 to 8 in table 14.6.2. Line 4 gives the Pooled Within States sums of squares and products, while line 6 shows the Between States sums of squares and products. In line 7 these are combined, just as we combined Error and
Treatments in section 14.3. A Deviations S.S., 54,557, is obtained from line 7 and the Deviations S.S. in line 4 is subtracted to give 5,450, the S.S. Between adjusted means. We find F = 3.00, df. = 1, 27, P about 0.10. In the original survey the difference was smaller than in these subsamples. The investigators felt justified in combining the two states for further examination of the relation between age and cholesterol.

14.7-Comparison of the "Between Classes" and the "Within Classes" regressions. Continuing the theme of section 14.6, we sometimes need to compare the Between Classes regression and the Within Classes regression in the same study. In physiology or biochemistry, for instance, Y and X are measurements made on patients or laboratory animals. Often, the number of subjects is limited, but several measurements of Y and X have been made on each subject. The Between Subjects regression may be the one of primary interest. The objective of the comparison is to see whether the Within and Between regressions appear to estimate the same quantities. If so, they can be combined to give a better estimate of the Between Subjects relationship.
The simplest model that might apply is as follows:

  Yij = α + βXij + εij      (14.7.1)

where i denotes the class (subject). In this model the same regression line holds throughout the data. The best combined estimates of α and β are obtained by treating the data as a single sample, estimating α and β from the Total line in the analysis of variance.
Two consequences of this model are important: (1) The Between and Within lines furnish independent estimates of β: call these b₁ and b, respectively. (2) The residual mean squares s₁² and s² from the regressions in the Between and Within lines are both unbiased estimates of σ², the variance of the εij.
To test whether the same regression holds throughout, we therefore compare b₁ and b and s₁² and s². Sometimes, b₁ and b agree well, but s₁² is found to be much larger than s². One explanation is that all the Yij for a subject are affected by an additional component of variation di, independent of the εij. This model is written

  Yij = α + βXij + di + εij      (14.7.2)

If the subjects are a random sample from some population of subjects, the di are usually regarded as a random variable from subject to subject with population mean zero and variance σd². Under this model, b₁ and b are still unbiased estimates of β, but with m pairs of observations per subject, s₁² is an unbiased estimate of σ₁² = (σ² + mσd²), while s² continues to estimate σ². Since the method of comparing b and b₁ and the best way of combining them depend on whether the component di is present, we suggest that s² and s₁² be compared first by an F-test.
The calculations are illustrated by records from ten female leprosy patients. The data are scores representing the abundance of leprosy bacilli at four sites on the body, the Xij being initial scores and the Yij scores after 48 weeks of a standard treatment. Thus m = 4, n = 10. (This example is purely for illustration. This regression would probably not be of interest in itself; further, records from many additional patients were available so that a Between Patients regression could be satisfactorily estimated directly.) Table 14.7.1 shows the initial computations.

TABLE 14.7.1
SCORES FOR LEPROSY BACILLI AT FOUR SITES ON TEN PATIENTS

                     df.     Σx²      Σxy      Σy²     Reg. Coef.
Between patients      9     28.00    26.00    38.23    b1 = 0.939
Within patients      30     26.00    13.00    38.75    b  = 0.500
Total                39     54.00    39.00    76.98

                   Reduction        Deviations From Regression
                   (Σxy)²/Σx²      df.      S.S.        M.S.
Between patients     24.14           8      14.09    s1² = 1.761
Within patients       6.50          29      32.25    s²  = 1.112

After performing the usual analysis of sums of squares and products,
the reduction in sum of squares due to regression is computed separately
for the Between and Within lines (lower half of table 14.7.1). From these,
the Deviations S.S. and M.S. are obtained. The F ratio is s1²/s²
= 1.761/1.112 = 1.58 with 8 and 29 df., corresponding to a P level of
about 0.20.
Although F falls short of significance, the investigator may decide to
assume that σ1² is greater than σ², and thus to retain the model (14.7.2),
particularly since the Between Patients mean square is significant for both
Y and X individually. To compare b1 and b under this model, note that
the estimated variances of b1 and b are s1²/Σ1 and s²/Σ, where Σ1 and Σ are
the values of Σx² for Between Patients and Within Patients, respectively.
From table 14.7.1 the ratio of (b1 - b) to its standard error is therefore

    t' = (b1 - b)/√(s1²/Σ1 + s²/Σ)
       = (0.939 - 0.500)/√(1.761/28.00 + 1.112/26.00)
       = 0.439/√(0.0629 + 0.0428) = 0.439/0.325 = 1.35
which is clearly non-significant. The quantity t' is not distributed as t,
but its significance level, if needed, is found by the approximate method
in section 4.14. Since s1² has 8 df. and s² has 29 df., find the 5% significance
levels of t for 8 df. and 29 df., namely 2.306 and 2.045. Form a
weighted mean of these two values, with weights s1²/Σ1 = 0.0629 and
s²/Σ = 0.0428. This mean is 2.20, the required 5% significance level of t'.
It remains to find a combined estimate of β from b1 and b. In combining
two independent estimates that are of unequal precision, a general
rule is to weight each estimate inversely as its variance. In this example,
as is usually the case in practice, we have only estimates s1²/Σ1 = 0.0629
and s²/Σ = 0.0428 of the variances of b1 and b. If s1² and s² both have
at least 8 df., weight b1 and b inversely as their estimated variances (8).
The weights are w1 = 1/0.0629 = 15.9, w = 1/0.0428 = 23.4, giving

    β̂ = {(15.9)(0.939) + (23.4)(0.500)}/39.3 = 0.678

If W = w1 + w = 39.3, the standard error of β̂ may be taken as (8)

    √[(1/W){1 + (4w1w/W²)(1/f1 + 1/f)}] = 0.171,

where f1, f are the df. in s1², s². The second term above is an allowance
due to Meier (9) for sampling errors in the weights.
We now show how to complete the analysis if σ1² = σ². Form a
pooled estimate of σ² from s1² and s². This is s² = 46.34/37 = 1.252 with
37 df. The estimated variance of (b1 - b) is

    s²(1/Σ1 + 1/Σ) = s²(Σ1 + Σ)/(Σ1Σ) = (1.252)(54.00)/{(28.00)(26.00)} = 0.0929

Hence, (b1 - b) is tested by the ordinary t-test,

    t = 0.4386/√0.0929 = 0.4386/0.305 = 1.44    (37 df.)

The pooled estimate of β is simply the estimate Σxy/Σx² from the Total
line in the analysis of variance. This is 39.00/54.00 = 0.722, with standard
error √{s²/(Σ1 + Σ)} = √(1.252/54.00) = 0.152.
Methods for extending this analysis to multiple regression are presented
in (10).
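For readers who want to retrace this arithmetic, here is a minimal Python sketch of the section's comparison. It is our illustration, not part of the original text; the variable names are ours, and the slopes, mean squares, and sums of squares are taken directly from table 14.7.1 and the discussion above.

    from math import sqrt

    # Slopes, residual mean squares, and sums of squares from table 14.7.1
    b1, s1_sq, Sigma1 = 0.939, 1.761, 28.00   # Between patients (8 df.)
    b,  s_sq,  Sigma  = 0.500, 1.112, 26.00   # Within patients (29 df.)

    F = s1_sq / s_sq                          # 1.58, compares s1^2 with s^2

    v1, v = s1_sq / Sigma1, s_sq / Sigma      # estimated variances of b1 and b
    t_prime = (b1 - b) / sqrt(v1 + v)         # 1.35, clearly non-significant

    # Combined slope, weighting inversely as the estimated variances
    w1, w = 1.0 / v1, 1.0 / v
    W = w1 + w
    beta_hat = (w1 * b1 + w * b) / W          # 0.678

    # Standard error with Meier's allowance (f1 = 8, f = 29 df.)
    se = sqrt((1.0 + (4.0 * w1 * w / W**2) * (1 / 8 + 1 / 29)) / W)
    print(F, t_prime, beta_hat, se)           # 1.58, 1.35, 0.678, 0.171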
14.8-Multiple covariance. With two or more independent variables
there is no change in the theory beyond the addition of extra terms in X.
The method is illustrated for a one-way classification by the average daily
gains of pigs in table 14.8.1. Presumably these are predicted at least
partly by the ages and weights at which the pigs were started in the experi-
ment, which compared four feeds.
This experiment is an example of a technique in experimental design
known as balancing. The assignment of pigs to the four treatments was
not made by strict randomization. Instead, pigs were allotted so that
the means of the four lots agreed closely in both X1 and X2. An indication
of the extent of the balancing can be seen by calculating the F-ratios for
Treatments/Error from the analyses of variance of X1 and X2, given
under table 14.8.1. These F's are 0.50 for X1 and 0.47 for X2, both well
below 1.
The idea is that if X1 and X2 are linearly related to Y, this balancing
produces a more accurate comparison among the Y means. One complication
is that since the variance within treatments is greater than that
between treatments for X1 and X2, the same happens to some extent for
Y. Consequently, in the analysis of variance of Y the Error mean square
is an overestimate and the F-test of Y gives too few significant results.
However, if the covariance model holds, the analysis of covariance will
give an unbiased estimate of error and a correct F-test for the adjusted
means of Y. The situation is interesting in that, with balancing, the reason
for using covariance is to obtain a proper estimate of error rather than to
adjust the Y means. If perfect balancing were achieved, the adjusted Y
means would be the same as the unadjusted means.
The first step is to calculate the six sums of squares and products
shown under table 14.8.1. Next, b1 and b2 are estimated from the Error
lines, the normal equations being

    4,548.20b1 + 2,877.40b2 =  5.6230
    2,877.40b1 + 4,876.90b2 = 26.2190

The cij inverse multipliers are

    c11 = 0.0003508,  c12 = -0.0002070,  c22 = 0.0003272

These give

    b1 = -0.0034542    b2 = 0.0074142

    Reduction in S.S. = (-0.0034542)(5.6230) + (0.0074142)(26.2190) = 0.1750
    Deviations S.S. = 0.8452 - 0.1750 = 0.6702  (34 df.):  s² = 0.0197

The standard errors of b1 and b2 are

    sb1 = √(s²c11) = 0.00263 :  sb2 = √(s²c22) = 0.00254

It follows that b2 is definitely significant but b1 is not. In practice, we
might drop X1 (age) at this stage and continue the analysis using the regression
of Y on X2 alone. But for illustration we shall adjust for both
variables.
If an F-test of the adjusted means is wanted, make a new calculation
of b1 and b2 from the Treatments plus Error lines, in this case the Total
line. The results are b1 = -0.0032903, b2 = 0.0074093, Deviations
S.S. = 0.8415 (37 df.). The F-test is made in table 14.8.2.
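As a check on this step, the short Python sketch below (ours, not part of the original text) solves the normal equations, inverts the matrix to obtain the cij multipliers, and forms the deviations sum of squares and the standard errors of b1 and b2.

    import numpy as np

    # Error line of table 14.8.1
    S = np.array([[4548.20, 2877.40],
                  [2877.40, 4876.90]])       # sums of squares and products
    q = np.array([5.6230, 26.2190])          # sums of products with Y

    C = np.linalg.inv(S)                     # c11, c12, c22 multipliers
    b = C @ q                                # b1 = -0.00345, b2 = 0.00741
    reduction = b @ q                        # 0.1750
    dev_ss = 0.8452 - reduction              # 0.6702 with 34 df.
    s_sq = dev_ss / 34                       # 0.0197
    se_b = np.sqrt(s_sq * np.diag(C))        # 0.00263, 0.00254
    print(b, dev_ss, se_b)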
The adjusted Y means are computed as follows.

TABLE 14.8.1
INITIAL AGE (X1), INITIAL WEIGHT (X2), AND RATE OF GAIN (Y) OF 40 PIGS
(Four treatments in lots of equal size)

            Treatment 1                        Treatment 2
  Age, X1   Weight, X2   Gain, Y     Age, X1   Weight, X2   Gain, Y
  (days)    (pounds)     (pounds     (days)    (pounds)     (pounds
                         per day)                           per day)
    78         61          1.40        78         74          1.61
    90         59          1.79        99         75          1.31
    94         76          1.72        80         64          1.12
    71         50          1.47        75         48          1.35
    99         61          1.26        94         62          1.29
    80         54          1.28        91         42          1.24
    83         57          1.34        75         52          1.29
    75         45          1.55        63         43          1.43
    62         41          1.57        62         50          1.29
    67         40          1.26        67         40          1.26
Sums  799      544        14.64       784        550         13.19
Means 79.9     54.4        1.46       78.4       55.0         1.32

            Treatment 3                        Treatment 4
    78         80          1.67        77         62          1.40
    83         61          1.41        71         55          1.47
    79         62          1.73        78         62          1.37
    70         47          1.23        70         43          1.15
    85         59          1.49        95         57          1.22
    83         42          1.22        96         51          1.48
    71         47          1.39        71         41          1.31
    66         42          1.39        63         40          1.27
    67         40          1.56        62         45          1.22
    67         40          1.36        67         39          1.36
Sums  749      520        14.45       750        495         13.25
Means 74.9     52.0        1.44       75.0       49.5         1.32

Sums of Squares and Products

              df.       Σx1²        Σx1x2       Σx2²
Treatments     3        187.70       160.15      189.08
Error         36      4,548.20     2,877.40    4,876.90
Total         39      4,735.90     3,037.55    5,065.98

              df.       Σx1y        Σx2y        Σy²
Treatments     3        1.3005      1.3218      0.1776
Error         36        5.6230     26.2190      0.8452
Total         39        6.9235     27.5408      1.0228
TABLE 14.8.2
ANALYSIS OF COVARIANCE OF PIG GAINS. DEVIATIONS FROM REGRESSION

Source of Variation        Degrees of Freedom   Sum of Squares   Mean Square
Total                              37               0.8415
Error                              34               0.6702          0.0197
For testing adjusted
  treatment means                   3               0.1713          0.0571*

F = 0.0571/0.0197 = 2.90*,  df. = 3, 34

In our notation, Ȳi, X̄1i, and X̄2i denote the means of Y, X1, and X2 for
the ith treatment, while X̄1 and X̄2 denote the overall means of X1 and X2.

Treatment           1       2       3       4      Multiplier
Ȳi                 1.46    1.32    1.44    1.32        1
(X̄1i - X̄1)        +2.9    +1.4    -2.1    -2.0     0.00345 = -b1
(X̄2i - X̄2)        +1.7    +2.3    -0.7    -3.2    -0.00741 = -b2
Ȳi (adjusted)      1.46    1.31    1.44    1.34

Thus, for treatment 4,

    Ȳ4 (adj.) = Ȳ4 - b1(X̄14 - X̄1) - b2(X̄24 - X̄2)
             = 1.32 + 0.00345(-2.0) - 0.00741(-3.2) = 1.34
There is little change from unadjusted to adjusted means because of
the balancing.
The estimated variance of the difference between the adjusted means
of the ith and jth treatments is

    s²[2/n + c11(X̄1i - X̄1j)² + 2c12(X̄1i - X̄1j)(X̄2i - X̄2j) + c22(X̄2i - X̄2j)²]

As with covariance on a single X-variable (section 14.2), an average
error variance can be used for comparisons among the adjusted means if
there are at least 20 df. for Error. The effective Error mean square per
observation is

    s'² = s²[1 + c11t11 + 2c12t12 + c22t22]

where t11, t22, and t12 are the Treatments mean squares and mean product.
This equation is the extension of (14.2.3) to two X-variables. In these data

    s'² = 0.0197[1 + {(0.3508)(62.6) - 2(0.2070)(53.4) + (0.3272)(63.0)}/1,000]
        = (0.0197)(1.020) = 0.0201
For instance, to find 95% confidence limits for the difference between the
adjusted means of treatments 1 and 2, we have

    D = 1.46 - 1.31 = 0.15 pounds per day
    sD = √(2s'²/10) = √0.00402 = 0.0634
The difference 0.15 pounds between treatments 1 and 2 is the greatest
of the six differences between pairs of treatments. It is the only difference
that is significant by the LSD test. By the Newman-Keuls test, none of the
differences is significant, the required difference for 5% significance between
the highest and the lowest means being 0.17 pounds. This is one of
those occasional examples in which although F is significant (just on the
5% level), none of the individual differences between pairs is clearly
significant.
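The adjustment of the means and the effective error mean square can be verified with the short Python sketch below; it is an illustration of ours, using the coefficients and treatment statistics quoted above.

    import numpy as np

    b1, b2 = -0.0034542, 0.0074142                # Error-line coefficients
    ybar = np.array([1.46, 1.32, 1.44, 1.32])     # unadjusted Y means
    dx1 = np.array([2.9, 1.4, -2.1, -2.0])        # X1 means minus overall mean
    dx2 = np.array([1.7, 2.3, -0.7, -3.2])        # X2 means minus overall mean

    y_adj = ybar - b1 * dx1 - b2 * dx2            # 1.46, 1.31, 1.44, 1.34

    # Effective error mean square per observation, extension of (14.2.3)
    s_sq = 0.0197
    c11, c12, c22 = 0.0003508, -0.0002070, 0.0003272
    t11, t12, t22 = 62.6, 53.4, 63.0              # Treatments MS and product
    s_eff = s_sq * (1 + c11 * t11 + 2 * c12 * t12 + c22 * t22)   # 0.0201
    print(np.round(y_adj, 2), s_eff)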
These data also illustrate the point that the regression of Y on X1
alone may be quite different from the regression of Y on X1 when
another X-variable is included in the model; even the signs may be
opposite. Consider the regression of Y on X1 (age) in the pig data. Using
Totals, the regression coefficient is

    bY1 = 6.9235/4,735.90 = 0.00146 lb./day/day of age

Compare this with bY1.2 = -0.00329 calculated on p. 439, also for Total.
Why should average daily gain increase with age in the first case and
decrease with age in the second?
TABLE 14.8.3
DATA ON 40 PIGS CLASSIFIED BY INITIAL WEIGHT

Initial   Number
Weight    of Pigs   Initial Age (upper line) and Average Daily Gain (lower line)   Mean
39-44       13      62     63(2)*  66     67(5)  70     71     83     91           69.5
                    1.57   1.35    1.39   1.36   1.15   1.31   1.22   1.24          1.34
45-49        5      62     70      71     75(2)                                    70.6
                    1.22   1.23    1.39   1.45                                      1.35
50-54        5      62     71      75     80     96                                76.8
                    1.29   1.47    1.29   1.28   1.48                               1.36
55-59        5      71     83      85     90     95                                84.8
                    1.47   1.34    1.49   1.79   1.22                               1.46
60-64        8      77     78(2)   79     80     83     94     99                  83.5
                    1.40   1.38    1.73   1.12   1.41   1.29   1.26                 1.37
74-80        4      78(2)  94      99                                              87.2
                    1.64   1.72    1.31                                             1.58
Total       40                                                                     77.05
                                                                                    1.388

* Number of pigs of this age.


The first regression is an overall effect, ignoring initial weight. In
this sample there was a slight tendency for the initially older pigs to gain
faster. But among pigs of the same initial weight (initial weight held
constant) the older pigs tended to gain more slowly.
These facts may be observed in table 14.8.3. The right-hand column
shows that both initial age and rate of gain increase with initial weight;
they are positively associated because of their common association with
initial weight. But within the rows of the table, where initial weight
doesn't change much, there is the opposite tendency: the older pigs tend
to gain more slowly. Table 14.8.4 gives the within-weight regressions.
In the last line is the Pooled regression, -0.00335. This average differs
only slightly from bY1.2 = -0.00329, both estimating the same
effect, the regression of average daily gain on initial age in a population of
pigs all having the same initial weight.
TABLE 14.8.4
ANALYSIS OF COVARIANCE IN WEIGHT CLASSES OF PIGS

                        Sums of Squares and Products
Weight     Degrees of                                        Regression of
Class      Freedom         Σx1²        Σx1y       Σy²        Y on X1
39-44        12           831.2308    -6.1885    0.1917      -0.007445
45-49         4           113.2000     2.0860    0.0729       0.018428
50-54         4           634.8000     2.5720    0.0427       0.004052
55-59         4           324.8000    -0.6480    0.1819      -0.001995
60-64         7           486.0000    -3.6700    0.2140      -0.007551
74-80         3           354.7500    -3.3375    0.1015      -0.009408
Pooled       34         2,744.7808    -9.1860    0.8047      -0.003347
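The pooled regression in the last line of table 14.8.4 is simply the ratio of the pooled sums of products to the pooled sums of squares, as the following illustrative Python fragment (ours, with the table's figures) shows.

    import numpy as np

    # Within-class sums from table 14.8.4, classes 39-44 through 74-80
    sxx = np.array([831.2308, 113.2, 634.8, 324.8, 486.0, 354.75])
    sxy = np.array([-6.1885, 2.0860, 2.5720, -0.6480, -3.6700, -3.3375])

    b_class = sxy / sxx                    # the six within-class slopes
    b_pooled = sxy.sum() / sxx.sum()       # -0.00335
    print(np.round(b_class, 6), round(b_pooled, 6))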
14.9-Multiple covariance in a 2-way table. As illustration we
select data from an experiment (11, 12) carried out in Britain from 1932
to 1937. The objective was to learn how well the wheat crop could be
forecast from measurements on a sample of growing plants. During the
growing season a uniform series of measurements were taken at a number
of places throughout the country. The data in table 14.9.1 are for three
seasons at each of six places and are the means of two standard varieties.
In the early stages of the experiment it appeared that most of the available
information was contained in two variables, shoot height at the time when
ears emerge, X1, and plant numbers at tillering, X2.
For an initial examination of relationships, the data on Y, X1, and
X2 should be free of the place and season effects. Consequently, the regression
is calculated from the Error or Places × Seasons Interactions line.
If, however, the regression is to be successful for routine use in predicting
yields, it should also predict the differences in yield between seasons. It
might even predict the differences in yield between places, though this is
too much to expect unless the X-variables can somehow express the
effects of differences in soil types and soil fertilities between stations.
Consequently, in data of this type, there is interest in comparing the
Between Seasons and Between Places regressions with the Error regression,
though we shall not pursue this aspect of the analysis.

TABLE 14.9.1
HEIGHTS OF SHOOTS AT EAR EMERGENCE (X1), NUMBER OF PLANTS AT TILLERING (X2),
AND YIELD (Y) OF WHEAT IN GREAT BRITAIN
(X1, inches; X2, number per foot; Y, cwt. per acre)

                                     Place
                  Seale   Rotham-           Bog-    Sprows-  Plump-   Year
Year    Variate   Hayne   sted     Newport  hall    ton      ton      Sums
1933      X1      25.6    25.4     30.8     33.0    28.5     28.0     171.3
          X2      14.9    13.3      4.6     14.7    12.8      7.5      67.8
          Y       19.0    22.2     35.3     32.8    25.3     35.8     170.4
1934      X1      25.4    28.3     35.3     32.4    25.9     24.2     171.5
          X2       7.2     9.5      6.8      9.7     9.2      7.5      49.9
          Y       32.4    32.2     43.7     35.7    28.3     35.2     207.5
1935      X1      27.9    34.4     32.5     27.5    23.7     32.9     178.9
          X2      18.6    22.2     10.0     17.6    14.4      7.9      90.7
          Y       26.2    34.7     40.0     29.6    20.6     47.2     198.3
Place     X1      78.9    88.1     98.6     92.9    78.1     85.1     521.7
Sums      X2      40.7    45.0     21.4     42.0    36.4     22.9     208.4
          Y       77.6    89.1    119.0     98.1    74.2    118.2     576.2

              df.     Σx1²      Σx1x2      Σx2²
Places         5     106.34    -47.06     171.46
Seasons        2       6.26     26.24     139.41
Error         10     117.93     20.17      74.20
Total         17     230.53    - 0.65     385.07

              df.     Σx1y      Σx2y       Σy²
Places         5     190.83   -257.03     629.22
Seasons        2       8.41   - 22.26     124.42
Error         10     142.01   - 21.46     228.66
Total         17     341.25   -300.75     982.30

The results obtained from the Error line are: b1 = 1.3148,
b2 = -0.6466, ΣŶ² = 200.59, Σd² = 28.07 (8 df.). These statistics, with
some from the table, lead to the following information:
1. Freed from season and place effects, height of shoots and number
of plants together account for

    ΣŶ²/Σy² = 200.59/228.66 = 88%

of the Error sum of squares for yield.
2. The predictive values of the two independent variables are indicated
by the following analysis of Σy²:

Source                       Degrees of Freedom   Sum of Squares   Mean Square
Regression on X1 and X2              2                200.59
  Regression on X1 alone             1                171.01
  X2 after X1                        1                 29.58          29.58*
  Regression on X2 alone             1                  6.21
  X1 after X2                        1                194.38         194.38**
Deviations                           8                 28.07           3.51

While each X accounts for a significant reduction in Σy², shoot
height is the more effective.
3. The Error regression equation is

    Ŷ = 1.393 + 1.3148X1 - 0.6466X2

Substituting each pair of X's, the values of Ŷ and Y - Ŷ are calculated for
each place in each season and entered in table 14.9.2.

TABLE 14.9.2
ACTUAL AND ESTIMATED YIELDS OF WHEAT

                    1933               1934               1935
Place           Y    Ŷ    Y-Ŷ      Y    Ŷ    Y-Ŷ      Y    Ŷ    Y-Ŷ    Sum
Seale Hayne    19.0 25.4  -6.4    32.4 30.1   2.3    26.2 26.0   0.2   -3.9
Rothamsted     22.2 26.2  -4.0    32.2 32.5  -0.3    34.7 32.3   2.4   -1.9
Newport        35.3 38.9  -3.6    43.7 43.4   0.3    40.0 37.7   2.3   -1.0
Boghall        32.8 35.3  -2.5    35.7 37.7  -2.0    29.6 26.2   3.4   -1.1
Sprowston      25.3 30.6  -5.3    28.3 29.5  -1.2    20.6 23.2  -2.6   -9.1
Plumpton       35.8 33.4   2.4    35.2 28.4   6.8    47.2 39.5   7.7   16.9

Sums                     -19.4                5.9               13.4   -0.1

It seems clear from table 14.9.2 that the regression has not been successful
in predicting the differences between seasons. There is a consistent
underestimation in 1933, which averaged 19.4/6 = 3.2 cwt./acre, and an
overestimation in 1935. If a test of significance of the difference between
the adjusted seasonal yields is needed, the procedure is the same as for the
F-test of adjusted means in section 14.8. Add the sums of squares and
products for Seasons and Error in table 14.9.1. Recalculate the regression
from these figures, finding the deviations S.S., 120.01 with 10 df. The
difference, 120.01 - 28.07, has 2 df., giving a mean square 45.97 for the
differences between adjusted seasonal yields. The value of F is 45.97/3.51
= 13.1** with 2 and 8 df.
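The entries of table 14.9.2 and this F-test are easy to reconstruct; the fragment below (our illustration, using the Error-line regression and the deviations sums of squares quoted above) shows the computation for one cell and for the test.

    # Error-line regression applied to Seale Hayne, 1933
    a, b1, b2 = 1.393, 1.3148, -0.6466
    x1, x2, y = 25.6, 14.9, 19.0
    y_hat = a + b1 * x1 + b2 * x2        # 25.4 cwt. per acre
    resid = y - y_hat                    # -6.4, as in table 14.9.2

    # F-test of adjusted seasonal yields
    F = ((120.01 - 28.07) / 2) / (28.07 / 8)   # 13.1 with 2 and 8 df.
    print(round(y_hat, 1), round(resid, 1), round(F, 1))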

REFERENCES
1. R. A. FISHER. Statistical Methods for Research Workers, §49.1. Oliver and Boyd, Edinburgh (1941).
2. D. J. FINNEY. Biometrics Bul., 2:53 (1946).
3. G. F. SPRAGUE. Iowa Agric. Exp. Sta. data (1952).
4. H. F. SMITH. Biometrics, 13:282 (1957).
5. J. M. CRALL. Iowa Agric. Exp. Sta. data (1949).
6. U.S. Bureau of the Census, Statistical Abstract of the U.S., 86th ed. U.S. GPO, Washington, D.C. (1965).
7. P. P. SWANSON et al. J. Gerontology, 10:41 (1955).
8. W. G. COCHRAN. Biometrics, 10:116 (1954).
9. P. MEIER. Biometrics, 9:59 (1953).
10. D. B. DUNCAN and M. WALSER. Biometrics, 22:26 (1966).
11. M. M. BARNARD. J. Agric. Sci., 26:456 (1936).
12. F. YATES. J. Ministry of Agric., 43:156 (1936).
13. O. M. SMITH and H. T. BEECHER. J. Pharm. and Exper. Therap., 136:47 (1962).
CHAPTER FIFTEEN

Curvilinear regression

":IS.I-Introduction. Although linear regression is adequate for many


needs, some variables are not connected by so simple a relation. The dis·
covery of a precise description of the relation between two or more quan·
tities is one of the problems of curvefitting, known as curvilinear regression.
From this general view the fitting of the straight line is a special case, the
simplest and indeed the most useful.
The motives for fitting curves to non-linear data are various. Some-
times a good estimate of the dependent variable is wanted for any par-
ticular value of the independent. Thi. may involve the smoothing of
irregular data and the interpolation of estimated Y's for values of X not
contained in the observed series. Sometimes the objective is to test a law
relating the variables, such as a growth curve that has been proposed from
previous research or from mathematical analysis of the mechanism by
which the variables are connected. At other times the form of the rela-
tionship is of little interest; the end in view is merely the elimination of
inaccuracies which non-linearity of regression may introduce into a cor-
relation coefficient or an experimental error.
Figure 15.1.1 shows four common non-linear relations. Part (a) is
the compound interest law or exponential growth curve W = A(B^X), where
we have written W in place of our usual Y. If B = 1 + i, where i is the
annual rate of interest, W gives the amount to which a sum of money A
will rise if left at compound interest for X years. As we shall see, this
curve also represents the way in which some organisms grow at certain
stages. The curve shown in part (a) has A = 1.
If B is less than 1, this curve assumes the form shown in (b). It is
often called an exponential decay curve, the value of W declining to zero
from its initial value A as X increases. The decay of emissions from a
radioactive element follows this curve.
The curve in (c) is W = A - B(ρ^X), with 0 < ρ < 1. This curve rises
from the value (A - B) when X = 0, and steadily approaches a maximum
value A, called the asymptote, as X becomes large. The curve goes by
various names. In agriculture it has been known as Mitscherlich's law,
from a German chemist (11) who used it to represent the relation between
the yield W of a crop (grown in pots) and the amount of fertilizer X added
to the soil in the pots. In chemistry it is sometimes called the first-order
reaction curve. The name asymptotic regression is also used.

FIG. 15.1.1-Four common non-linear curves. (a) Exponential growth law,
W = A(B^X) = A(e^(cX)). (b) Exponential decay law, W = A(B^(-X)) = A(e^(-cX)).
(c) Asymptotic regression, W = A - B(ρ^X) = A - B(e^(-cX)). (d) Logistic growth
law, W = A/(1 + Bρ^X).

Curve (d), the logistic growth law, has played a prominent part in the
study of human populations. This curve gives a remarkably good fit to
the growth of the U.S. population, as measured in the decennial censuses,
from 1790 to 1940.
In this chapter we shall illustrate the fitting of three types of curve:
(1) certain non-linear curves, like those in (a) and (b), figure 15.1.1, which
can be reduced to straight lines by a transformation of the W or the X
scale; (2) the polynomial in X, which often serves as a good approximation;
(3) non-linear curves, like (c) and (d), figure 15.1.1, requiring more
complex methods of fitting.
EXAMPLE 15.1.1-The fit of the logistic curve of the U.S. Census populations (excluding
Hawaii and Alaska) for the 150-year period from 1790 to 1940 is an interesting
example, both of the striking accuracy of the fit, and of its equally striking failure when
extrapolated to give population forecasts for 1950 and 1960. The curve, fitted by Pearl and
Reed (1), is

    W = 184.00/{1 + (66.69)(10^(-0.1398X))}

where X = 1 in 1790, and one unit in X represents 10 years, so that X = 16 in 1940. The
table below shows the actual census population, the estimated population from the logistic,
and the error of estimation.

        Population                                Population
Year    Actual   Estimated   A-E       Year    Actual   Estimated    A-E
1790      3.9       3.7      +0.2      1880     50.2      50.2        0.0
1800      5.3       5.1      +0.2      1890     62.9      62.8       +0.1
1810      7.2       7.0      +0.2      1900     76.0      76.7       -0.7
1820      9.6       9.5      +0.1      1910     92.0      91.4       +0.6
1830     12.9      12.8      +0.1      1920    105.7     106.1       -0.4
1840     17.1      17.3      -0.2      1930    122.8     120.1       +2.7
1850     23.2      23.0      +0.2      1940    131.4     132.8       -1.4
1860     31.4      30.3      +1.1      1950    150.7     143.8       +6.9
1870     38.6      39.3      -0.7      1960    178.5     153.0      +25.5

Note how poor the 1950 and 1960 forecasts are. The forecast from the curve is that the
U.S. population will never exceed 184 million; the actual 1966 population is already well
over 190 million. The postwar baby boom and improved health services are two of the
responsible factors.
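As an illustration (ours, not part of the original text), the fitted logistic can be evaluated at each census year with a few lines of Python. The exponent -0.1398 is the value used in the reconstructed equation above, recovered from the tabled estimates.

    # Pearl-Reed logistic for the U.S. population (millions)
    def logistic(X):
        return 184.00 / (1.0 + 66.69 * 10 ** (-0.1398 * X))

    for year in range(1790, 1961, 10):
        X = (year - 1780) // 10          # X = 1 in 1790; one unit = 10 years
        print(year, round(logistic(X), 1))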

15.2-The exponential growth curve. A characteristic of some of the
simpler growth phenomena is that the increase at any moment is proportional
to the size already attained. During one phase in the growth of a
culture of bacteria, the numbers of organisms follow such a law. The
relation is nicely illustrated by the dry weights of chick embryos at ages
6 to 16 days (2) recorded in table 15.2.1. The graph of the weights in
figure 15.2.1 ascends with greater rapidity as age increases, the regression
equation being of the form

    W = (A)(B^X),

where A and B are constants to be estimated. Applying logarithms to the
equation,

    log W = log A + (log B)X,    or    Y = α + βX,

where Y = log W, α = log A, and β = log B. This means that if log W
instead of W is plotted against X, the graph will be linear. By the device
of using the logarithm instead of the quantity itself, the data are said to
be rectified.
The values of Y = log W are set out in the last column of the table
and are plotted opposite X in the figure.
TABLE 15.2.1
DRY WEIGHTS OF CHICK EMBRYOS FROM AGES 6 TO 16 DAYS,
TOGETHER WITH COMMON LOGARITHMS

Age in Days    Dry Weight, W    Common Logarithm of Weight
X              (grams)          Y
 6               0.029           -1.538*
 7               0.052           -1.284
 8               0.079           -1.102
 9               0.125           -0.903
10               0.181           -0.742
11               0.261           -0.583
12               0.425           -0.372
13               0.738           -0.132
14               1.130            0.053
15               1.882            0.275
16               2.812            0.449

* From the table of logarithms, one reads log 0.029 = log 2.9 - log 100
= 0.462 - 2 = -1.538.

The regression equation, computed in the familiar manner from the
columns X and Y in the table, is

    Ŷ = 0.1959X - 2.689

The regression line fits the data points with unusual fidelity, the correlation
between Y and X being 0.9992. The conclusion is that the chick
embryos, as measured by dry weight, are growing in accord with the
exponential law, the logarithm of the dry weight increasing at the estimated
uniform rate of 0.1959 per day.
Often, the objective is to learn whether the data follow the exponential
law. The graph of log W against X helps in making an initial judgment on
this question, and may be sufficient to settle the point. If so, the use of
semi-logarithmic graph paper avoids the necessity for looking up the
logarithms of W. The horizontal rulings on this graph paper are drawn
to such a scale that the plotting of the original data results in a straight
line if the data follow the exponential growth law. Semi-log paper can
be purchased at most stationery shops. If you require a more thorough
method of testing whether the relation between log W and X is linear, see
the end of section 15.3.
For those who know some calculus, the law that the rate of increase
at any stage is proportional to the size already attained is described mathematically
by the equation

    dW/dX = cW,

where c is the constant relative rate of increase.


FIG. 15.2.1-Dry weights of chick embryos at ages 6-16 days with fitted curves.
Uniform scale: W = 0.002046(1.57)^X
Logarithmic scale: Y = 0.1959X - 2.689

This equation leads to the relation

    log_e W = log_e A + cX,    or    W = Ae^(cX),    (15.2.1)

where e = 2.718 is the base of the natural system of logarithms. Relation
15.2.1 is exactly the same as our previous relation

    log10 W = α + βX

except that it is expressed in logs to base e instead of to base 10.
Since log_e W = (log10 W)(log_e 10) = 2.3026 log10 W, it follows that
c = 2.3026β. For the chick embryos, the relative rate of growth is
(2.3026)(0.1959) = 0.451 gm. per day per gm. It is clear that the relative
rate of growth can be computed from either common or natural logs.
To convert the equation log W = 0.1959X - 2.689 into the original
form, we have

    W = (0.00205)(1.57)^X,

where 0.00205 = antilog(-2.689) = antilog(0.311 - 3) = 2.05/1,000
= 0.00205. Similarly, 1.57 = antilog(0.1959). In the exponential form,

    W = (0.00205)e^(0.451X),

the exponent 0.451 being the relative rate.
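The whole computation, the log transformation, the linear fit, and the conversion back to the exponential form, takes only a few lines in Python. The sketch below is our illustration with the data of table 15.2.1.

    import numpy as np

    X = np.arange(6, 17)                          # ages 6 to 16 days
    W = np.array([0.029, 0.052, 0.079, 0.125, 0.181, 0.261,
                  0.425, 0.738, 1.130, 1.882, 2.812])

    beta, alpha = np.polyfit(X, np.log10(W), 1)   # Y = log W regressed on X
    A, B = 10 ** alpha, 10 ** beta                # back to W = A(B**X)
    rel_rate = 2.3026 * beta                      # relative growth rate
    print(beta, alpha, A, B, rel_rate)            # 0.1959, -2.689, 0.00205, 1.57, 0.451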
Other relations that may be fitted by a simple transformation of the
W or the X variable are W = α + β/X, W = α + β log X, and log W = α
+ β log X. The applicability of the proposed law should first be examined
graphically. Should the data appear to lie on a straight line in the relevant
transformed scale, proceed with the regression computation. For the
last of the above relations, logarithmic paper is available, both vertical
and horizontal rulings being in the logarithmic scale.
The transformation of a non-linear relation so that it becomes a
straight line is a simple method of fitting, but it involves some assumptions
that should be noted. For the exponential growth curve, we are
assuming that the population relation is of the form

    Y = log W = α + βX + ε,    (15.2.2)

where the residuals ε are independent, and have zero means and constant
variance. Further, if we apply the usual tests of significance to α and β,
this involves the assumption that the ε's are normally distributed. Sometimes
it seems more realistic, from our knowledge of the nature of the
process or of the measurements, to assume that residuals are normal and
have constant variance in the original W scale. This means that we
postulate a population relation

    W = (A)(B^X) + d,    (15.2.3)

where A, B now stand for population parameters, and the residuals d
are N(0, σ²).
If equation 15.2.3 holds, it may be shown that in equation 15.2.2 the
ε's will not be normal, and their variances will change as X changes.
Given model 15.2.3, the efficient method of fitting is to estimate A and B
by minimizing

    Σ(W - AB^X)²

taken over the sample values. This produces non-linear equations in A
and B that must be solved by successive approximations. A general
method of fitting such equations is given in section 15.7.
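A modern alternative to the hand iteration of section 15.7 is a non-linear least squares routine. The sketch below, our illustration only and not the book's method, uses scipy's curve_fit to minimize Σ(W - AB^X)² directly, starting from the estimates obtained on the logarithmic scale.

    import numpy as np
    from scipy.optimize import curve_fit

    X = np.arange(6, 17)
    W = np.array([0.029, 0.052, 0.079, 0.125, 0.181, 0.261,
                  0.425, 0.738, 1.130, 1.882, 2.812])

    # Minimize sum (W - A*B**X)**2, i.e., fit model 15.2.3 directly
    (A, B), _ = curve_fit(lambda x, a, b: a * b ** x, X, W,
                          p0=(0.00205, 1.57))    # start from the log fit
    print(A, B)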
EXAMPLE 15.2.1-J. W. Gowen and W. C. Price counted the number of lesions of
Aucuba mosaic virus developing after exposure to X-rays for various times (data made
available through courtesy of the investigators).

Minutes exposure       0     3    7.5    15    30    45    60
Count in hundreds    271   226    209   108    59    29    12

Plot the count as ordinate, then plot its logarithm. Derive the regression, Y = 2.432
- 0.02227X, where Y is the logarithm of the count and X is minutes exposure.
EXAMPLE 15.2.2-Repeat the fitting of the last example using natural logarithms.
Verify the fact that the rate of decrease of hundreds of lesions per minute per hundred is
(2.3026)(0.02227) = 0.05128.
EXAMPLE 15.2.3-If the meaning of relative rate isn't quite clear, try this approximate
method of computing it. The increase in weight of the chick embryo during the thirteenth
day is 1.130 - 0.738 = 0.392 gram; that is, the average rate during this period is 0.392 gm.
per day. But the average weight during the same period is (1.130 + 0.738)/2 = 0.934 gm.
The relative rate, or rate of increase of each gram, is therefore 0.392/0.934 = 0.42 gm. per
day per gm. This differs from the average obtained in the whole period from 6 to 16 days,
0.451, partly because the average weight as well as the increase in weight in the thirteenth
day suffered some sampling variation, and partly because the correct relative rate is based
on weight and increase in weight at any instant of time, not on day averages.

15.3-The second degree polynomial. Faced by non-linear regression,
one often has no knowledge of a theoretical equation to use. In many
instances the second degree polynomial,

    Ŷ = a + bX + cX²,

will be found to fit the data satisfactorily. The graph is a parabola whose
axis is vertical, but usually only small segments of such a parabola appear
in the process of fitting. Instead of rectifying the data, a third variate is
added, the square of X. This introduces the methods of multiple regression.
The calculations proceed exactly as in chapter 13, X and X² being
the two independent variates. It need only be remarked that √X, log X,
or 1/X might have been added instead of X² if the data had required it.
To illustrate the method and some of its applications, we present the
data on wheat yield and protein content (3) in table 15.3.1 and figure
15.3.1. The investigator wished to estimate the protein content for various
yields. We shall also test the significance of the departure from linearity.
The second column of the table contains the squares of the yields in
column 1. The squares are treated in all respects like a third variable in
multiple regression. The regression equation, calculated as usual,

    Ŷ = 17.703 - 0.3415X + 0.004075X²,

is plotted in the figure. At small values of yield the second degree term
with its small coefficient is scarcely noticeable, the graph falling away
almost like a straight line. Toward the right, however, the term in X² has
bent the curve to practically a horizontal direction.
TABLE 15.3.1
PERCENTAGE PROTEIN CONTENT (Y) AND YIELD (X) OF WHEAT FROM 91 PLOTS*

Yield,                           Yield,
Bushels               Percentage Bushels               Percentage
Per Acre   Square     Protein    Per Acre   Square     Protein
X          X²         Y          X          X²         Y
43         1,849      10.1       19           361      13.9
42         1,764      10.8       19           361      11.2
39         1,521      10.8       19           361      13.8
39         1,521      10.2       18           324      10.6
38         1,444      10.3       18           324      13.0
38         1,444       9.8       18           324      13.4
37         1,369      10.1       18           324      13.1
37         1,369      10.4       18           324      13.0
36         1,296      10.3       17           289      13.4
36         1,296      11.0       17           289      13.5
36         1,296      12.2       17           289      10.8
35         1,225      10.9       17           289      12.5
35         1,225      12.1       17           289      12.1
34         1,156      10.4       17           289      13.0
34         1,156      10.8       17           289      12.8
34         1,156      10.9       16           256      14.3
34         1,156      12.6       16           256      13.6
33         1,089      10.2       16           256      12.3
32         1,024      11.8       16           256      13.0
32         1,024      10.3       16           256      13.7
32         1,024      10.4       15           225      13.3
31           961      12.3       15           225      12.9
31           961       9.6       14           196      14.2
31           961      11.9       14           196      13.2
31           961      11.4       12           144      15.5
30           900       9.8       12           144      13.1
30           900      10.7       12           144      16.3
29           841      10.3       11           121      13.7
28           784       9.8       11           121      18.3
27           729      13.1       11           121      14.1
26           676      11.0       11           121      13.8
26           676      11.0       11           121      14.8
25           625      12.8       10           100      15.6
25           625      11.8       10           100      14.6
24           576       9.9        9            81      14.0
24           576      11.6        9            81      16.2
24           576      11.8        9            81      15.8
24           576      12.3        8            64      15.5
22           484      11.3        8            64      14.2
22           484      10.4        8            64      13.5
22           484      12.6        7            49      13.8
21           441      13.0        7            49      14.2
21           441      14.7        6            36      16.2
21           441      11.5        5            25      16.2
21           441      11.0
20           400      12.8
20           400      13.0

* Read from published graph. This accounts for the slight discrepancy between the
correlation we got and that reported by the author.
FIG. 15.3.1-Regression of protein content on yield in wheat, 91 plots.
Ŷ = 17.703 - 0.3415X + 0.004075X²

The analysis of variance and test of significance are shown in table
15.3.2. The fitted regression on both X and X² gives a sum of squares of
deviations, 97.53, with 88 df. The sum of squares of deviations from a
linear regression, Σy² - (Σxy)²/Σx², is 110.48, with 89 df. The reduction
in sum of squares, tested against the mean square remaining after
curvilinear regression, proves to be significant. The hypothesis of linear
regression is abandoned; there is a significant curvilinearity in the regression.
In table 15.3.1, many of the values of X (e.g., X = 39) have two or
more values of Y. With such data, the sum of squares of deviations from
the curved regression (88 df.) can be divided into two parts so as to provide
a more critical test of the fit of the quadratic. The technique is described
in the following section. In the present example this technique supports
the quadratic fit.
TABLE 15.3.2
TEST OF SIGNIFICANCE OF DEPARTURE FROM LINEAR REGRESSION

Source of Variation                 Degrees of Freedom   Sum of Squares   Mean Square
Deviations from linear regression          89               110.48
Deviations from curved regression          88                97.53           1.11
Reduction in sum of squares                 1                12.95          12.95**

F = 12.95/1.11 = 11.7**
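With the 91 pairs of table 15.3.1 stored in arrays X and Y, the whole of table 15.3.2 can be reproduced by the short function below; it is an illustrative sketch of ours, not part of the original text.

    import numpy as np

    def departure_from_linearity(X, Y):
        # Deviations S.S. from the linear and from the quadratic fit
        dev_lin = np.sum((Y - np.polyval(np.polyfit(X, Y, 1), X)) ** 2)
        dev_quad = np.sum((Y - np.polyval(np.polyfit(X, Y, 2), X)) ** 2)
        n = len(X)
        F = (dev_lin - dev_quad) / (dev_quad / (n - 3))   # 1 and n-3 df.
        return dev_lin, dev_quad, F

For the wheat data this should return 110.48, 97.53, and F = 11.7.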
The regression equation is useful also for estimating and interpolating.
Confidence statements and tests of hypotheses are made as in
chapter 13.
As always in regression, either linear or curved, one should be wary of
extrapolation. The data may be incompetent to furnish evidence of trend
beyond their own range. Looking at figure 15.2.1, one might be tempted
by the excellent fit to assume the same growth rate before the sixth day
and after the sixteenth. The fact is, however, that there were rather sharp
breaks in the rate of growth at both these days. To be useful, extrapolation
requires extensive knowledge and keen thinking.
EXAMPLE 15.3.1-The test of significance of departure from linear regression in
table 15.3.2 may also be used to examine whether a rectifying transformation, of the type
illustrated in section 15.2, has produced a straight line relationship. Apply this test to the
chick embryo data in table 15.2.1 by fitting a parabola in X to the log weights Y. Verify that
the parabola is

    Ŷ = -2.783162 + 0.214503X - 0.000846X²,

and that the test works out as follows:

                                        Degrees of   Sum of     Mean
                                        Freedom      Squares    Square
Deviations from linear regression           9        0.007094
Deviations from quadratic regression        8        0.006480   0.000810
Curvilinearity of regression                1        0.000614   0.000614

F = 0.76, with 1 and 8 df. When the X's are equally spaced, as in this example, a quicker
way of computing the test is given in section 15.6.

15.4-Data having several Y's at each X value. If several values of Y
have been measured at each value of X, the adequacy of a fitted polynomial
can be tested more thoroughly. Suppose that for each Xi a group of n
values of Y is available. To illustrate for a linear model, if Yij denotes
the jth member of the ith group, the linear model is

    Yij = α + βXi + εij,    (15.4.1)

where the εij follow N(0, σ²). It follows that the group means Ȳi are
related to the Xi by the linear relation

    Ȳi = α + βXi + ε̄i

(1) By fitting a quadratic regression of the Ȳi on Xi, the test for
curvature in table 15.3.2 can be applied as before. Since it is important
in what follows, note that the residuals ε̄i have variance σ²/n, since each
ε̄i is the mean of n independent residuals from relation 15.4.1.
(2) The new feature is that the deviations of the Yij from their group
means Ȳi supply an independent estimate of σ². The pooled estimate is

    s² = Σi Σj (Yij - Ȳi)²/k(n - 1)
with k(n - 1) df. If we multiply the mean squares in analysis (1) by n,
in order to make parts (1) and (2) comparable, we have the analysis of
variance in table 15.4.1.

TABLE 15.4.1
ANALYSIS OF VARIANCE FOR TESTS OF LINEAR REGRESSION

Source of Variation                     Degrees of Freedom   Mean Square
Linear regression of Ȳi on Xi                   1
Quadratic regression of Ȳi on Xi                1               s2²
Deviations of Ȳi from quadratic               k - 3             s3²
Pooled within groups                         k(n - 1)           s²
Total                                         kn - 1
The following results are basic to the interpretation of this table.
If the population regression is linear, the mean square s2² is an unbiased
estimate of σ²; if the population regression is curved, s2² tends to become
large. If the population regression is either linear or quadratic, s3² is an
unbiased estimate of σ². When will s3² tend to become much larger than
σ²? Either if the population regression is non-linear but is not adequately
represented by a quadratic, for instance, it might be a third degree curve,
or one with a periodic feature; or if there are sources of variation that
are constant within any group but vary from group to group. This could
happen if the measurements in different groups were taken at different
times or from different hospitals or bushes. The pooled within-group
variance s² is an unbiased estimate of σ² no matter what the shape of the
relation between Ȳi and Xi.
Consequently, first compute the F-ratio s3²/s², with (k - 3) and
k(n - 1) df. If this is significant, look at the plot of Ȳ against X to see
whether a higher degree polynomial or a different type of mathematical
relationship is indicated. Examination of the deviations of the Ȳi from
the fitted quadratic for signs of a systematic trend is also helpful. If no
systematic trend is found, the most likely explanation is that some extra
between-group source of variation has entered the data.
If s3²/s² is clearly non-significant, form the pooled mean square of
s3² and s². Call this s4², with (kn - 3) df. Then test F = s2²/s4², with 1
and (kn - 3) df., as a test of curvature of the relation.
The procedure is illustrated by the data in table 15.4.2, made available
through the courtesy of B. J. Vos and W. T. Dawson. The point at
issue is whether there is a linear relation between the lethal dose of
ouabain, injected into cats, and the rate of injection. Four rates were
used, each double the preceding.
First, the total sum of squares of the lethal doses, 21,744, is analyzed
into "between rates," 16,093, and "within rate groups," 5,651. Note
that the number of cats differed slightly from group to group.
TABLE 15.4.2
LETHAL DOSE (MINUS 50 UNITS) OF U.S. STANDARD OUABAIN, BY SLOW
INTRAVENOUS INJECTION IN CAT UNTIL THE HEART STOPS

          Xi = Rate of Injection in (mg./kg./min.)/1,045.75       Total
              1        2        4        8
              5        3       34       51
              9        6       34       56
             11       22       38       62
             13       27       40       63
             14       27       46       70
             16       28       58       73
             17       28       60       76
             20       37       60       89
             22       40       65       92
             28       42
             31       50
             31
ΣYij = Yi·  217      310      435      632      1,594
ni           12       11        9        9         41
Ȳi         18.1     28.2     48.3     70.2
ΣYij²     4,727   10,788   22,261   45,940     83,716

The inequality in the ni must be taken into account in setting up the
equations for the regression of Ȳi on Xi and Xi². Compute:

    ΣniXi  = 12(1) + 11(2) + 9(4) + 9(8) = 142
    ΣniXi² = 12(1) + 11(4) + 9(16) + 9(64) = 776
    ΣniXi³ = 12(1) + 11(8) + 9(64) + 9(512) = 5,284

and similarly ΣniXi⁴ = 39,356. We need also

    ΣniXiȲi = ΣXiYi· = 1(217) + 2(310) + 4(435) + 8(632) = 7,633

and ΣXi²Yi· = 48,865.
Each quantity is then corrected for the mean in the usual way. For
example,

    Σni(Xi² - X̄²)² = ΣniXi⁴ - (ΣniXi²)²/Σni = 39,356 - (776)²/41 = 24,668.8
    Σni(Xi - X̄)(Ȳi - Ȳ) = ΣXiYi· - (ΣniXi)(ΣYi·)/Σni
                        = 7,633 - (142)(1,594)/41 = 2,112.3

To complete the quantities needed for the normal equations, you may
verify that

    Σni(Xi - X̄)² = 284.2,    Σni(Xi - X̄)(Xi² - X̄²) = 2,596.4,
    Σni(Xi² - X̄²)(Ȳi - Ȳ) = 18,695.6
The normal equations for b1 and b2 are:

    284.2b1 + 2,596.4b2 = 2,112.3
    2,596.4b1 + 24,668.8b2 = 18,695.6

In the usual way, the reduction in sum of squares of Y due to the regression
on b1 and b2 is found to be 16,082, while for the linear regression, the reduction
is 15,700. The final analysis of variance appears in table 15.4.3.

TABLE 15.4.3
TESTS OF DEVIATIONS FROM LINEAR AND QUADRATIC REGRESSION

Source of Variation          Degrees of Freedom   Sum of Squares   Mean Square
Linear regression on X               1                15,700         15,700
Quadratic regression on X            1                   382            382
Deviations from quadratic            1                    11             11
Pooled within groups                37                 5,651            153

Total                               40                21,744

The mean square 11 for the deviations from the quadratic is much
lower than the within-groups mean square, though not unusually so for
only 1 df. The pooled average of these two mean squares is 149, with 38
df. For the test of curvature, F = 382/149 = 2.56, with 1 and 38 df.,
lying between the 25% and the 10% level. We conclude that the results are
consistent with a linear relation in the population.
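The weighted computations of this section can be verified with the following Python sketch (ours), which rebuilds the corrected sums of squares and products from the group sizes, rates, and totals of table 15.4.2 and solves the normal equations.

    import numpy as np

    n = np.array([12, 11, 9, 9])                 # cats per rate group
    X = np.array([1.0, 2.0, 4.0, 8.0])           # rates of injection
    T = np.array([217.0, 310.0, 435.0, 632.0])   # group totals of Y

    N, G = n.sum(), T.sum()
    Sxx = (n * X**2).sum() - (n * X).sum() ** 2 / N               # 284.2
    Sxq = (n * X**3).sum() - (n * X).sum() * (n * X**2).sum() / N # 2,596.4
    Sqq = (n * X**4).sum() - (n * X**2).sum() ** 2 / N            # 24,668.8
    Sxy = (X * T).sum() - (n * X).sum() * G / N                   # 2,112.3
    Sqy = (X**2 * T).sum() - (n * X**2).sum() * G / N             # 18,695.6

    b = np.linalg.solve([[Sxx, Sxq], [Sxq, Sqq]], [Sxy, Sqy])
    red_quad = b @ np.array([Sxy, Sqy])          # 16,082
    red_lin = Sxy ** 2 / Sxx                     # 15,700
    print(b, red_lin, red_quad)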
EXAMPLE 15.4.1-The following data, selected from Swanson and Smith (4) to provide
an example with equal n, show the total nitrogen content Y (grams per 100 cc. of
plasma) of rat blood plasma at nine ages X (days).

Age of Rat    25     37     50     60     80    100    130    180    360
            0.83   0.98   1.07   1.09   0.97   1.14   1.22   1.20   1.16
            0.77   0.84   1.01   1.03   1.08   1.04   1.07   1.19   1.29
            0.88   0.99   1.06   1.06   1.16   1.00   1.09   1.33   1.25
            0.94   0.87   0.96   1.08   1.11   1.08   1.15   1.21   1.43
            0.89   0.90   0.88   0.94   1.03   0.89   1.14   1.20   1.20
            0.83   0.82   1.01   1.01   1.17   1.03   1.19   1.07   1.06
Total       5.14   5.40   5.99   6.21   6.52   6.18   6.86   7.20   7.39

A plot of the Y totals against X shows that (i) the Y values for X = 100 are abnormally
low and require special investigation, (ii) the relation is clearly curved. Omit the data for
X = 100 and test the deviations from a parabolic regression against the within-groups
mean square. Ans. F = 1.4.

15.5-Test of departure from linear regression in covariance analysis.
As in any other correlation and regression work, it is necessary in covariance
to be assured that the regression is linear. It will be recalled that
in the standard types of layout, one-way classifications, two-way classifications
(randomized blocks), and Latin squares, the regression of Y on X
is computed from the Residual or Error line in the analysis of variance. A
graphical method of checking on linearity, which is often sufficient, is to
plot the residuals of Y from the analysis of variance model against the corresponding
residuals of X, looking for signs of curvature.
The numerical method of checking is to add a term in X² to the
model. Writing X1 = X, X2 = X², work out the residual or error sums
of squares of Y, X1, and X2, and the error sums of products of X1X2, YX1,
and YX2, as was illustrated in section 14.8 for a one-way classification.
From these data, compute the test of significance of departure from linear
regression as in table 15.3.2.
If the regression is found to be curved, the treatment means are
adjusted for the parabolic regression. The calculations follow the method
given in section 14.8.

15.6-Orthogonal polynomials. If the values of X are equally spaced,
the fitting of the polynomial

    Ŷ = b0 + b1X + b2X² + b3X³ + ...

is speeded up by the use of tables of orthogonal polynomials. The essential
step is to replace X^i (i = 1, 2, 3, ...) by a polynomial of degree
i in X, which we will call Xi. The coefficients in these polynomials are
chosen so that

    ΣXi = 0 :  ΣXiXj = 0  (i ≠ j)

where the sums are over the n values of X in the sample. The different
polynomials are orthogonal to one another. Explicit formulas for these
polynomials are given later in this section.
Instead of calculating the polynomial regression of Y on X in the
form above, we calculate it in the form:

    Ŷ = B0 + B1X1 + B2X2 + B3X3 + ...

which may be shown to give the same fitted polynomial. On account of
the orthogonality of the Xi, we have the results:

    Bi = ΣXiY/ΣXi²    (i = 1, 2, 3, ...)

The values of the Xi and of ΣXi² are provided in the tables, making the
computation of Bi simple. Further, the reductions in Σ(Y - Ȳ)² due
to the successive terms in the polynomial are given by:

    (ΣX1Y)²/(ΣX1²);  (ΣX2Y)²/(ΣX2²);  (ΣX3Y)²/(ΣX3²);  and so on.

Thus it is easy to check whether the addition of a higher power in X to the
Thus it is easy to check whether the addition or a higher rower ,n X to the
polynomial produces a marked reduction in the residual sum of squares.
As a time-saver, the orthogonal polynomials are most effective when the
calculations are done on a desk calculator. With an electronic computer,
the routine programs for fitting a multiple regression can be used to fit
the equation in its original form. Most programs also provide the reductions
in sum of squares due to each successive power.
Tables of the first five polynomials are given in (5) up to n = 75, and
of the first six in (6) up to n = 52. Table A 17 (p. 572) shows these polynomials
up to n = 12. For illustration, a polynomial will be fitted to the
chick embryo data, though, as we saw in section 15.2, these data are more
aptly fitted by an exponential growth curve.
Table 15.6.1 shows the weights (Y) and the values of X1, X2, X3, X4,
X5 for n = 11, read from table A 17. To save space, most tables give the
Xi values only for the upper half of the values of X. In our sample these
are the values from X = 11 to X = 16. The method of writing down the Xi
for the lower half of the sample is seen in table 15.6.1. For the terms of odd
degree, X1, X3, and X5, the signs are changed in the lower half; for terms
of even degree, X2 and X4, the signs remain the same.

TABLE 15.6.1
FITTING A FOURTH DEGREE POLYNOMIAL TO CHICK EMBRYO WEIGHTS

Age      Dry Wt.
X        Y
(days)   (grams)      X1      X2      X3      X4      X5        Ŷ
  6      0.029        -5      15     -30       6      -3      0.026
  7      0.052        -4       6       6      -6       6      0.056
  8      0.079        -3      -1      22      -6       1      0.086
  9      0.125        -2      -6      23      -1      -4      0.119
 10      0.181        -1      -9      14       4      -4      0.171
 11      0.261         0     -10       0       6       0      0.265
 12      0.425         1      -9     -14       4       4      0.434
 13      0.738         2      -6     -23      -1       4      0.718
 14      1.130         3      -1     -22      -6      -1      1.169
 15      1.882         4       6      -6      -6      -6      1.847
 16      2.812         5      15      30       6       3      2.822

ΣXi²                 110     858   4,290     286     156
λi                     1       1     5/6    1/12    1/40
ΣXiY     7.714    25.858  39.768  31.873   1.315  -0.254
Bi    0.701273  0.235073 0.046349 0.007430 0.004598

We shall suppose that the objective is to find the polynomial of lowest
degree that seems an adequate fit. Consequently, the reduction in sum
of squares will be tested as each successive term is added. At each stage,
calculate

    ΣXiY,    Bi = ΣXiY/ΣXi²

(shown under table 15.6.1), and the reduction in sum of squares,
(ΣXiY)²/ΣXi², entered in table 15.6.2. For the linear term, the F-value is
(6.078511)/(0.232177) = 26.2. The succeeding F-values for the quadratic
and cubic terms are even larger, 59.9 and 173.4. For the X4 (quartic)
term, F is 10.3, significant at the 5% but not at the 1% level. The 5th
degree term, however, has an F less than 1. As a precautionary move,
we should check the 6th degree term also, but for this illustration we will
stop and conclude that a 4th degree polynomial is a satisfactory fit.

TABLE 15.6.2
REDUCTIONS IN SUM OF SQUARES DUE TO SUCCESSIVE TERMS

                                Degrees of   Sum of       Mean
Source                          Freedom      Squares      Square       F
Total, Σ(Y - Ȳ)²                   10        8.168108
Reduction to linear                 1        6.078511
Deviations from linear              9        2.089597     0.232177    26.2
Reduction to quadratic              1        1.843233
Deviations from quadratic           8        0.246364     0.030796    59.9
Reduction to cubic                  1        0.236803
Deviations from cubic               7        0.009561     0.001366   173.4
Reduction to quartic                1        0.006046
Deviations from quartic             6        0.003515     0.000586    10.3
Reduction to quintic                1        0.000414
Deviations from quintic             5        0.003101     0.000620     0.7
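Because the tabled Xi are orthogonal, each Bi and each reduction in sum of squares is a one-line computation, as the following sketch (ours, using the n = 11 coefficients of table 15.6.1) shows; it reproduces the reductions entered in table 15.6.2.

    import numpy as np

    Y = np.array([0.029, 0.052, 0.079, 0.125, 0.181, 0.261,
                  0.425, 0.738, 1.130, 1.882, 2.812])
    Xi = np.array([                                  # X1 ... X5 for n = 11
        [-5, -4, -3, -2, -1, 0, 1, 2, 3, 4, 5],
        [15, 6, -1, -6, -9, -10, -9, -6, -1, 6, 15],
        [-30, 6, 22, 23, 14, 0, -14, -23, -22, -6, 30],
        [6, -6, -6, -1, 4, 6, 4, -1, -6, -6, 6],
        [-3, 6, 1, -4, -4, 0, 4, 4, -1, -6, 3]])

    SXY = Xi @ Y                        # sums of products with Y
    SXX = (Xi ** 2).sum(axis=1)         # 110, 858, 4,290, 286, 156
    B = SXY / SXX                       # B1 ... B5
    reductions = SXY ** 2 / SXX         # 6.0785, 1.8432, 0.2368, ...
    print(B, reductions)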

For graphing the polynomial, the estimated values Ŷ for each value
of X are easily computed from table 15.6.1:

    Ŷ = B0 + B1X1 + B2X2 + B3X3 + B4X4

Note that B0 = Ȳ = 0.701273. At X = 6,

    Ŷ = 0.701273 - 5(0.235073) + 15(0.046349) - 30(0.007430)
        + 6(0.004598) = 0.026,

and so on. Figure 15.6.1 shows the fit by a straight line, obviously poor;
the 2nd degree polynomial, considerably better; and the 4th degree polynomial.
To express the polynomial as an equation in the original X variables
FIG. 15.6.1-Graphs of polynomials of first, second, and fourth degree fitted to
chick embryo data of table 15.6.1.

To express the polynomial as an equation in the original X variable
is more tedious. For this, we need formulas giving Xi in terms of X
and its powers. In the standard method, developed by Fisher, by which
the polynomial tables were computed, he started with a slightly different
set of polynomials ξi, which satisfy recurrence relations and begin

    ξ0 = 1 :  ξ1 = X - X̄

These polynomials are orthogonal, but when their values are tabulated
for each member of the sample, these values are not always whole numbers.
Consequently, Fisher found by inspection the multiplier λi which
would make Xi = λiξi the smallest set of integers. This makes calculations
easier for the user. The values of the λi are shown under table 15.6.1,
and under each polynomial in table A 17 and in references (5) and (6).
Now to the calculations in our example. The first step is to multiply
each Bi by the corresponding λi. This gives

    B1' = 0.235073;  B2' = 0.046349;  B3' = 0.006192;  B4' = 0.0003832

These are the coefficients for the regression of Ŷ on the ξi, so that

    Ŷ = Ȳ + B1'ξ1 + B2'ξ2 + B3'ξ3 + B4'ξ4    (15.6.1)
The general equations connecting the ξi with x = X - X̄ are as follows:

    ξ1 = X - X̄ = x
    ξ2 = x² - (n² - 1)/12
    ξ3 = x³ - {(3n² - 7)/20}x
    ξ4 = x⁴ - {(3n² - 13)/14}x² + 3(n² - 1)(n² - 9)/560
    ξ5 = x⁵ - {5(n² - 7)/18}x³ + {(15n⁴ - 230n² + 407)/1,008}x

By substitution into formula (15.6.1), Ŷ is expressed as a polynomial in
x = X - X̄. If it is satisfactory to stop at this stage, there are two advantages:
further calculation is avoided, and there is less loss of decimal
accuracy. However, to complete the example, we note that n = 11 and
X̄ = 11. Hence, in terms of X,

    ξ1 = X - 11
    ξ2 = (X - 11)² - 10 = X² - 22X + 111
    ξ3 = (X - 11)³ - 17.8(X - 11) = X³ - 33X² + 345.2X - 1,135.2
    ξ4 = (X - 11)⁴ - 25(X - 11)² + 72
       = X⁴ - 44X³ + 701X² - 4,774X + 11,688

Hence, finally, using formula (15.6.1),

    Ŷ = 0.701273 + 0.235073(X - 11) + 0.046349(X² - 22X + 111)
        + 0.006192(X³ - 33X² + 345.2X - 1,135.2)
        + 0.0003832(X⁴ - 44X³ + 701X² - 4,774X + 11,688)
      = 0.7099 - 0.47652X + 0.110636X² - 0.010669X³ + 0.0003832X⁴

In table 15.6.1 there is a further shortcut which we did not use. In
computing ΣXiY, the Y's at the two ends of the sample, say Yn and Y1,
are multiplied by 5 and -5, and Yn-1 and Y2 are multiplied by 4 and -4.
If we form the differences, Yn - Y1, Yn-1 - Y2, and so on, only the set of
multipliers 5, 4, 3, 2, 1 need be used. This device works for any ΣXiY in
which i is odd. With i even, we form the sums Yn + Y1, Yn-1 + Y2, and
so on. The method is worked out for these data in example 15.6.1.
EXAMPLE 15.6.1-In table 15.6.1, form the sums and differences of pairs of values of
Y, working in from the outside. Verify that these give the results shown below, and that
the ΣXiY values are in agreement with those given in table 15.6.1.

Sums      X2     X4        Diffs.    X1     X3
0.261    -10      6        0.261      0       0
0.606     -9      4        0.244      1     -14
0.863     -6     -1        0.613      2     -23
1.209     -1     -6        1.051      3     -22
1.934      6     -6        1.830      4      -6
2.841     15      6        2.783      5      30
EXAMPLE 15.6.2-Here are six points on the cubic Y = 9X - 6X² + X³: (0, 0),
(1, 4), (2, 2), (3, 0), (4, 4), (5, 20). Carry through the computations for fitting a linear,
quadratic, and cubic regression. Verify that there is no residual sum of squares after fitting
the cubic, and that the polynomial values at that stage are exactly the Y's.
EXAMPLE 15.6.3-The method of constructing orthogonal polynomials can be illustrated
by finding X1 and X2 when n = 6.

(1)     (2)             (3)          (4)           (5)
X       ξ1 = X - X̄     X1 = 2ξ1     ξ2            X2 = (3/2)ξ2
1       -5/2            -5           10/3           5
2       -3/2            -3           -2/3          -1
3       -1/2            -1           -8/3          -4
4        1/2             1           -8/3          -4
5        3/2             3           -2/3          -1
6        5/2             5           10/3           5

Start with X = 1, 2, 3, 4, 5, 6, with X̄ = 7/2. Verify that the values of ξ1 = x = X - X̄ are
as shown in column (2). Since the ξ1 are not whole numbers, we take λ1 = 2, giving X1 = 2ξ1,
column (3). To find ξ2, write

    ξ2 = ξ1² - bξ1 - c

This is a quadratic in X. We want Σξ2 = 0. This gives

    Σξ1² - bΣξ1 - nc = 0,  i.e.,  35/2 - 6c = 0,  c = 35/12

Further, we want Σξ1ξ2 = 0, giving

    Σξ1³ - bΣξ1² - cΣξ1 = 0

Since Σξ1³ = 0 and Σξ1 = 0, this gives b = 0. Hence, ξ2 = ξ1² - 35/12. Verify the ξ2 values
in column (4). To convert these to integers, multiply by λ2 = 3/2.

15.7-A general method of fitting non-linear regressions. Suppose
that the population relation between Y and X is of the form

    Yi = f(α, β, γ, Xi) + εi    (i = 1, 2, ..., n)

where f is a regression function containing Xi and the parameters α, β, γ.
(There may be more than one X-variable.) If the residuals εi have zero
means and constant variance, the least squares method of fitting the regression
is to estimate the values of α, β, γ by minimizing

    Σi [Yi - f(α, β, γ, Xi)]²

This section presents a general method of carrying out the calculations.
The details require a knowledge of partial differentiation, but the approach
is a simple one.
The difficulty arises not because of non-linearity in Xi but because of
non-linearity in one or more of the parameters α, β, γ. The parabola
(α + βX + γX²) is fitted by the ordinary methods of multiple linear regression,
because it is linear in α, β, and γ. Consider the asymptotic regression
α + β(γ^X). If the value of γ were known in advance, we could write
X1 = γ^X. The least squares estimates of α and β would then be given by
fitting an ordinary linear regression of Y on X1. When γ must be estimated
from the data, however, the methods of linear regression cannot be applied.
The first step in the general method is to obtain good initial estimates
a1, b1, c1 of the final least squares estimates α̂, β̂, γ̂. For the common
types of non-linear functions, various techniques for doing this have been
developed, sometimes graphical, sometimes by special studies of this problem.
Next, we use Taylor's theorem. This states that if f(α, β, γ, X)
is continuous in α, β, and γ, and if (α - a1), (β - b1), and (γ - c1) are
small,

    f(α, β, γ, Xi) ≐ f(a1, b1, c1, Xi) + (α - a1)fα + (β - b1)fβ + (γ - c1)fγ

The symbol ≐ means "is approximately equal to." The symbols fα, fβ, fγ
denote the partial derivatives of f with respect to α, β, and γ, respectively,
evaluated at the point a1, b1, c1. For example, in the asymptotic regression,

    f(α, β, γ, X) = α + β(γ^X),

we have

    fα = 1;  fβ = c1^X;  fγ = b1Xc1^(X-1)

Since a1, b1, and c1 are known, the values of f, fα, fβ, and fγ can be
calculated for each member of the sample, where we have written f for
f(a1, b1, c1, Xi). From Taylor's theorem, the original regression relation

    Yi = f(α, β, γ, Xi) + εi

may therefore be written, approximately,

    Yi ≐ f + (α - a1)fα + (β - b1)fβ + (γ - c1)fγ + εi    (15.7.1)
Now write

    Yres = Y - f;  X1 = fα;  X2 = fβ;  X3 = fγ

From equation 15.7.1,

    Yres ≐ (α - a1)X1 + (β - b1)X2 + (γ - c1)X3 + ε    (15.7.2)

The variate Yres is the residual of Y from the first approximation. The
relation (15.7.2) represents an ordinary linear regression of Yres on the
variates X1, X2, X3, the regression coefficients being (α - a1), (β - b1),
and (γ - c1). If the relation (15.7.2) held exactly instead of approximately,
the computation of the sample regression of Yres on X1, X2, X3 would
give the regression coefficients (α̂ - a1), (β̂ - b1), and (γ̂ - c1), from
which the correct least squares estimates α̂, β̂, and γ̂ would be obtained
at once.
Since relation (15.7.2) is approximate, the fitting of this regression
yields second approximations a2, b2, and c2 to α̂, β̂, γ̂, respectively. We
then recalculate f, fα, fβ, and fγ at the point a2, b2, c2, finding a new Yres
and new variates X1, X2, and X3. The sample regression of this Yres on
X1, X2, and X3 gives the regression coefficients (a3 - a2), (b3 - b2), and
(c3 - c2), from which third approximations a3, b3, c3 to α̂, β̂, γ̂ are found,
and so on.
If the process is effective, the sum of squares of the residuals. 1: Y •2 •
should decrease steadily at each stage. the decreases becoming small
as the least-squares solution is approached. In practice. the calculations
are stopped when the decrease in 1: Y, ..2 and the changes in a. h. and care
considered small enough to be negligible. The mean square residual is
5' = 1: Y,a'/(n - k).
_---' -

wbere k is tbe number of parameters that have been estimated (ii(our


example, k = 3). Witb non-linear regression. 52 is not an unbiased esti-
mate of ,,2.
though it tends to become unbiased as n becomes large.
Approximate standard errors of the estimates <2. fl. 9 are obtained in
the usual way from the Gauss multipliers in Ihe tinal multiple regression
that was computed. Tbus,
s.e. (<2) '" s.jc ll ; s.e. <p) '" s.je,,;
S.e. (9) '" • .je"
Approximate confidence limits for" are given by (Ii ± (s.jc ll ) where (
has(n - 3)df
If several stages in the approximation are required. the calculations
become tedious on a desk machine. since a mUltiple regression must be
worked out at each stage. With the commonest non-linear relations.
however. the computations lend themsdves readily to programming on an
electronic computer. Investigators with access to a computing center
are advised to find out whether a program is available or can be con-
,tructed. If the work must be done on a desk machine, the importance of
a good first approximation is obvious.
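As a sketch of how the whole cycle might be programmed (our own illustration, not a routine from this book), the following Python code applies the successive linear regressions to the asymptotic curve f = α + β(ρ^X) fitted in the next section; its partial derivatives are f_α = 1, f_β = ρ^X, f_ρ = βXρ^(X−1).

    import numpy as np

    def fit_successive(X, Y, a, b, r, n_iter=20, tol=1e-10):
        """Fit Y = a + b*r**X by the successive linear regressions of
        section 15.7, starting from first approximations (a, b, r)."""
        X, Y = np.asarray(X, float), np.asarray(Y, float)
        for _ in range(n_iter):
            f = a + b * r**X                       # current fitted values
            # columns of partial derivatives at the current point
            Z = np.column_stack([np.ones_like(X),      # df/da
                                 r**X,                 # df/db
                                 b * X * r**(X - 1)])  # df/dr
            # linear regression of the residuals Y - f on the derivatives
            delta, *_ = np.linalg.lstsq(Z, Y - f, rcond=None)
            a, b, r = a + delta[0], b + delta[1], r + delta[2]
            if np.abs(delta).max() < tol:
                break
        return a, b, r, ((Y - a - b * r**X)**2).sum()

    # thermometer readings of table 15.8.1, first guess r = 0.55
    print(fit_successive([0, 1, 2, 3, 4, 5],
                         [57.5, 45.7, 38.7, 35.3, 33.1, 32.2],
                         a=30.0, b=27.0, r=0.55))
    # converges to about (30.7, 26.8, 0.552) with residual S.S. 0.0972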

15.8-Fitting an asymptotic regression. The population regression function will be written (using the symbol ρ in place of γ)

f(α, β, ρ, X) = α + β(ρ^X)   (15.8.1)

If 0 < ρ < 1 and β is negative, this curve has the form shown in figure 15.1.1(e), p. 448, rising from the value (α + β) at X = 0 to the asymptote α as X becomes large. If 0 < ρ < 1 and β is positive, the curve declines from the value (α + β) at X = 0 to an asymptote α when X is large.

Since the function is non-linear only as regards the parameter ρ, the method of successive approximation described in the preceding section simplifies a little. Let r₁ be a first approximation to ρ. By Taylor's theorem,

α + β(ρ^X) ≐ α + β(r₁^X) + β(ρ − r₁)(Xr₁^(X−1))

Write X₀ = 1, X₁ = r₁^X, X₂ = Xr₁^(X−1). If we fit the sample regression

Ŷ = aX₀ + bX₁ + cX₂   (15.8.2)

it follows that a, b are second approximations to the least-squares estimates α̂, β̂ of α and β in (15.8.1), while

c = b(r₂ − r₁),

so that

r₂ = r₁ + c/b   (15.8.3)

is the second approximation to ρ.

The commonest case is that in which the values of X change by unity (e.g., X = 0, 1, 2, ... or X = 5, 6, 7, ...) or can be coded to do so. Denote the corresponding Y values by Y₀, Y₁, Y₂, ..., Yₙ₋₁. Note that the value of X corresponding to Y₀ need not be 0. For n = 4, 5, 6, and 7, good first approximations to ρ, due to Patterson (7), are as follows:

n = 4:  r₁ = (4Y₃ + Y₂ − 5Y₁)/(4Y₂ + Y₁ − 5Y₀)
n = 5:  r₁ = (4Y₄ + 3Y₃ − Y₂ − 6Y₁)/(4Y₃ + 3Y₂ − Y₁ − 6Y₀)
n = 6:  r₁ = (4Y₅ + 4Y₄ + 2Y₃ − 3Y₂ − 7Y₁)/(4Y₄ + 4Y₃ + 2Y₂ − 3Y₁ − 7Y₀)
n = 7:  r₁ = (Y₆ + Y₅ + Y₄ − Y₂ − 2Y₁)/(Y₅ + Y₄ + Y₃ − Y₁ − 2Y₀)

In a later paper (8), Patterson gives improved first approximations for sample sizes from n = 4 to n = 12. The value of r₁, obtained by solving a quadratic equation, is remarkably good in our experience.

In an illustration given by Stevens (9), table 15.8.1 shows six consecutive readings of a thermometer at half-minute intervals after lowering it into a refrigerated hold.

From Patterson's formula (above) for n = 6, we find r₁ = −104.2/−188.6 = 0.552. Taking r₁ = 0.55, compute the sample values of X₁ and X₂ and insert them in table 15.8.1. The matrix of sums of squares and
TABLE 15.8.1
DATA FOR FITTING AN ASYMPTOTIC REGRESSION

     X           Y
   Time        Temp.       X₁ =         X₂ =            Ŷ₂       Y_res =
(1/2 mins.)     °F.      (0.55^X)    X(0.55^(X−1))               Y − Ŷ₂

     0          57.5     1.00000        0              57.544    −0.044
     1          45.7     0.55000      1.00000          45.525    +0.175
     2          38.7     0.30250      1.10000          38.892    −0.193
     3          35.3     0.16638      0.90750          35.231    +0.069
     4          33.1     0.09151      0.66550          33.211    −0.111
     5          32.2     0.05033      0.45753          32.096    +0.104

  Total        242.5     2.16072      4.13053                    +0.001

products of the three Xᵢ variates is as follows:

ΣX₀² = 6          ΣX₀X₁ = 2.16072    ΣX₀X₂ = 4.13053
ΣX₀X₁ = 2.16072   ΣX₁² = 1.43260     ΣX₁X₂ = 1.11767
ΣX₀X₂ = 4.13053   ΣX₁X₂ = 1.11767    ΣX₂² = 3.68578

(Alternatively, we could use the method of sections 13.2-13.4 (p. 381), obtaining a 2 x 2 matrix of the Σxᵢxⱼ, but in the end little time is saved by this.)

The inverse matrix of Gauss multipliers is computed. Each row of this matrix is multiplied in turn by the values of ΣXᵢY (placed in the right-hand column).

           Inverse matrix                           ΣXᵢY

c₁₁ = 1.62101    c₁₂ = −1.34608   c₁₃ = −1.40843    242.5
c₂₁ = −1.34608   c₂₂ = 2.03212    c₂₃ = 0.89229     104.86457
c₃₁ = −1.40843   c₃₂ = 0.89229    c₃₃ = 1.57912     157.06527

These multiplications give

a = 30.723;   b = 26.821;   c = b(r₂ − r₁) = 0.05024   (15.8.4)

Hence,

r₂ = r₁ + c/b = 0.55 + 0.05024/26.821 = 0.55187

The second approximation to the curve is

Ŷ₂ = 30.723 + 26.821(0.55187)^X   (15.8.5)

In order to judge whether the second approximation is near enough to the least-squares solution, we find ΣY_res² for the first two approximations. The first approximation is

Ŷ₁ = a₁ + b₁(0.55^X) = a₁ + b₁X₁   (15.8.6)
where a₁, b₁ are given by the linear regression of Y on X₁. In the preceding calculations, a₁ and b₁ were not computed, since they are not needed in finding the second approximation. However, by the usual rules for linear regression, ΣY_res² from the first approximation is given by

ΣY_res² = Σy² − (Σx₁y)²/Σx₁²   (15.8.7)

where, as usual, y = Y − Ȳ and x₁ = X₁ − X̄₁. When the curve fits closely, as in this example, ample decimals must be carried in this calculation, as Stevens (9) has warned. Alternatively, we can compute a₁ and b₁ in (15.8.6) and hence Y − Ŷ₁, obtaining the residual sum of squares directly. With the number of decimals that we carried, we obtained 0.0988 by formula 15.8.7 and 0.0990 by the direct method, the former figure being the more accurate.

For the second approximation, compute the powers of r₂ = 0.55187, and hence find Ŷ₂ by (15.8.5). The values of Ŷ₂ and of Y − Ŷ₂ are shown in table 15.8.1. The sum of squares of residuals is 0.0973. The decrease from the first approximation (0.0988 to 0.0973) is so small that we may safely stop with the second approximation. Further approximations lead to a minimum of 0.0972.

The residual mean square for the second approximation is s² = 0.0973/3 = 0.0324, with n − 3 = 3 d.f. Approximate standard errors for the estimated parameters are (using the inverse matrix):

s.e.(a₂) = s√c₁₁ = ±0.23;   s.e.(b₂) = s√c₂₂ = ±0.26;
s.e.(r₂) ≐ s√c₃₃/b₂ = 0.226/26.82 = ±0.0084

Strictly speaking, the values of the cᵢᵢ should be calculated for r = 0.55187 instead of r = 0.55, but the above results are close enough. Further, since r₂ − r₁ = c/b, a better approximation to the standard error of r₂ is given by the formula for the standard error of a ratio. In nearly all cases, the term c₃₃/c² in the square root dominates, reducing the result to s√c₃₃/b.
When X has the values 0, 1, 2, ..., (n − 1), desk machine calculation of the second approximation is much shortened by auxiliary tables. The cᵢⱼ in the 3 x 3 inverse matrix that we must compute at each stage depend only on n and r. Stevens (9) tabulated these values for n = 5, 6, 7. With these tables, the user finds the first approximation r₁ and computes the sample values of X₁ and X₂ and the quantities ΣY, ΣX₁Y, ΣX₂Y. The values of the cᵢⱼ corresponding to r₁ are then read from Stevens' tables, and the second approximations are obtained rapidly as in (15.8.4) above. Hiorns (10) has tabulated the inverse matrix for r going by 0.01 from 0.1 to 0.9 and for sample sizes from 5 to 50.
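In the absence of such tables, the cycle above is only a few lines in Python (our own transcription of this section's arithmetic, not part of the original text): Patterson's n = 6 formula gives r₁, and the regression of Y on X₀, X₁, X₂ then gives a, b, c and r₂ = r₁ + c/b.

    import numpy as np

    Y = np.array([57.5, 45.7, 38.7, 35.3, 33.1, 32.2])   # table 15.8.1
    X = np.arange(6.0)

    # Patterson's first approximation for n = 6
    r1 = (4*Y[5] + 4*Y[4] + 2*Y[3] - 3*Y[2] - 7*Y[1]) / \
         (4*Y[4] + 4*Y[3] + 2*Y[2] - 3*Y[1] - 7*Y[0])    # = 0.552
    r1 = round(r1, 2)                                    # work with 0.55

    M = np.column_stack([np.ones(6), r1**X, X * r1**(X - 1)])
    S = M.T @ M                  # sums of squares and products, as above
    c_inv = np.linalg.inv(S)     # the Gauss multipliers c_ij
    a, b, c = c_inv @ (M.T @ Y)
    r2 = r1 + c / b
    print(round(a, 3), round(b, 3), round(r2, 5))  # 30.723 26.821 0.55187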
EXAMPLE 15.8.1-In an experiment on wheat in Australia, fertilizers were applied at a series of levels with these resulting yields.

Level X     0      10     20     30     40
Yield Y    26.2   30.4   36.3   37.8   38.6

Fit a Mitscherlich equation. Ans. Patterson's formula gives r₁ = 0.40. The second approximation is r₂ = 0.40026, but the residual sum of squares is practically the same as for the first approximation, which is Ŷ = 38.679 − 12.425(0.4)^X.

EXAMPLE 15.8.2-In a chemical reaction, the amount of nitrogen pentoxide decomposed at various times after the start of the reaction was as follows (12):

Time (T)                    3      4      5      6      7
Amount Decomposed (%)     22.2     …    27.2   29.1   30.1

Fit an asymptotic regression. We obtained Ŷ = 33.502 − 26.695(0.753)^T, with residual S.S. = 0.105.
EXAMPLE 15.8.3-Stevens (9) has remarked that when ρ is between 0.7 and 1, the asymptotic regression curve is closely approximated by a second degree polynomial. The asymptotic equation Y = 1 − 0.9(0.8)^X takes the following values:

X      0       1       2       3       4       5       6
Y    0.100   0.280   0.424   0.539   0.631   0.705   0.764

Fit a parabola by orthogonal polynomials and observe how well the values of Ŷ agree.

REFERENCES
1. R. PEARL, L. J. REED, and J. F. KISH. Science, 92:486, Nov. 22 (1940).
2. R. PENQUITE. Thesis submitted for the Ph.D. degree, Iowa State College (1936).
3. W. H. METZGER. J. Amer. Soc. Agron., 27:653 (1935).
4. P. P. SWANSON and A. H. SMITH. J. Biol. Chem., 97:745 (1932).
5. R. A. FISHER and F. YATES. Statistical Tables. Oliver and Boyd, Edinburgh, 5th ed. (1957).
6. E. S. PEARSON and H. O. HARTLEY. Biometrika Tables for Statisticians, Vol. 1. Cambridge University Press (1954).
7. H. D. PATTERSON. Biometrics, 12:323 (1956).
8. H. D. PATTERSON. Biometrika, 47:177 (1960).
9. W. L. STEVENS. Biometrics, 7:247 (1951).
10. R. W. HIORNS. The Fitting of Growth and Allied Curves of the Asymptotic Regression Type by Stevens's Method. Tracts for Computers No. XXVIII. Cambridge University Press (1965).
11. E. A. MITSCHERLICH. Landw. Jahrb., 38:537 (1909).
12. L. J. REED and E. J. THERIAULT. J. Physical Chem., 35:950 (1931).
CHAPTER SIXTEEN

Two-way classifications with unequal numbers and proportions

16.1-Introduction. For one reason or another the numbers of observations in the individual cells (sub-classes) of a multiple classification may be unequal. This is the situation in many non-experimental studies, in which the investigator classifies his sample according to the factors or variables of interest, exercising no control over the way in which the numbers fall. With a one-way classification, the handling of the "unequal numbers" case was discussed in section 10.12. In this chapter we present methods for analyzing a two-way classification. The related problem of analyzing a proportion in a two-way table will be taken up also.

The complications introduced by unequal sub-class numbers can be illustrated by a simple example. Two diets were compared on samples of 10 rats. As it happened, 8 of the 10 rats on Diet 1 were females, while only 2 of the 10 rats on Diet 2 were females. Table 16.1.1 shows the sub-class totals for gains in weight and the sub-class numbers. The 8 females on Diet 1 gained a total of 160 units, and so on.

TABLE 16.1.1
TOTAL GAINS IN WEIGHT AND SUB-CLASS NUMBERS (ARTIFICIAL DATA)

                       Females    Males    Sums    Means

Diet 1   { Totals        160        60      220      22
         { Numbers         8         2       10

Diet 2   { Totals         30       200      230      23
         { Numbers         2         8       10

Sums     { Totals        190       260      450
         { Numbers        10        10       20
Means                     19        26      22.5

From these data we obtain the row totals and means, and likewise the column totals and means. From the row means, it looks as if Diet 2 had a slight advantage over Diet 1, 23 against 22. In the column means, males show greater gains than females, 26 against 19.

The sub-class means per rat tell a different story.

           Female    Male
Diet 1       20       30
Diet 2       15       25

Diet 1 is superior by 5 units in both Females and Males. Further, Males gain 10 units more than Females under both diets, as against the estimate of 7 units obtained from the overall means.

Why do the row and column means give distorted results? Clearly, because of the inequality in the sub-class numbers. The poorer feed, Diet 2, had an excess of the faster-growing males. Similarly, the comparison of Male and Female means is biased because most of the males were on the inferior diet.

If we attempt to compute the analysis of variance by elementary methods, this also runs into difficulty. From table 16.1.1 the sum of squares between sub-classes is correctly computed as

(160)²/8 + (60)²/2 + (30)²/2 + (200)²/8 − (450)²/20 = 325   (3 d.f.)

The sum of squares for Diets, (230 − 220)²/20, is 5, and that for Sex, (260 − 190)²/20, is 245, leaving an Interaction sum of squares of 75. But from the cell means there is obviously no interaction; the difference between the Diet means is the same for Males as for Females. In a correct analysis, the Interaction sum of squares should be zero.
For a correct analysis of a two-way table the following approach is suggested:
1. First test for interactions; methods of doing this will be described presently.
2a. If interactions appear negligible, this means that an additive model

X̄ᵢⱼ. = μ + αᵢ + βⱼ + ε̄ᵢⱼ.

is a satisfactory fit, where X̄ᵢⱼ. is the mean of the nᵢⱼ observations in the ith row and jth column. Proceed to find the best unbiased estimates of the αᵢ and βⱼ.
2b. If interactions are substantial, examine the row effects separately in each column, and vice versa, with a view to understanding the nature of the interactions and writing a summary of the results. The overall row and column effects become of less interest, since the effect of each factor depends on the level of the other factor.

Unfortunately, with unequal cell numbers the exact test of the null hypothesis that interactions are absent requires the solution of a set of linear equations like those in a multiple regression. Consequently, before presenting the exact test (section 16.7) we first describe some quicker methods that are often adequate. When interactions are large, this fact may be obvious by inspection, or can sometimes be verified by one or two t-tests, as illustrated in section 16.2. Also, the exact test can be made by simple methods if the cell numbers nᵢⱼ are (i) equal, (ii) equal within any row or within any column, or (iii) proportional, that is, in the same proportion within any row. If the actual cell numbers can be approximated reasonably well by one of these cases, an approximate analysis is obtained by using the actual cell means, but replacing the cell numbers nᵢⱼ by the approximations. The three cases will be illustrated in turn in sections 16.2, 16.3, and 16.4.
The fact that elementary methods of analysis still apply when the cell numbers are proportional is illustrated in table 16.1.2. In this, the cell means are exactly the same as in table 16.1.1, but males and females are now in the ratio 1:3 in each diet, there being 4 males and 12 females on Diet 1 and 1 male and 3 females on Diet 2. Note that the overall row means show a superiority of 5 units for Diet 1, just as the cell means do.

TABLE 16.1.2
EXAMPLE OF PROPORTIONAL SUB-CLASS NUMBERS

              Females              Males                Sums
         Totals   Numbers     Totals   Numbers    Totals   Numbers

Diet 1     240       12         120        4        360       16
Means       20                   30                22.5

Diet 2      45        3          25        1         70        4
Means       15                   25                17.5

Sums       285       15         145        5        430       20
Means     19.0                 29.0                21.5

Analysis of Variance
Correction term C = (430)²/20 = 9,245

                     Degrees of Freedom    Sum of Squares

Rows                         1        (360)²/16 + (70)²/4 − C = 80
Columns                      1        (285)²/15 + (145)²/5 − C = 375
Interactions                 1        By subtraction = 0
Between sub-classes          3        (240)²/12 + (120)²/4 + (45)²/3 + (25)²/1 − C = 455

Similarly, the overall column means show that the males gained 10 units more per animal than females. In the analysis of variance, the Interactions sum of squares is now identically zero.
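Both analyses are quickly verified by computer. The Python sketch below (our own illustration; the function name is arbitrary) applies the elementary formulas to tables 16.1.1 and 16.1.2 and prints the between-sub-classes, row, column, and interaction sums of squares.

    def two_way_ss(totals, numbers):
        """Elementary partition of a two-way table given cell totals
        and cell numbers; returns (cells, rows, columns, interaction)."""
        N = sum(map(sum, numbers))
        C = sum(map(sum, totals))**2 / N                   # correction
        cells = sum(t * t / n
                    for tr, nr in zip(totals, numbers)
                    for t, n in zip(tr, nr)) - C
        rows = sum(sum(tr)**2 / sum(nr)
                   for tr, nr in zip(totals, numbers)) - C
        cols = sum(sum(ct)**2 / sum(cn)
                   for ct, cn in zip(zip(*totals), zip(*numbers))) - C
        return cells, rows, cols, cells - rows - cols

    # table 16.1.1: spurious interaction of 75
    print(two_way_ss([[160, 60], [30, 200]], [[8, 2], [2, 8]]))
    # table 16.1.2: proportional numbers, interaction exactly 0
    print(two_way_ss([[240, 120], [45, 25]], [[12, 4], [3, 1]]))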

16.2-Unweighted analysis of cell means. Let Xᵢⱼₖ denote the kth observation in the cell that is in the ith row and jth column, while X̄ᵢⱼ. is the cell mean, based on nᵢⱼ observations. In this method the X̄ᵢⱼ. are treated as if they were all based on the same number of observations when computing the analysis of variance. The only new feature is how to include the Within-cells mean square s² = ΣΣΣ(Xᵢⱼₖ − X̄ᵢⱼ.)²/ΣΣ(nᵢⱼ − 1) in the analysis of variance.

With fixed effects, the general model for a two-way classification may be written

Xᵢⱼₖ = μ + αᵢ + βⱼ + Iᵢⱼ + εᵢⱼₖ   (16.2.1)

where αᵢ and βⱼ are the additive row and column effects, respectively. The Iᵢⱼ are population parameters representing the interactions. The Iᵢⱼ sum to zero over any row and over any column, since they measure the extent to which the additive row and column effects fail to fit the data in the body of the two-way table. The εᵢⱼₖ are independent random residuals or deviations, usually assumed to be normally distributed with zero means and variance σ². It follows from 16.2.1 that for a cell mean,

X̄ᵢⱼ. = μ + αᵢ + βⱼ + Iᵢⱼ + ε̄ᵢⱼ.,

where ε̄ᵢⱼ. is the mean of nᵢⱼ deviations.

The variance of X̄ᵢⱼ. is σ²/nᵢⱼ. Consequently, if there are a rows and b columns, the average variance of a cell mean is

(σ²/ab)(1/n₁₁ + 1/n₁₂ + ... + 1/n_ab) = σ²/n_h,

where n_h is known in mathematics as the harmonic mean of the nᵢⱼ. A table of reciprocals helps in its calculation. The Within-cell mean square is entered in the analysis of variance as s²/n_h.

Our example (table 16.2.1) comes from an experiment (1) in which 3 strains of mice were inoculated with 3 isolations (i.e., different types) of the mouse typhoid organism. The nᵢⱼ and the X̄ᵢⱼ. (mean days-to-death) are shown for each cell. The unweighted analysis of variance is given under the table. From the original data, not shown here, s² is 5.015 with 774 d.f. Since 1/n_h was found to be 0.01678, the Within-cells mean square is entered as (0.01678)(5.015) = 0.0841 in the analysis of variance table.

The unweighted analysis may be used either as the definitive analysis, or merely as a quick initial test for interactions. As a final analysis the unweighted method is adequate only if the disparity in the nᵢⱼ is small, say within a 2 to 1 ratio with most cells agreeing more closely. Table 16.2.1
TABLE 16.2.1
CELL NUMBERS AND MEAN DAYS-TO-DEATH IN THREE STRAINS OF MICE INOCULATED
WITH THREE ISOLATIONS OF THE TYPHOID BACILLUS

                              Strain of Mice
Isolation              RI           Z           Ba         Sums

9D       nᵢⱼ           34           31           33
         X̄ᵢⱼ.       4.0000       4.0323       3.7576      11.7899

11C      nᵢⱼ           66           78          113
         X̄ᵢⱼ.       6.4545       6.7821       4.3097      17.5463

DSC1     nᵢⱼ          107          133          188
         X̄ᵢⱼ.       6.6262       7.8045       4.1277      18.5584

Sums                17.0807      18.6189      12.1950      47.8946

Analysis of Variance of Unweighted Means

               Degrees of Freedom    Sum of Squares    Mean Square

Isolations             2                 8.8859
Strains                2                 7.5003
Interactions           4                 3.2014           0.8004**
Within cells         774                                  0.0841

1/n_h = (1/9)(1/34 + ... + 1/188) = 0.01678;   n_h = 59.61

does not come near to meeting this restriction: the nᵢⱼ range from 31 to 188. However, this experiment is one in which the presence of interactions would be suspected from a preliminary glance at the data. It looks as if strain Ba was about equally susceptible to all three isolations, while strains RI and Z were more resistant to isolations 11C and DSC1 than to 9D. In this example the unweighted analysis would probably be used only to check this initial impression that an additive model does not apply. The F-ratio for Interactions is 0.8004/0.0841 = 9.51 with 4 and 774 d.f., significant at the 1% level. Since the additive model is rejected, no comparisons among row and column means seem appropriate.

For subsequent t-tests that are made to aid the interpretation of the results, the method of unweighted means, if applied strictly, regards every cell mean as having an error variance 0.0841. This amounts to assuming that every cell has a sample size n_h = 59.61. However, comparisons among cell means can be made without assuming the numbers to be equal. For instance, in examining whether strain Z is more resistant to DSC1 than to 11C, the difference in mean days-to-death is 7.8045 − 6.7821 = 1.0224, with standard error

√[(5.015)(1/78 + 1/133)] = ±0.319

so that the difference is clearly significant by a t-test. Similarly, in testing whether Ba shows any differences from isolation to isolation in mean days-to-death, we have a one-way classification with unequal numbers per class (see example 16.2.1).

If interactions had been negligible, main effects would be estimated approximately from the row and column means of the sub-class means. These means can also be assigned correct standard errors. For instance, for 9D the mean, 11.7899/3 = 3.9300, has a standard error

√[(5.015/9)(1/34 + 1/31 + 1/33)]

In some applications it is suspected that the Within-sub-class variance is not constant from one sub-class to another. Two changes in the approximate method are suggested. In the analysis of variance, compute the Within-classes mean square as the average of the quantities sᵢⱼ²/nᵢⱼ, where sᵢⱼ² is the mean square within the (i, j) sub-class. In a comparison ΣLᵢⱼX̄ᵢⱼ. among the sub-class means, compute the standard error as

√(ΣLᵢⱼ²sᵢⱼ²/nᵢⱼ),

using only the sub-classes that enter into the comparison.

EXAMPLE 16.2.1-Test whether Ba shows any differences from isolation to isolation in mean days-to-death. Ans. The Ba totals are 124, 487, 776, for sample sizes 33, 113, 188. The weighted sum of squares is 8.05, with 2 d.f. The mean square, 4.03, as compared with the Within-class mean square 5.015, shows no indication of any difference.
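For readers with a computer, the unweighted computation of this section can be sketched as follows (our own code, using the cell sizes and means of table 16.2.1; small discrepancies from the printed sums of squares reflect the rounding of the cell means).

    means = [[4.0000, 4.0323, 3.7576],
             [6.4545, 6.7821, 4.3097],
             [6.6262, 7.8045, 4.1277]]
    n = [[34, 31, 33], [66, 78, 113], [107, 133, 188]]
    a = b = 3
    s2 = 5.015                            # within-cells MS, 774 d.f.

    inv_nh = sum(1.0 / v for row in n for v in row) / (a * b)  # 0.01678
    grand = sum(map(sum, means)) / (a * b)
    rows_ss = b * sum((sum(r) / b - grand)**2 for r in means)
    cols_ss = a * sum((sum(c) / a - grand)**2 for c in zip(*means))
    cells_ss = sum((m - grand)**2 for r in means for m in r)
    inter_ms = (cells_ss - rows_ss - cols_ss) / ((a - 1) * (b - 1))
    print(round(rows_ss, 4), round(cols_ss, 4),
          round(inter_ms / (s2 * inv_nh), 2))
    # 8.8859  7.5003  9.51 -- the Isolations and Strains S.S. and F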

16.3-Equal numbers within rows. In the mice example (table 16.2.1), an analysis that assumes equal sub-class numbers within each row approximates the actual numbers much more closely than the assumption that all numbers are equal. Since the row total numbers are 98, 257, and 428, we assign sample sizes 33, 86, and 143 to the sub-classes in the respective rows.

In the analysis (table 16.3.1), each sub-class mean is multiplied by the assigned sub-class number to form a corresponding sub-class total. Thus, for Z with 9D, 133.1 = (33)(4.0323). The analysis of variance, given under table 16.3.1, is computed by elementary methods. Each total, when squared, is divided by the assigned sample size.

The F-ratio for Interactions is 8.70, again rejecting the hypothesis of additivity of Isolation and Strain effects. In this example, the assigned numbers agree nearly enough with the actual numbers so that further t-tests may be based on the assigned numbers. If the interactions had been unimportant in this example, the main effects of Isolations and Strains would be satisfactorily estimated from the overall means 3.930, 5.849, and so on, shown in table 16.3.1. (These means were not used in the present calculations.)
TABLE 16.3.1
ANALYSIS OF MICE DATA BY EQUAL NUMBERS WITHIN ROWS
(Assigned numbers nᵢ, sub-class means X̄ᵢⱼ., and corresponding totals nᵢX̄ᵢⱼ.)

                         RI         Z          Ba        Sums      Means

9D     nᵢ = 33         132.0      133.1      124.0       389.1     3.930
11C    nᵢ = 86         555.1      583.3      370.6     1,509.0     5.849
DSC1   nᵢ = 143        947.5    1,116.0      590.3     2,653.8     6.186

Sums                 1,634.6    1,832.4    1,084.9     4,551.9

Correction: C = (4,551.9)²/786 = 26,361.06

Between Sub-classes: (132.0)²/33 + ... + (590.3)²/143 − C = 1,730.22

Isolations: (389.1)²/99 + (1,509.0)²/258 + (2,653.8)²/429 − C = 410.56

Strains: [(1,634.6)² + (1,832.4)² + (1,084.9)²]/262 − C = 1,145.10

Analysis of Variance

                    Degrees of Freedom   Sum of Squares   Mean Square     F

Isolations                  2                410.56          205.28
Strains                     2              1,145.10          572.55
Interactions                4                174.56           43.64      8.70
Between sub-classes         8              1,730.22
Within sub-classes        774                                  5.015

Although this method requires slightly more calculation than the assumption of equal numbers, it is worth considering if it produces numbers near to the actual numbers.
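In code, the only change from the unweighted sketch of section 16.2 is that each row's means are weighted by its assigned number (our own illustration):

    means = [[4.0000, 4.0323, 3.7576],
             [6.4545, 6.7821, 4.3097],
             [6.6262, 7.8045, 4.1277]]
    ni = [33, 86, 143]                 # assigned: round(N_i. / 3)

    totals = [[n * m for m in row] for n, row in zip(ni, means)]
    G = sum(map(sum, totals))
    C = G * G / (3 * sum(ni))
    cells = sum(n * m * m for n, row in zip(ni, means) for m in row) - C
    rows = sum(sum(t)**2 / (3 * n) for n, t in zip(ni, totals)) - C
    cols = sum(sum(col)**2 / sum(ni) for col in zip(*totals)) - C
    print(round(cells, 2), round(rows, 2), round(cols, 2),
          round(cells - rows - cols, 2))
    # approx. 1730.22  410.56  1145.10  174.56 (table 16.3.1, to rounding)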
16.4-Proportional sub-class numbers. As mentioned in section 16.1, the least squares analysis can be carried out by simple methods if the sub-class numbers are in the same proportions within each row. Points to note are:
(i) The overall row means, found by adding all the observations in a row and dividing by the sum of the sub-class numbers in the row, are the least squares estimates of the row main effects, and similarly for columns.
(ii) In computing the analysis of variance, the squared total for any sub-class, row, or column is divided by the corresponding number. The Total sum of squares between sub-classes and the sums of squares for rows and columns are calculated directly, the Interaction sum of squares being found by subtraction.
(iii) The F-ratio of the Interactions mean square to the Within sub-classes mean square gives the exact least squares test of the null hypothesis that there are no interactions.

Two examples will be presented. In table 16.4.1 the classes are Breeds of Swine and Sex of Swine. The sub-class numbers represent approximately the proportions in which the breeds and sexes were brought in for slaughter at the College Meats Laboratory (2). For each breed, males and females are in the proportions 2:1, and for each sex, the breeds are in the proportions 6:15:2:3:5. The data are the percentages of dressed weight to total weight (less 70%). The calculations are given in full under the table. Since the sample represents only a small fraction of the original data, conclusions are tentative. There were differences among breeds but no indication of a sex difference nor of sex-breed interactions. In making comparisons among the breed means, account should of course be taken of the differences in the sample sizes.
In the breed means, the sexes are weighted in the ratio of 2 males to 1 female. The reader may ask: Is this the weighting that we ought to have? The answer depends on the status of the interactions. If interactions are negligible, any weighting, provided that it is the same for every breed, furnishes unbiased estimates of the population differences between breed means. The 2:1 weighting gives the most precise estimates from the available data. If interactions are present, breed differences are not the same for males as for females, so that different weightings produce real differences in results. Usually, as emphasized on several occasions, we do not wish to examine main effects when interactions are present. If we do, a 2:1 weighting is appropriate, when interactions are present, only if it represents the proportions in which males and females appear in the target population of the study, as happens in this example. Equal weighting or some other proportion would be preferred if it were more typical of the population about which the investigator wishes to draw conclusions.

With unequal sub-class numbers, the expressions for the expected values of the mean squares in terms of components of variance are complicated. Wilk and Kempthorne (3) have developed formulas for 2- and 3-factor arrangements; the sub-class numbers may be equal or proportional. With 2 factors, let the proportions in factor A be u₁:u₂: ... :u_a and those in B, v₁:v₂: ... :v_b. The number of observations in the (i, j) sub-class will then be some multiple of uᵢvⱼ, say nuᵢvⱼ. Note the value of n. The mathematical model is as given in 16.2.1, where the αᵢ, βⱼ, and Iᵢⱼ may be either fixed or random.
TABLE 16.4.1
DRESSING PERCENTAGES (LESS 70%) OF 93 SWINE CLASSIFIED BY BREED AND SEX,
LIVE WEIGHTS 200-219 POUNDS

             Breed 1        Breed 2        Breed 3        Breed 4        Breed 5
Number   Male   Female   Male   Female   Male   Female   Male   Female   Male   Female

 1       13.3   18.2     10.9   14.3     13.6   12.9     11.6   13.8     10.3   12.8
 2       12.6   11.3      3.3   15.3     13.1   14.4     13.2   14.4     10.3    8.4
 3       11.5   14.2     10.5   11.8      4.1   12.6      4.9   10.1     10.6
 4       15.4   15.9     11.6   11.0     10.8   15.2      6.9   13.9
 5       12.1   12.9     15.4   10.9     14.1   13.2     10.0
 6       15.7   15.1     14.4   10.5     12.4   11.0
 7       13.2   11.6     12.9   12.2
 8       15.0   14.4     12.5   13.3
 9       14.3    7.5     13.0   12.9
10       16.5   10.8      7.6    9.9
11       15.0   10.5     12.9
12       13.7   14.5     12.4
13       10.9   12.8
14       13.0   10.9
15       15.9   13.9
16       12.8
17       14.0
18       11.1
19       12.1
20       14.7
21       12.1
22       13.1
23       10.4
24       11.9
25       10.1
26       14.4
27       11.3
28       13.0
29       12.7
30       11.6

ΣX      168.9   87.6    362.7  182.7     41.6   27.3     79.7   33.1    110.1   55.7

Total: N = 93, ΣX = 1,149.4, ΣX² = 14,785.62
Breed Sums: 1: 256.5, 2: 545.4, 3: 68.9, 4: 112.8, 5: 165.8
Sex Sums: Male, 763.0; Female, 386.4

1. Correction: C = (ΣX)²/N = (1,149.4)²/93 = 14,205.60
2. Total: ΣX² − C = 14,785.62 − 14,205.60 = 580.02
3. Sub-classes: (168.9)²/12 + (87.6)²/6 + ... + (55.7)²/5 − C = 122.83
4. Within sub-classes: 580.02 − 122.83 = 457.19
5. Sex: (763.0)²/62 + (386.4)²/31 − C = 0.52
6. Breeds: (256.5)²/18 + ... + (165.8)²/15 − C = 97.38
7. Interaction: 122.83 − (97.38 + 0.52) = 24.93
                         Degrees of Freedom   Sum of Squares   Mean Square

Sex                              1                  0.52           0.52
Breeds                           4                 97.38          24.34**
Breed-Sex Interactions           4                 24.93           6.23
Within sub-classes              83                457.19           5.51

Breed Mean Percentages

            1       2       3       4       5
         84.2    82.1    81.5    82.5    81.1
nᵢ.        18      45       6       9      15

Also let:

U = Σuᵢ,   V = Σvⱼ,   U* = Σuᵢ²/(Σuᵢ)²,   V* = Σvⱼ²/(Σvⱼ)²

The expected values of the mean squares are:

E(A) = σ² + [nUV(1 − U*)/(a − 1)]{(V* − 1/b)σ²_AB + σ²_A}

E(B) = σ² + [nUV(1 − V*)/(b − 1)]{(U* − 1/a)σ²_AB + σ²_B}

E(AB) = σ² + [nUV(1 − U*)(1 − V*)/((a − 1)(b − 1))]σ²_AB

These results hold when both factors are fixed. If A is random, delete the term in 1/a (inside the curly bracket) in E(B). If B is random, delete the term in 1/b in E(A). With fixed factors, the variance components are defined as follows:

σ²_A = Σαᵢ²/(a − 1);   σ²_B = Σβⱼ²/(b − 1);   σ²_AB = ΣIᵢⱼ²/(a − 1)(b − 1)

For the example, if A denotes sex and B denotes breed:

a = 2, b = 5; u₁ = 2, u₂ = 1; v₁ = 6, v₂ = 15, v₃ = 2, v₄ = 3, v₅ = 5; n = 1

U = 3;   V = 31;   U* = (2² + 1²)/3² = 0.556;   V* = (6² + ... + 5²)/31² = 0.311

Regarding sex and breed as fixed parameters, we find

E(A) = σ² + 4.58σ²_AB + 41.3σ²_A
E(B) = σ² + 0.90σ²_AB + 16.0σ²_B
E(AB) = σ² + 7.12σ²_AB

Note that E(A) and E(B) contain terms in the interaction variance, even though all effects are fixed. This happens because when the numbers are proportional, the main effects are weighted means. Although the Iᵢⱼ sum to zero over any row or column, their weighted means are not zero. As a further illustration, you may verify that if A were random in these data, we would have:

E(B) = σ² + 8.90σ²_AB + 16.0σ²_B
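These coefficients can be generated for any proportions by a few lines of Python (our own sketch of the formulas above; small differences from the printed coefficients are rounding):

    def ems_coefficients(u, v, n=1):
        """Wilk-Kempthorne coefficients for proportional numbers n*u_i*v_j.
        Returns the multipliers of the variance components in
        E(A), E(B), and E(AB) when both factors are fixed."""
        a, b = len(u), len(v)
        U, V = sum(u), sum(v)
        Us = sum(x * x for x in u) / U**2          # U*
        Vs = sum(x * x for x in v) / V**2          # V*
        kA = n * U * V * (1 - Us) / (a - 1)
        kB = n * U * V * (1 - Vs) / (b - 1)
        kAB = n * U * V * (1 - Us) * (1 - Vs) / ((a - 1) * (b - 1))
        return {'E(A)': (kA * (Vs - 1 / b), kA),   # (sigma2_AB, sigma2_A)
                'E(B)': (kB * (Us - 1 / a), kB),   # (sigma2_AB, sigma2_B)
                'E(AB)': kAB}                      # sigma2_AB

    print(ems_coefficients(u=[2, 1], v=[6, 15, 2, 3, 5]))
    # approx. E(A): (4.59, 41.33), E(B): (0.89, 16.02), E(AB): 7.12,
    # against the rounded 4.58, 41.3, 0.90, 16.0, 7.12 above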

Our second example (table 16.4.2) illustrates the use of analysis by proportional numbers as an approximation to the least squares analysis. In a sample survey of farm tenancy in an Iowa county (4), it was found that farmers had about the same proportions of Owned, Rented, and Mixed farms in 3 soil fertility classes (section 9.13).

TABLE 16.4.2
FARM ACRES IN CORN CLASSIFIED BY TENURE AND SOIL PRODUCTIVITY,
AUDUBON COUNTY, IOWA

               Owner                Renter               Mixed
Soil       Ob-     Propor-      Ob-     Propor-      Ob-     Propor-
Class      served  tional       served  tional       served  tional     Totals

I    n       36     36.75         67     62.92         49     52.33       152
     X̄             32.7                 55.2                 50.6
     ΣX           1,202                3,473                2,648        7,323

II   n       31     33.85         60     57.95         49     48.20       140
     X̄             36.0                 53.4                 47.1
     ΣX           1,219                3,095                2,270        6,584

III  n       58     54.40         87     93.13         80     77.47       225
     X̄             30.1                 46.8                 40.1
     ΣX           1,637                4,358                3,107        9,102

     n      125                  214                  178                 517
     ΣX   4,058               10,926                8,025              23,009

Analysis of Variance Using Proportional Numbers

Source of Variation           Degrees of Freedom   Sum of Squares   Mean Square

Soils                                 2                 6,635          3,318**
Tenures                               2                27,367         13,684**
Interactions                          4                   883            221
Error (from original data)          508                                  830

Means

   Owner    Renter    Mixed
    32.5     51.1      45.1

     I        II       III
    48.2     47.0      40.5
Replacement of the actual sub-class numbers by numbers that are proportional should therefore give a good approximation to the least squares analysis. The proportional numbers are calculated from the row and column totals of the actual numbers. Thus, for Renters in Soil Class III, 93.13 = (225)(214)/517. The sub-class means are multiplied by these fictitious numbers to produce the sub-class totals ΣX in table 16.4.2.

The variable being analyzed is the number of acres of corn per farm. There are large differences between tenure means, renters and mixed owner-renter farmers having more corn than owners. The amount of corn is also reduced on Soil Class III. There is no evidence of interactions. Since the proportional numbers agree so well with the actual numbers, an exact least squares analysis in these data is unnecessary. In general, analysis by proportional numbers should be an adequate approximation to the least squares analysis if the ratios of the proportional to the actual cell numbers all lie between 0.75 and 1.3, although this question has not been thoroughly studied.
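The arithmetic of forming the proportional numbers is trivial to program (our own sketch):

    row_totals = [152, 140, 225]       # Soil Classes I, II, III
    col_totals = [125, 214, 178]       # Owner, Renter, Mixed
    N = sum(row_totals)

    prop = [[r * c / N for c in col_totals] for r in row_totals]
    print(round(prop[2][1], 2))        # Renters, Soil III: 93.13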
16.5-Disproportionate numbers. The 2 x 2 table. In section 16.7 the analysis of the R x C table when sub-class numbers are neither equal nor proportional will be presented. The 2 x 2 and the R x 2 table, which are simpler to handle and occur frequently, are discussed in this and the next section. Table 16.5.1 gives an example (5). The data relate to the effects of two hormones on the comb weights of chicks.

TABLE 16.5.1
COMB WEIGHTS (MG.) OF LOTS OF CHICKS INJECTED WITH TWO SEX HORMONES

                     Untreated                  Hormone A
               Number    ΣX      X̄        Number    ΣX      X̄
Untreated         3       240     80         12     1,440    120
Hormone B        12     1,200    100          6       672    112

The Within-classes mean square, computed from the individual observations, was s² = 811, with 29 d.f. To test the interaction, compute it from the sub-class means in the usual way for a 2 x 2 factorial:

80 + 112 − 100 − 120 = −28

Taking account of the sub-class numbers, the standard error of this estimate is

√[s²(1/3 + 1/6 + 1/12 + 1/12)] = √[(811)(2/3)] = ±23.25

The value of t is −28/23.25 = −1.20, with 29 d.f., P about 0.25. We shall assume interaction unimportant and proceed to compute the main effects (table 16.5.2).

TABLE 16.5.2
CALCULATION OF MAIN EFFECTS OF HORMONES A AND B

              Untreated     Hormone A     D_A =        W_A =
              n₁    X̄₁     n₂    X̄₂    X̄₂ − X̄₁   n₁n₂/(n₁ + n₂)   W_A D_A

Untreated      3     80     12    120       40           2.4             96
Hormone B     12    100      6    112       12           4.0             48

              W_B   D_B     W_B   D_B                    6.4            144
              2.4    20     4.0    −8

Main effect of A: ΣW_A D_A/ΣW_A = 144/6.4 = 22.5
    S.E. = √(s²/ΣW_A) = √(811/6.4) = ±11.26   (29 d.f.)
Main effect of B: ΣW_B D_B/ΣW_B = 16/6.4 = 2.5
    S.E. = √(s²/ΣW_B) = √(811/6.4) = ±11.26   (29 d.f.)

Consider Hormone A. The differences D_A between the means with and without A are recorded separately for the two levels of B. These are the figures 40 and 12. Since interaction is assumed absent, each figure is an estimate of the main effect of A. But the estimates differ in precision because of the unequal sub-class numbers. For an estimate derived from two sub-classes with numbers n₁ and n₂ the variance is

σ²(1/n₁ + 1/n₂) = σ²(n₁ + n₂)/(n₁n₂)

Consequently, the estimate receives a relative weight W = n₁n₂/(n₁ + n₂). These weights are computed and recorded. The main effect of A is the weighted mean of the two estimates, ΣWD/ΣW, with s.e. ±√(s²/ΣW). The main effect of B is computed similarly. The increase in comb weights due to Hormone A is 22.5 mg. ± 11.26 mg., almost significant at the 5% level, but Hormone B appears to have little effect.

Note: in this example the two values of W, 2.4 and 4.0, happen to be the same for A and B. This arises because two sub-classes are of size 12 and is not generally true. We have not described the analysis of variance because it is not needed.
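The computations of table 16.5.2 are easily expressed in Python (our own sketch; the function and variable names are ours):

    import math

    n    = [[3, 12], [12, 6]]        # sub-class numbers of table 16.5.1
    mean = [[80, 120], [100, 112]]   # sub-class means
    s2 = 811                         # within-classes mean square, 29 d.f.

    def main_effect(pairs):
        """pairs: (n1, mean1, n2, mean2) at each level of the other
        factor; returns the weighted mean difference and its s.e."""
        W = [n1 * n2 / (n1 + n2) for n1, _, n2, _ in pairs]
        D = [m2 - m1 for _, m1, _, m2 in pairs]
        est = sum(w * d for w, d in zip(W, D)) / sum(W)
        return est, math.sqrt(s2 / sum(W))

    A = main_effect([(n[0][0], mean[0][0], n[0][1], mean[0][1]),
                     (n[1][0], mean[1][0], n[1][1], mean[1][1])])
    B = main_effect([(n[0][0], mean[0][0], n[1][0], mean[1][0]),
                     (n[0][1], mean[0][1], n[1][1], mean[1][1])])
    print(A, B)      # (22.5, 11.26...)  (2.5, 11.26...)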
16.6-Disproportionate numbers. The R x 2 table. The data in table 16.6.1 illustrate some of the peculiarities of disproportionate sub-class numbers (6). In a preliminary analysis of variance, shown under the table, the Total sum of squares between sub-class means and the sums of squares for Sexes and Generations were computed by the usual elementary methods (taking account of the differences in sub-class numbers). The Interactions sum of squares was then found to be

119,141 − 114,287 − 5,756 = −902

The Sexes and Generations S.S. add to more than the total S.S. between sub-classes. This is because differences between the Generation means are inflated by the inequality in the Sex means, and vice versa.

TABLE 16.6.1
NUMBER, TOTAL GAIN, AND MEAN GAIN IN WEIGHT OF WISTAR RATS (GMS. MINUS 100)
IN FOUR SUCCESSIVE GENERATIONS. GAINS DURING SIX WEEKS FROM 28 DAYS OF AGE

Genera-        Male                     Female               Wⱼ = n₁ⱼn₂ⱼ/    Dⱼ =
tion     n₁ⱼ   X₁ⱼ.   X̄₁ⱼ.      n₂ⱼ   X₂ⱼ.   X̄₂ⱼ.      (n₁ⱼ + n₂ⱼ)    X̄₁ⱼ. − X̄₂ⱼ.    WⱼDⱼ

1         21   1,616   76.95      27    257    9.52         11.81          67.43        796.35
2         15     922   61.47      25    352   14.08          9.38          47.39        444.52
3         12     668   55.67      23    196    8.52          7.89          47.15        372.01
4          7     497   71.00      19    129    6.79          5.12          64.21        328.76

                                                            34.20                     1,941.64

Preliminary Analysis of Variance

Source of Variation    Degrees of Freedom   Sum of Squares   Mean Square

Sexes                          1                114,287
Generations                    3                  5,756
Interactions                   3                   −902 (!)
Between sub-classes            7                119,141
Within sub-classes           141                                  409

Calculation of Adjusted Generation Means

Generation    n.ⱼ    X.ⱼ.     X̄.ⱼ.     Estimate of           Adjusted Mean

1              48    1,873    39.02     μ + α₁ − δ/16             42.57
2              40    1,274    31.85     μ + α₂ − δ/8              38.95
3              35      864    24.69     μ + α₃ − 11δ/70           33.61
4              26      626    24.08     μ + α₄ − 3δ/13            37.18

In any R x 2 table the correct Interactions S.S. is easily computed directly. Calculate the observed sex difference D and its weight W separately for each generation (table 16.6.1). The Interactions S.S. (3 d.f.) is given by

ΣWD² − (ΣWD)²/ΣW = (67.43)(796.35) + ... + (64.21)(328.76) − (1,941.64)²/(34.20) = 3,181

The F-test of Interactions is F = 1,060/409 = 2.59, close to the 5% level. It looks as if the sex difference was greater in generations 1 and 4 than in generations 2 and 3. There is, however, no a priori reason to anticipate that the sex difference would change from generation to generation. Perhaps the cell means were affected by some extraneous source of variation that did not contribute to the variation within cells. For illustration, we proceed to estimate main effects on the assumption that interactions are negligible.

The estimate of the sex difference in mean gain is

D̄ = ΣWⱼDⱼ/ΣWⱼ = 1,941.64/34.20 = 56.77 gms.

Its S.E. is √(s²/ΣW) = √(409/34.2) = 3.46 gms.
To estimate the Generation effects, note that under the additive model the population means for males and females in Generation j may be written as follows:

Males: μ + αⱼ + δ/2;     Females: μ + αⱼ − δ/2

where δ represents the sex difference, Males minus Females. We start with the unadjusted mean for each generation and adjust it so as to remove the sex effect. Since generation 1 has 21 males and 27 females out of 48, its unadjusted mean is an unbiased estimate of

μ + α₁ + (21/48)(δ/2) + (27/48)(−δ/2) = μ + α₁ − δ/16

Our estimate of δ is 56.77 and the unadjusted mean for generation 1 is 39.02. To remove the sex effect, we add 56.77/16 = 3.55, giving 42.57. These adjustments are made at the foot of table 16.6.1.

For comparisons among these adjusted generation means, standard errors may be needed. The difference between the adjusted means of the jth and kth generations is of the form

X̄.ⱼ. − X̄.ₖ. + gD̄,

where g is the numerical multiplier of D̄. The variance of this difference is

s²(1/n.ⱼ + 1/n.ₖ + g²/ΣW)

With generations 1 and 2, n.₁ = 48, n.₂ = 40, while g = (−1/16) − (−1/8) = 1/16, and ΣW = 34.2. The term in g in the variance turns out to be negligible. The variance of the difference is therefore

(409)(1/48 + 1/40) = 18.75

The adjusted difference is 3.62 ± 4.33.


If F-tests of the main effects of Sexes and Generations are wanted, start with the preliminary S.S. for each factor in table 16.6.1. Subtract from it the difference:

Correct Interaction S.S. minus Preliminary Interaction S.S. = 3,181 − (−902) = 4,083

The resulting adjusted S.S. are shown in table 16.6.2.

TABLE 16.6.2
ADJUSTED SUMS OF SQUARES OF MAIN EFFECTS

Source of Variation      Degrees of Freedom      Sum of Squares           Mean Square

Sexes (adjusted)                 1         114,287 − 4,083 = 110,204      110,204**
Generations (adjusted)           3           5,756 − 4,083 =   1,673          558
Interactions                     3                             3,181        1,060
Within sub-classes             141                                            409

The sex difference is large, but the generation differences fall short of the 5% level.

If interactions can be neglected, this analysis is applicable to tables in which data are missing entirely from a sub-class. Zeros are entered for the missing nᵢⱼ and Xᵢⱼ.. From the d.f. for Interactions deduct 1 for each missing cell.
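The R x 2 computations of this section condense to a few lines of Python (our own sketch, using the data of table 16.6.1):

    import math

    n1 = [21, 15, 12, 7];  xb1 = [76.95, 61.47, 55.67, 71.00]  # males
    n2 = [27, 25, 23, 19]; xb2 = [9.52, 14.08, 8.52, 6.79]     # females
    s2 = 409                              # within sub-classes, 141 d.f.

    W = [a * b / (a + b) for a, b in zip(n1, n2)]
    D = [m - f for m, f in zip(xb1, xb2)]
    SW = sum(W); SWD = sum(w * d for w, d in zip(W, D))

    inter_ss = sum(w * d * d for w, d in zip(W, D)) - SWD**2 / SW
    delta = SWD / SW                      # sex difference
    se = math.sqrt(s2 / SW)               # about 3.46

    # adjusted generation means: unadjusted mean + delta*(n2 - n1)/(2N)
    adj = [(a * m + b * f) / (a + b) + delta * (b - a) / (2 * (a + b))
           for a, b, m, f in zip(n1, n2, xb1, xb2)]
    print(round(inter_ss), round(delta, 2), [round(v, 2) for v in adj])
    # 3181  56.78  [42.57, 38.95, 33.61, 37.18]
    # (the text's 56.77 comes from weights rounded to two decimals)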
EXAMPLE 16.6.1-(i) Verify from table 16.6.2 that the adjusted S.S. for Sexes, Generations, and Interactions do not add up to the Total S.S. between sub-classes. (ii) Verify that the adjusted S.S. for Sexes, 110,204, can be computed directly (apart from rounding errors) as (ΣWD)²/ΣW. This formula holds in all R x 2 tables.

An additive analysis of variance can be obtained from the Preliminary S.S. for Generations and the adjusted S.S. for Sexes, as follows:

                                    Degrees of Freedom    Sum of Squares

Generations (ignoring Sexes)                3                  5,756
Sexes (adjusted for Generations)            1                110,204
Interactions                                3                  3,181

Total between sub-classes                   7                119,141

This breakdown is satisfactory when we are interested only in testing Sexes. Alternatively, we can get an additive breakdown from Sexes (ignoring Generations) and Generations (adjusted).
EXAMPLE 16.6.2-Becker and Hall (10) determined the number of oocysts produced by rats of five strains during immunization with Eimeria miyairii. The unit of measurement is 10⁶ oocysts.

                                    Strain
Sex          Lambert      Lo        Hi       W.E.L.    Wistar (A)

Male    n       8         14        20          8           9
        X̄     36.1       94.9     194.4       64.1       175.7

Female  n       3         14        21         10           8
        X̄     31.9       68.6     187.3       89.2       148.4

Verify the completed analysis of variance quoted from the original article:

Sex (adjusted)              1      2,594.6      2,594.6
Strains (adjusted)          4    417,565.6    104,391.4
Interaction                 4      8,805.3      2,201.3
Within sub-classes*       109    332,962.9      3,054.7

* You cannot, of course, verify this line.

You will not be able to duplicate these numbers exactly because the means are reported to only 3 significant digits. Your results should approximate the first 3 figures in the mean squares, enough for testing.

16.7-The R x C table. Least squares analysis. This is a general method for analyzing 2-way classifications (7). It fits an additive model (i.e., one assuming no interactions) to the sub-class means:

X̄ᵢⱼ. = μ + αᵢ + βⱼ + ε̄ᵢⱼ.,   i = 1, ..., r;  j = 1, ..., c,

where the ε̄ᵢⱼ. are assumed normally distributed with means zero and variances σ²/nᵢⱼ, where nᵢⱼ is the sub-class number. This amounts to assuming that the variance within each sub-class is σ², since ε̄ᵢⱼ. is the mean of nᵢⱼ such residuals.

As an intermediate step in the calculations, the method provides the most powerful test of the null hypothesis that interactions are zero. If this hypothesis is contradicted by the data, the calculations are usually stopped and the investigator proceeds to examine the two-way table in detail. If the assumption of negligible interactions is tenable, the remainder of the calculations gives unbiased estimates of the row and column main effects αᵢ and βⱼ that have the smallest variances. Since data of this type are common and are tedious to handle on a desk machine, most computing centers are likely to have a standard program for the analysis.

The basic data used are the nᵢⱼ and the row (Xᵢ··) and column (X·ⱼ·) totals of the observations. Table 16.7.1 shows the algebraic notation and the mouse typhoid data of table 16.2.1 used as illustration. (The pᵢⱼ are explained later.) Following Yates (7), we denote the row and column totals of the nᵢⱼ by Nᵢ. and N.ⱼ.

The least squares method chooses estimates m, aᵢ, bⱼ of μ, αᵢ, βⱼ that minimize

Σᵢ Σⱼ nᵢⱼ(X̄ᵢⱼ. − m − aᵢ − bⱼ)²

The resulting normal equation for aᵢ is

Nᵢ.(m + aᵢ) + nᵢ₁b₁ + nᵢ₂b₂ + ... + nᵢ_c b_c = Xᵢ··   (16.7.1)

Thus, for Organism 9D, we have

98(m + a₁) + 34b₁ + 31b₂ + 33b₃ = 385

Note that the least squares method estimates aᵢ, the effect of the ith row, by making the observed total for the ith row equal the value which the model says it ought to have. Similarly, for the jth column,

N.ⱼ(m + bⱼ) + n₁ⱼa₁ + n₂ⱼa₂ + ... + nᵣⱼaᵣ = X·ⱼ·   (16.7.2)
TABLE 16.7.1
ALGEBRAIC NOTATION AND DATA FOR FITTING THE ADDITIVE MODEL

                      Columns                        Totals    Data Totals
            1          2        ...       c

1         n₁₁        n₁₂       ...      n₁c           N₁.        X₁··
          p₁₁        p₁₂       ...      p₁c                  (p₁ⱼ = n₁ⱼ/N₁.)
2         n₂₁        n₂₂       ...      n₂c           N₂.        X₂··
          p₂₁        p₂₂       ...      p₂c
...
r         nᵣ₁        nᵣ₂       ...      nᵣc           Nᵣ.        Xᵣ··
          pᵣ₁        pᵣ₂       ...      pᵣc

          N.₁        N.₂       ...      N.c           N..
Data
totals    X·₁·       X·₂·      ...      X·c·                     X···

                           Strain of Mice
Organism             RI           Z           Ba        Nᵢ.      Xᵢ··     X̄ᵢ··

9D      nᵢⱼ           34           31           33        98        385    3.929
        pᵢⱼ        0.34694      0.31633      0.33673

11C     nᵢⱼ           66           78          113       257      1,442    5.611
        pᵢⱼ        0.25681      0.30350      0.43969

DSC1    nᵢⱼ          107          133          188       428      2,523    5.895
        pᵢⱼ        0.25000      0.31075      0.43925

N.ⱼ                  207          242          334       783
X·ⱼ·               1,271        1,692        1,387                4,350    5.556
bⱼ                2.1251       2.8986        0

From (16.7.1) we see that if we know the b's, we can find (m + aᵢ), while if we know the a's, (16.7.2) gives (m + bⱼ). The next step is to eliminate either the a's or the b's. Time is usually saved by eliminating the more numerous set of constants, though an investigator interested only in the rows may prefer to eliminate the columns. In this example, with r = c = 3, it makes no difference. We shall eliminate the a's (rows).

When the a's are eliminated, m also disappears. In finding the equations for the b's, it helps to divide each nᵢⱼ by its row total Nᵢ., forming the pᵢⱼ. The equations for the b's are derived by a rule that is easily remembered. The first equation is

(N.₁ − n₁₁p₁₁ − ... − nᵣ₁pᵣ₁)b₁ − (n₁₁p₁₂ + ... + nᵣ₁pᵣ₂)b₂ − ...
     − (n₁₁p₁c + ... + nᵣ₁pᵣc)b_c = X·₁· − p₁₁X₁·· − ... − pᵣ₁Xᵣ··

For the mice, the first equation is

[207 − (34)(0.34694) − ... − (107)(0.25000)]b₁
     − [(34)(0.31633) + ... + (107)(0.31075)]b₂
     − [(34)(0.33673) + ... + (107)(0.43925)]b₃
   = 1,271 − (0.34694)(385) − ... − (0.25000)(2,523)

In the jth equation the term in bⱼ is N.ⱼ minus the sum of products of the n's and p's in that column. The term in bₖ is minus the sum of products of the nᵢⱼ and the pᵢₖ. The three equations are:

 151.505b₁ −  64.036b₂ −  87.468b₃ =  136.35
 −64.036b₁ + 167.191b₂ − 103.155b₃ =  348.54     (16.7.2a)
 −87.468b₁ − 103.155b₂ + 190.624b₃ = −484.92

The sum of the numbers in each of the four columns above adds to zero, apart from rounding errors. This is a useful check.
apart from rounding errors. This is a useful check.
In previous analyses of 2-way tables in this book, we have usually assumed Σbⱼ = 0. In solving these equations it is easier to assume b₃ = 0. (This gives exactly the same results for any comparison among the b's.) Drop b₃ from the first two equations and drop the third equation, solving the equations

151.505b₁ −  64.036b₂ = 136.35
−64.036b₁ + 167.191b₂ = 348.54

The inverse of the 2 x 2 matrix (section 13.4) is

( 0.0078753   0.0030163 )
( 0.0030163   0.0071365 )     (16.7.3)

giving

b₁ = 2.1251;   b₂ = 2.8986;   b₃ = 0   (16.7.4)

The sum of squares for columns, adjusted for rows, is given by the sum of products of the b's with the right sides of equations (16.7.2a):

Column S.S. (adjusted) = (2.1251)(136.35) + (2.8986)(348.54) = 1,300

The analysis of variance can now be completed and the Interactions S.S. tested. Compute the S.S. Between sub-classes and the unadjusted S.S. for Rows and Columns, these being, respectively,

ΣΣXᵢⱼ.²/nᵢⱼ − C;   ΣXᵢ··²/Nᵢ. − C;   ΣX·ⱼ·²/N.ⱼ − C,

where C = X···²/N... The results are shown in the top half of table 16.7.2.
16.7.2.
In the completed analysis of variance, the S.S. Between sub-classes, 1,786, can be partitioned either into

Rows S.S. (unadjusted) + Columns S.S. (adjusted) + Interactions S.S.

or into

Rows S.S. (adjusted) + Columns S.S. (unadjusted) + Interactions S.S.

Since we now know that Rows S.S. (unadjusted) = 309 and Columns S.S. (adjusted) = 1,300, the first of these relations gives the Interactions S.S. as

1,786 − 309 − 1,300 = 177

The d.f. are (r − 1)(c − 1) = 4 in this example. The second relation provides the Rows S.S. (adjusted). The completed analysis of variance appears in the lower half of table 16.7.2.
TABLE 16.7.2
ANALYSIS OF VARIANCE OF THE MICE DATA

Preliminary (Unadjusted)

Source of Variation          Degrees of Freedom    Sum of Squares

Between sub-classes                  8                 1,786
Rows (Organisms)                     2                   309
Columns (Strains)                    2                 1,227

Completed

Source of Variation              Degrees of    Sum of Squares    Mean Square
                                  Freedom

Rows (Organisms), unadjusted         2             309 }
Columns (Strains), adjusted          2           1,300 } 1,609       650.0
Rows (Organisms), adjusted           2             382 }             191.0
Columns (Strains), unadjusted        2           1,227 } 1,609
Interactions                         4             177                44.2
Within sub-classes                 774                                 5.015
As in the approximate analyses, interactions are shown to be present so that ordinarily the analysis would not be carried further; the data would be interpreted as in section 16.2. But to illustrate the computations we proceed as though there were no interaction. The mean squares for F-tests of the main effects of rows and columns are the adjusted mean squares in table 16.7.2.

The standard error of any comparison ΣLⱼbⱼ among the column main effects is

s√(ΣLⱼ²cⱼⱼ + 2ΣLⱼLₖcⱼₖ),

where s = √5.015 = 2.24 and the cⱼₖ are the inverse multipliers in (16.7.3). Since b₃ was arbitrarily made 0, all c₃ₖ are 0. As examples,

S.E.(b₁ − b₂) = 2.24√[0.00788 + 0.00714 − 2(0.00302)] = ±0.212
S.E.(b₁ − b₃) = 2.24√0.00788 = ±0.199

The row main effects can be obtained from (16.7.1), rewritten as

m + aᵢ = X̄ᵢ·· − pᵢ₁b₁ − ... − pᵢ_c b_c   (16.7.5)

In table 16.7.1, the X̄ᵢ·· are in the right-hand column and the bⱼ are at the foot of each column. Relation (16.7.5) gives

m + a₁ = 3.929 − (0.34694)(2.1251) − (0.31633)(2.8986) = 2.275

Similarly, we find

m + a₂ = 4.186;   m + a₃ = 4.463

From (16.7.5) any comparison ΣLᵢ(m + aᵢ) among the row means is of the form

ΣLᵢX̄ᵢ·· − Σuⱼbⱼ,

where uⱼ = ΣLᵢpᵢⱼ. To find the variance of this comparison, multiply s² by

ΣLᵢ²/Nᵢ. + Σuⱼ²cⱼⱼ + 2Σuⱼuₖcⱼₖ   (j < k)

For example, the difference a₂ − a₁ = 1.911 is

X̄₂·· − X̄₁·· + 0.0901b₁ + 0.0128b₂

The multiplier of s² is, therefore,

1/98 + 1/257 + (0.0901)²(0.00788) + (0.0128)²(0.00714)
     + 2(0.0901)(0.0128)(0.00302) = 0.01417

The s.e. is ±√(5.015)(0.01417) = ±0.266.

In the original data the overall mean is X̄··· = 4,350/783 = 5.556 (table 16.7.1). Our three estimated row means are all less than 5.556. This is a consequence of the choice of b₃ = 0 to simplify the arithmetic. Although this choice has no effect on any comparison among the row means m + aᵢ or the column means m + bⱼ, it is sometimes desirable to adjust the m + aᵢ and the m + bⱼ so that m becomes X̄···. To do this, calculate the weighted mean of the m + aᵢ with weights Nᵢ.; that is,

[(98)(2.275) + (257)(4.186) + (428)(4.463)]/783 = 4.098

Since X̄··· = 5.556, we add +1.458 to each m + aᵢ, giving 3.733, 5.644, and 5.921 for the row means. To make the column means average in the same way to the general mean, compute these means as X̄··· + bⱼ − 1.458, giving the values 6.223, 6.997, 4.098.
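On a computer the normal equations are solved directly. The sketch below is our own (the cell totals are reconstructed as nᵢⱼ times the cell means of table 16.2.1); it reproduces the b's, the row means m + aᵢ, and the adjusted Column S.S.

    import numpy as np

    n = np.array([[34, 31, 33], [66, 78, 113], [107, 133, 188]], float)
    T = np.array([[136, 125, 124], [426, 529, 487], [709, 1038, 776]],
                 float)                       # cell totals X_ij.

    Ni, Nj = n.sum(axis=1), n.sum(axis=0)
    Xi, Xj = T.sum(axis=1), T.sum(axis=0)
    p = n / Ni[:, None]                       # p_ij = n_ij / N_i.

    # equations for the b's after eliminating m and the a's; set b3 = 0
    A = np.diag(Nj) - n.T @ p
    rhs = Xj - p.T @ Xi
    b = np.zeros(3)
    b[:2] = np.linalg.solve(A[:2, :2], rhs[:2])
    m_plus_a = Xi / Ni - p @ b
    print(np.round(b, 4), np.round(m_plus_a, 3), round(b @ rhs))
    # [2.1251 2.8986 0.]  [2.275 4.186 4.463]  1300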
In a 3-way classification the exact methods naturally become more complicated. There are now three two-factor interactions and one three-factor interaction. An example worked in detail is given by Stevens (8). The exact analysis of variance can still be computed by elementary methods if the sub-class numbers are proportional, that is, if

nᵢⱼₖ = (Nᵢ··)(N·ⱼ·)(N··ₖ)/N···²

Federer and Zelen (9) present exact methods for computing the sum of squares due to any main effect or interaction, assuming all other effects present. They also describe simpler methods that provide close upper bounds to these sums of squares. Their methods are valid for any number of classes.
EXAMPLE 16.7.1-In the farm tenancy example in section 16.4 there was no evidence of interaction. The following are the least squares estimates of the main effect means for tenure and soils.

Owner: 32.507     Renter: 51.072     Mixed: 45.031
I: 48.157         II: 46.999         III: 40.480

Your results may differ a little, depending on the number of decimals carried. The results above were adjusted so that ΣNᵢ.aᵢ = ΣN.ⱼbⱼ = 0. Note the excellent agreement given by the means shown under table 16.4.2 for the method of proportional numbers.

EXAMPLE 16.7.2-In the mice data verify the following estimates and standard errors as given by the use of equal numbers within rows (section 16.3) and the least squares analysis (section 16.7).

              Equal Within Rows      Least Squares

11C − 9D       1.919 ± 0.265         1.911 ± 0.266
Z − RI         0.755 ± 0.196         0.774 ± 0.212

16.8-The analysis of proportions in 2-way tables. In chapter 9 we discussed methods of analysis for a binomial proportion. Sections 9.8-9.11 dealt with a set of C proportions arranged in a one-way classification. Two-way tables in which the entry in every cell is a sample proportion are also common. Examples are sample survey results giving the percentage of voters who stated their intention to vote Democratic, classified by the age and income level of the voter, or a study of the proportion of patients with blood group O in a large hospital, classified by sex and type of illness.

The data consist of rc independent values of a binomial proportion pᵢⱼ = gᵢⱼ/nᵢⱼ, arranged in r rows and c columns. The data resemble those in the preceding section, but instead of having a sample of continuous measurements Xᵢⱼₖ (k = 1, 2, ..., nᵢⱼ) in the (i, j) cell, we have a binomial proportion pᵢⱼ. The questions of interest are usually the same in the binomial and the continuous cases. We want to examine whether row and column effects are additive, and if so, to estimate them and make comparisons among rows and among columns. If interactions are present, the nature of the interactions is studied.

From the viewpoint of theory, the analysis of proportions presents more difficulties than that of normally distributed continuous variables. Few exact results are available. The approximate methods used in practice mostly depend on one of the following approaches.

1. Regard pᵢⱼ as a normally distributed variable with variance pᵢⱼqᵢⱼ/nᵢⱼ, using the weighted methods of analysis in preceding sections, with weights wᵢⱼ = nᵢⱼ/(pᵢⱼqᵢⱼ) and pᵢⱼ replacing X̄ᵢⱼ..
2. Transform the pᵢⱼ to equivalent angles Yᵢⱼ (section 11.16), and treat the Yᵢⱼ as normally distributed. Since the variance of Y for any pᵢⱼ is approximately 821/nᵢⱼ, this method has the advantage that if the nᵢⱼ are constant, the analysis of variance of the Yᵢⱼ is unweighted. As we have seen, this transformation is frequently used in randomized blocks experiments in which the measurement is a proportion.

3. Transform pᵢⱼ to its logit Yᵢⱼ = logₑ(pᵢⱼ/qᵢⱼ). The estimated variance of Yᵢⱼ is approximately 1/(nᵢⱼpᵢⱼqᵢⱼ), so that in a logit analysis, Yᵢⱼ is given a weight nᵢⱼpᵢⱼqᵢⱼ.
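For a single cell the three working scales and their weights can be sketched as follows (our own illustration; 821/n and 1/(npq) are the approximate variances quoted above):

    import math

    def working_scales(g, n):
        """Proportion g/n on the p, angular, and logit scales,
        each with its weight (reciprocal of the approximate variance)."""
        p = g / n
        q = 1 - p
        return {'p':     (p, n / (p * q)),
                'angle': (math.degrees(math.asin(math.sqrt(p))), n / 821),
                'logit': (math.log(p / q), n * p * q)}

    print(working_scales(156, 240))   # p = 0.65: angle 53.7, logit 0.619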
The assumptions involved in these approaches probably introduce
little error in the conclusions if the observed numbers of successes and
failures, nijPij and nijqij' exceed 20 in every cell. Various smaJJ-sampJe
adjustments have been prepared to extend the validity of the methods.
When all Pi; lie between 25~·~ and 75~~, the results given by the three
approaches seldom differ materially. If, however, the Pij cover a wide
range from close to zero up to 50% or beyond, there arc reasons for ex-
pecting that row and column effects are more likely to be additive on 8
logit scale than on the originalp scale. To repeat an example cited In sec-
tion 9.14, suppose that the data are the proportions of cases in which the
driver of the car suffered injury in automobile accidents classified by
severity of impact (rows) and by whether the driver wore a seat belt or not
(COlumns). Under very mild impacts P is likely to be close to zero for both
wearers and non-wearers, with little if any difference between tho two
columns. At the other end, under extreme impacts, P will be near 100%
whether a seat belt was worn or not, witl) again a small column effect.
The beneficial effect of the belts, if any, will be revealed by the accidents
that show intermediate proportions of injuries. The situation is familiar
in biological assay in which the toxic or protective effects of dlfTere(lt agents
are being compared. It is well known that two agents cannot be etlec-
tively compared at concentrations for which p is close to zero or 100',%',;
instead, the investigator aims at concentrations yielding p around 50%.
Thus, in the scale of p, row and column effects cannot be strictly addi-
tive over the whole range. The logit transformation pulls out the scale
near 0 and 100%, so that the scale extends from -infinity to +infinity. In the
logit analysis row and column effects may be additive, whereas in the p
scale for the same data we might have interactions that are entirely a con-
sequence of the scale. The angular transformation occupies an inter-
mediate position. As with logits, the scale is stretched at the ends, but the
total range remains finite, from 0° to 90°.
To summarize, with an analysis in the original scale it is easier to
think about the meaning and practical importance of effects in this scale.
The advantage of angles is the simplicity of the computations if the n_ij are
equal or nearly so. Logits may permit an additive model to be used in
tables showing large effects. In succeeding sections some examples will
be given to illustrate the computations for analyses in the original and the
logit scales.
The preceding analyses utilize observed weights, the weight W = n/(pq)
attached to the proportion p in a cell being computed from the observed
value of p. Instead, when fitting the additive model we could use expected
weights Ŵ = n/(p̂q̂), where p̂ is the estimate given by the additive model.
This approach involves a process of successive approximations. We guess
first approximations to the weights and fit the model, obtaining second
approximations to the p̂. From these, the weights are recomputed and the
model fitted again, giving third approximations to the p̂ and the Ŵ, and
so on until no appreciable change occurs in the results.
This series of calculations may be shown to give successive approxi-
mations to maximum likelihood estimates of the p (11). When np and
nq are large in the cells, analyses by observed and expected weights agree
closely. In small samples it is not yet clear that either method has a con-
sistent advantage. Observed weights require less computation.
A word of caution: we are assuming that in any cell there is a single
binomial proportion. Sometimes the data in a cell come from several
binomials with different p's. In a study of absenteeism among clerical
workers, classified by age and sex, the basic measurement might be the
proportion of working days in a year on which the employee was absent.
But the data in a cell, e.g., men aged 20-25, might come from 18 different
men who fall into this cell. In this event the basic variable is p_ijk, the pro-
portion of days absent for the kth man in the i, j cell. Usually it is ade-
quate to regard p_ijk as a continuous variate, performing the analysis
by the methods in preceding sections.
16.9-Analysis in the p scale: a 2 x 2 table. In this and the next
section, two examples are given to illustrate situations in which direct
analysis of the proportions is satisfactory. Table 16.9.1 shows data cited
by Bartlett (12) from an experiment in which root cuttings of plum trees
were planted as a means of propagation. The factors are length of cutting
(rows) and whether planting was done at once or in spring (columns).

TABLE 16.9.1
PERCENTAGES OF SURVIVING PLUM ROOT-STOCKS FROM 240 CUTTINGS

Length of                     Time of Planting
Cutting           At Once                            Spring

Long     p11 = 156/240 = 65.0%             p12 = 84/240 = 35.0%
         v11 = (65.0)(35.0)/240 = 9.48     v12 = (35.0)(65.0)/240 = 9.48
Short    p21 = 107/240 = 44.6%             p22 = 31/240 = 12.9%
         v21 = (44.6)(55.4)/240 = 10.30    v22 = (12.9)(87.1)/240 = 4.68

In the (1, 1) cell, 156 plants survived out of 240, giving p11 = 65.0%.
The estimated variances v for each p are also shown.
The analysis resembles that of section 16.5, the p_ij replacing the X_ij.
To test for interaction we compute
p11 + p22 - p12 - p21 = 65.0 + 12.9 - 35.0 - 44.6 = -1.7%
Its standard error is
√(v11 + v22 + v12 + v21) = √33.94 = ±5.83
Since there is no indication of interaction, the calculation of row and col-
umn effects proceeds as in table 16.9.2. For the column difference in row
1, the variance is (v11 + v12) = 18.96. The overall column difference is a
weighted mean of the differences in the two rows, weighted inversely as
the estimated variances. Both main effects are large relative to their stan-
dard errors. Clearly, long cuttings planted at once have the best survival
rate.

TABLE 16.9.2
CALCULATION OF ROW AND COLUMN EFFECTS

               At Once                    Spring                  D       V       W

Long     p11 = 65.0  v11 = 9.48     p12 = 35.0  v12 = 9.48     30.0   18.96   0.0527
Short    p21 = 44.6  v21 = 10.30    p22 = 12.9  v22 = 4.68     31.7   14.98   0.0668

         D = 20.4   V = 19.78       D = 22.1   V = 14.16
         W = 0.0506                 W = 0.0706

Main Effects:
At Once - Spring:  ΣWD/ΣW = 31.0%   S.E. = 1/√(ΣW) = ±2.89
Long - Short:      ΣWD/ΣW = 21.4%   S.E. = 1/√(ΣW) = ±2.87
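A minimal Python sketch of the computations in tables 16.9.1 and 16.9.2 follows; the variable names are ours, and small discrepancies from the text's figures come from its two-decimal arithmetic.

import math

n = 240
g = {(1, 1): 156, (1, 2): 84, (2, 1): 107, (2, 2): 31}   # table 16.9.1
p = {k: 100 * v / n for k, v in g.items()}               # per cent surviving
v = {k: p[k] * (100 - p[k]) / n for k in p}              # variance of each p

# Test for interaction: p11 + p22 - p12 - p21 and its standard error
contrast = p[1, 1] + p[2, 2] - p[1, 2] - p[2, 1]
se = math.sqrt(sum(v.values()))
print(f"interaction {contrast:.1f} +- {se:.2f}")         # -1.7 +- 5.83

# Column main effect: row differences weighted inversely as their variances
d = [p[1, 1] - p[1, 2], p[2, 1] - p[2, 2]]
w = [1 / (v[1, 1] + v[1, 2]), 1 / (v[2, 1] + v[2, 2])]
mean = sum(wi * di for wi, di in zip(w, d)) / sum(w)
print(f"At Once - Spring: {mean:.1f} +- {1 / math.sqrt(sum(w)):.2f}")
# about 31.0 +- 2.89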

In Bartlett's original paper (12), these data were used to demon-
strate how to test for interaction in the logit scale. (He regarded the
data as a 2 x 2 x 2 contingency table and was testing the three-factor
interaction among the factors: alive-dead, long-short, at once-spring.)
However, the data show no sign of interaction either in the p or the logit
scale.
16.10-Analysis in the p scale: a 3 x 2 table. In the second example,
inspection of the individual proportions indicates interactions that are
due to the nature of the factors and would not be removed by a logit
transformation. The data are the proportions of children with emotional
problems in a study of family medical care (13), classified by the number of
children in the family and by whether both, one, or no parents were re-
corded as having emotional problems, as shown in table 16.10.1.
In families having one or no parents with emotional problems the
four values of p are close to 0.3, any differences being easily accountable
by sampling errors. Thus there is no sign of an effect of number of children
or of the parents' status when neither or only one parent has emotional
problems. When both parents have problems there is a marked increase
in p in the smaller families to 0.579 and a modest increase in the larger

TABLE 16.10.1
hOPDRTION OF CHILDltEN WITH EMOTIONAL PkOSLEMS

Number of Children in Family


Parents With
Problems 1-2
Both p - 33/57 ~ 0.579 p ~ 15/38 ~ 0.395
One p - 18/54 _ 0.333 p - 17/55 - 0.309
None p - I 0/37 ~ 0.270 p - 9/32 - 0.281

families to 0.395. Thus, inspection suggests that the proportion of chil-
dren with emotional problems is increased when both parents have prob-
lems, and that this increase is reduced in the larger families.
Consequently, the statistical analysis would probably involve little
more than tests of the differences (0.579 - 0.333), (0.395 - 0.309), and
(0.579 - 0.395), which require no new methods. The first difference is
significant at the 5% level but the other two are not, so that any conclusions
must be tentative. In data of this type nothing seems to be gained by
transformation to logits. Reference (13) presents additional data bearing
on the scientific issue.
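For readers who wish to reproduce these tests, a sketch of the normal-deviate comparison of two independent binomial proportions follows; the helper name is ours, and the book itself simply applies the methods of earlier chapters.

import math

def z_diff(g1, n1, g2, n2):
    """Normal-deviate test for the difference of two independent
    binomial proportions, unpooled variances (a sketch)."""
    p1, p2 = g1 / n1, g2 / n2
    se = math.sqrt(p1 * (1 - p1) / n1 + p2 * (1 - p2) / n2)
    return (p1 - p2) / se

print(z_diff(33, 57, 18, 54))   # 0.579 vs 0.333: about 2.7, significant at 5%
print(z_diff(15, 38, 17, 55))   # 0.395 vs 0.309: about 0.8, not significant
print(z_diff(33, 57, 15, 38))   # 0.579 vs 0.395: about 1.8, not significant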
16.11-Analysis of logits in an R x C table. When the fitting of an
additive model in the logit scale is appropriate, the following procedure
should be an adequate approximation:
1. If p is a binomial proportion obtained from g successes out of n
trials in a typical cell of the two-way table, calculate the logit as

Y = ln{(g + 1/2)/(n - g + 1/2)}

in each cell, where ln denotes the log to base e.
2. Assign to the logit a weight W = (g + 1/2)(n - g + 1/2)/(n + 1).
In large samples, with all values of g and (n - g) exceeding 30, Y will be
essentially ln(p/q) and the weight npq, which may be used if preferred. The
values suggested here for Y and W in small samples are based on research
by Gart and Zweifel (14). See example 16.12.3 and the sketch following this list.
3. Then follow the method of fitting described for continuous data
in section 16.7, with Y_ij in place of X_ij, and with W_ij in place of n_ij as
weights. Section 16.7 should be studied carefully.
4. The analysis of variance of Y is like table 16.7.2, but has no "With-
in sub-classes" line. If the additive model fits, the Interactions sum of
squares is distributed approximately as χ² with (r - 1)(c - 1) df. A
significant value of χ² is a sign that the model does not fit. This test should
be supplemented by inspection of the deviations Y_ij - Ŷ_ij to note any syste-
matic pattern that suggests that the model distorts the data.
5. If the model fits and the inverse multipliers c_jk have been computed
for the columns, the s.e. of any linear function ΣL_j b_j of the column main effects is

√(Σ L_j² c_jj + 2 Σ_{j<k} L_j L_k c_jk)
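The adjusted logit and weight of steps 1 and 2 are easily computed; a sketch follows (the function name is ours).

import math

def adjusted_logit(g, n):
    """Logit and weight with the 1/2 corrections of Gart and Zweifel (14),
    a direct transcription of steps 1 and 2 as we read them."""
    y = math.log((g + 0.5) / (n - g + 0.5))
    w = (g + 0.5) * (n - g + 0.5) / (n + 1)
    return y, w

print(adjusted_logit(3, 10))
# Y = -0.762, W = 2.386, so variance 1/W = 0.419; cf. example 16.12.3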
In the numerical example which follows, the proportions p are all
small, the largest being 0.056. In this event, the logit of p is practically
the same as ln(p). In effect, we are fitting an additive model to the loga-
rithms of the p's, i.e., a multiplicative model to the p's themselves. Further,
with large samples the observed weight W = npq becomes W = np = g
when p is small, each logit being weighted by the numerator of p.
16.12-Numerical example. The data come from a large study of the
relationship between smoking and death rates (15). About 248,000 male
policyholders of U.S. Government Life Insurance answered questions
by mail about their smoking habits. The data examined here are for men
who reported themselves as non-smokers and for men who reported that
they smoked cigarettes only. The cigarette smokers are classified by num-
ber smoked per day: 1-9, 10-20, 21-39, and over 39. For each smoking
class, the person-years of exposure were accumulated by 10-year age
classes, using actual ages. That is, a man aged 52 on entry into the study
would contribute 3 years in the 45-54 age class and additional years in
the 55-64 age class. Most men were in the study for 8 1/2 years.
In table 16.12.1, part (A) shows for each cell the number of deaths.
Part (B) gives the annual probability of death (x 10³) within each cell,
calculated from the number of deaths and the number of person-years of
exposure. Since the age distributions of different smoking classes were
not identical within a 10-year age class, the probabilities were computed,
by standard actuarial methods, so as to remove any effect of these dif-
ferences in age-distributions.

TABLE 16.12.1
NUMBEllS 0fI D!Ants AND ANNUAL PROBAIIiLITIES OF DEATH (x 101)

Age Reported Num'ocr of Cigarettes Smok.ed. Per Day


(Yea...) None 1-9 10-20 21-39 Over 39

(A)_.ldeallu
35-44 41 1 90 83 10
4S.~54 38 11 61 80 14
5S-64 2,611 389 2,111 1,656 406
65-74 3,728 586 2,458 1,416 258

(8) antaMJl probabilities of death (')( 10l)


35-44 1.21 1.63 1.99 Z.66 3.26
45-54 2.64 6.23 6.64 S~91 11.60
5S-64 10.56 14.35 IS.50 20.S7 27.40
65-74 24.11 35.76 42.26 49.40 55.91

In every age group the probability of death rises sharply with each
additional amount smoked. As expected, the probability also increases
consistently with age within every smoking class. It is of interest to exam-
ine whether the rate of increase in probability of death for each additional
amount smoked is constant at the different ages or changes from age to age.
If the rate of increase is constant, this implies a simple multiplicative model
for row and column effects: apart from sampling errors, the probability
p_ij for men in the ith age class and jth smoking class is of the form

p_ij = α_i β_j

In natural logs this gives the additive model

ln(p_ij) = ln α_i + ln β_j
Before attempting to fit this model it may be well to compute for
each age group the ratio of the smoker to the non-smoker probabilities of
death (table 16.12.2) to see if the data seem to follow the model.

TABLE 16.12.2
RATIOS OF SMOKER TO NON-SMOKER PROBABILITIES OF DEATH

                Reported Number Smoked Per Day
Age         1-9      10-20     21-39    Over 39

35-44      1.28      1.57      2.09      2.57
45-54      2.36      2.51      3.37      4.39
55-64      1.36      1.75      1.98      2.59
65-74      1.48      1.75      2.05      2.32

The ratios agree fairly well for age groups 35-44, 55-64, and 65-74,
but run substantially higher in age group 45-54. This comparison is an
imprecise one, however, since the probabilities that provide the denomi-
nators for the ages 35-44 and 45-54 are based on only 47 and 38 deaths,
respectively. A stabler comparison is given by finding in each row the
simple average of the five probabilities and using this as denominator for
the row. This comparison (example 16.12.1) indicates that the non-
smoker probability of death may have been unusually low in the age
group 45-54.
Omitting the multiplier 10³, the p values in table 16.12.1 range from
0.00127 to 0.05591. The assumption that these p's are binomial is not
strictly correct. Within an individual cell the probability of dying presum-
ably varies somewhat from man to man. This variation makes the vari-
ance of p for the cell less than the binomial variance (see example 16.12.2),
but with small p's the difference is likely to be negligible. Further, as
already mentioned, the p's were adjusted in order to remove any differ-
ence in age distribution within a 10-year class. Assuming the p's binomial,
each ln p is weighted by the observed number of deaths in the cell, as
pointed out at the end of the preceding section.
The model is

Y_ij = μ + α_i + β_j + ε_ij,

where the ε_ij are independent with means zero and variances 1/W_ij. The
fitted model is

Ŷ_ij = m + a_i + b_j,

the parameters being chosen so as to minimize ΣW(Y - Ŷ)².

TABLE 16.12.3
ARRANGEMENT OF DATA FOR FITTING AN ADDITIVE MODEL

                 Reported Number of Cigarettes Per Day
Age          None      1-9    10-20    21-39  Over 39

35-44  W1j*    47        7       90       83       10         237 = W1.
       Y1j†  0.239    0.489    0.688    0.978    1.182     169.570 = Y1.
45-54  W2j     38       11       67       80       14         210
       Y2j   0.971    1.829    1.893    2.187    2.451     393.122
55-64  W3j  2,617      389    2,117    1,656      406       7,185
       Y3j   2.357    2.664    2.918    3.038    3.310  19,756.759
65-74  W4j  3,728      586    2,458    1,416      258       8,446
       Y4j   3.183    3.577    3.744    3.900    4.024  29,725.690

W.j         6,430      993    4,732    3,235      688      16,078 = W..
Ȳ.j         2.812    3.178    3.290    3.341    3.529  50,045.141 = Y..
                                                            3.1126 = Ȳ..

* W_ij = cell weight = number of deaths.
† Y_ij = ln(p_ij).

Table 16.12.3 shows the weights W_ij = number of deaths and the
Y_ij = ln(p_ij). The first step is to find the row and column totals of the
weights, and the weighted row and column totals of the Y_ij, namely

W_i. = Σ_j W_ij;   W_.j = Σ_i W_ij
Y_i. = Σ_j W_ij Y_ij;   Y_.j = Σ_i W_ij Y_ij
W.. = Σ_i W_i.;   Y.. = Σ_i Y_i.
If we make the usual restrictions,

Σ_i W_i. a_i = Σ_j W_.j b_j = 0,

then m is the overall mean Y../W.. = Ȳ.. = 3.1126. Analogous to (16.7.1)
and (16.7.2), the normal equations for a_i and b_j are

W_i.(m + a_i) + W_i1 b_1 + W_i2 b_2 + ... + W_ic b_c = Y_i.   (16.12.1)
W_.j(m + b_j) + W_1j a_1 + W_2j a_2 + ... + W_rj a_r = Y_.j   (16.12.2)

Since we are not interested in attaching standard errors to the a_i or
b_j, these equations will be solved directly by successive approximations.
As first approximations to the quantities (m + b_j) we use the observed
column means Ȳ.j = Y.j/W.j, shown in table 16.12.3. Rewriting equation
(16.12.1) in the form

W_i.(m + a_i) = Y_i. + W_i. Ȳ.. - W_i1(m + b_1) - ... - W_ic(m + b_c)

we obtain second approximations to the (m + a_i). For row 1,

237(m + a_1) = 169.570 + (237)(3.1126) - (47)(2.812) - ... - (10)(3.529)
(m + a_1) = 144.153/237 = 0.608

These are then inserted in (16.12.2) in the form

W_.j(m + b_j) = Y_.j + W_.j Ȳ.. - W_1j(m + a_1) - ... - W_rj(m + a_r)
and so on. The estimates settle down quickly. After three rounds the
following estimates were obtained:

Ages           35-44     45-54     55-64     65-74
m + a_i       0.5748    1.7130    2.7193    3.5538

No. per Day     None       1-9     10-20     21-39   Over 39
m + b_j       2.7433    3.1053    3.3052    3.4492    3.6612
As a check, at each stage the quantities ΣW_i.(m + a_i) and
ΣW_.j(m + b_j) should agree with the grand total Y.. to within rounding
errors.
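The whole cycle of successive approximations is compact enough to sketch in code. The Python fragment below, with the weights and Y_ij of table 16.12.3 typed in, reproduces the estimates just given, the χ² test of fit discussed below, and the mortality ratios quoted at the end of the section; the variable names are ours.

import math

W = [[47, 7, 90, 83, 10],
     [38, 11, 67, 80, 14],
     [2617, 389, 2117, 1656, 406],
     [3728, 586, 2458, 1416, 258]]
Y = [[0.239, 0.489, 0.688, 0.978, 1.182],
     [0.971, 1.829, 1.893, 2.187, 2.451],
     [2.357, 2.664, 2.918, 3.038, 3.310],
     [3.183, 3.577, 3.744, 3.900, 4.024]]
r, c = 4, 5

Wr = [sum(W[i]) for i in range(r)]                                   # W_i.
Wc = [sum(W[i][j] for i in range(r)) for j in range(c)]              # W_.j
Yr = [sum(W[i][j] * Y[i][j] for j in range(c)) for i in range(r)]    # Y_i.
Yc = [sum(W[i][j] * Y[i][j] for i in range(r)) for j in range(c)]    # Y_.j
Ybar = sum(Yr) / sum(Wr)                                             # 3.1126

# Successive approximations, starting from the column means as (m + b_j)
mb = [Yc[j] / Wc[j] for j in range(c)]
for _ in range(3):
    ma = [(Yr[i] + Wr[i] * Ybar
           - sum(W[i][j] * mb[j] for j in range(c))) / Wr[i] for i in range(r)]
    mb = [(Yc[j] + Wc[j] * Ybar
           - sum(W[i][j] * ma[i] for i in range(r))) / Wc[j] for j in range(c)]

print([round(x, 4) for x in ma])   # m + a_i: about 0.5748 ... 3.5538
print([round(x, 4) for x in mb])   # m + b_j: about 2.7433 ... 3.6612

chi2 = sum(W[i][j] * (Y[i][j] - (ma[i] + mb[j] - Ybar)) ** 2
           for i in range(r) for j in range(c))
print(round(chi2, 1))              # about 13.2, with 12 df
print([round(math.exp(mb[j] - mb[0]), 2) for j in range(1, c)])
                                   # ratios: about 1.44, 1.75, 2.03, 2.50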
The expected value in each cell is conveniently computed as

Ŷ_ij = (m + a_i) + (m + b_j) - Ȳ..
Table 16.12.4 shows the observed and expected values and the devia-
tions. The value of χ² = ΣW_ij(Y_ij - Ŷ_ij)² is 13.2 with 12 df, giving no
indication of a lack of fit. The largest deviation is the deficit -0.373 for
non-smokers aged 45-54; this deviation also makes the largest contribu-
tion to χ². The pattern of + and - signs in the deviations has no striking
features.
By finding the antilogs of the quantities (b_j - b_1), the ratios of the
smoker to the non-smoker annual probabilities of death as given by this
model are obtained. These ratios were 1.44, 1.75, 2.03, and 2.50, respec-
tively, for smokers of 1-9, 10-20, 21-39, and over 39 cigarettes per day.
An example of the analysis of a proportion in a 2⁴ factorial classifica-
tion with only main effects important is given by Yates (16) using the logit
scale and observed weights. Dyke and Patterson (17) give the maximum
likelihood analysis of the same data. These authors define the logit as
(1/2)ln(p/q).
Data containing a proportion in an R x C table may be regarded as
an R x C x 2 contingency table, or as a particular case of an R x C x T
contingency table. The definition and testing of three-factor interactions
TABLE 16.12.4
OBSERVED AND EXPECTED VALUES OF ln p

              Reported Number of Cigarettes Per Day
Age             None     1-9     10-20    21-39   Over 39

35-44  Y_ij    0.239   0.489    0.688    0.978    1.182
       Ŷ_ij    0.206   0.568    0.767    0.911    1.123
       D_ij   +0.033  -0.079   -0.079   +0.067   +0.059

45-54  Y_ij    0.971   1.829    1.893    2.187    2.451
       Ŷ_ij    1.344   1.706    1.906    2.050    2.262
       D_ij   -0.373  +0.123   -0.013   +0.137   +0.189

55-64  Y_ij    2.357   2.664    2.918    3.038    3.310
       Ŷ_ij    2.350   2.712    2.912    3.056    3.268
       D_ij   +0.007  -0.048   +0.006   -0.018   +0.042

65-74  Y_ij    3.183   3.577    3.744    3.900    4.024
       Ŷ_ij    3.184   3.546    3.746    3.890    4.102
       D_ij   -0.001  +0.031   -0.002   +0.010   -0.078

in such tables has attracted much attention in recent years; Goodman
(18) gives a review and some simple computing methods.
EXAMPLE 16.12.1-In each row of table 16.12.1 find the unweighted average of the
probabilities and divide the individual probabilities by this number. Show that the results
are as follows:

Age         None     1-9     10-20    21-39   Over 39

35-44        .59     .75      .92     1.23     1.51
45-54        .37     .86      .92     1.24     1.61
55-64        .58     .78     1.01     1.14     1.49
65-74        .58     .86     1.02     1.19     1.35

The two numbers that seem most out of line are the low value 0.37 for (None, 45-54)
and the low value 1.35 for (Over 39, 65-74).
EXAMPLE 16.12.2-Suppose that there are three groups of n men, with probabilities
of dying 0.01, 0.02, and 0.03. The variance of the total number who die is

n[(.01)(.99) + (.02)(.98) + (.03)(.97)] = 0.0586n

Hence, the variance of the proportion of those dying out of 3n is 0.0586n/9n² = 0.00651/n.
For the combined sample, the probability of dying is 0.02. If we wrongly regard the com-
bined sample as a single binomial of size 3n with p = 0.02, we would compute the variance
of the proportion dying as (0.02)(0.98)/3n = 0.00653/n. The actual variance is just a trifle
smaller than the binomial variance.
If there are k groups of n men with probabilities p_1, p_2, ..., p_k, show that the relation be-
tween the actual and the binomial variance of the overall proportion dying is

V_act = V_bin - Σ(p_i - p̄)²/(nk²)
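For readers who want the missing step, a short derivation of this relation in LaTeX notation follows (our notation; k independent groups of n men with probabilities p_1, ..., p_k):

\[
V_{\mathrm{act}}=\frac{n\sum p_iq_i}{(kn)^2}=\frac{\sum p_iq_i}{k^2n},
\qquad
V_{\mathrm{bin}}=\frac{\bar p\,\bar q}{kn},
\qquad \bar p=\frac{1}{k}\sum p_i .
\]
Writing
\(\sum p_iq_i=\sum p_i-\sum p_i^2
   =k\bar p-\bigl\{\textstyle\sum(p_i-\bar p)^2+k\bar p^{\,2}\bigr\}
   =k\bar p\,\bar q-\sum(p_i-\bar p)^2\)
gives
\[
V_{\mathrm{act}}
 =\frac{k\bar p\,\bar q-\sum(p_i-\bar p)^2}{k^2n}
 =V_{\mathrm{bin}}-\frac{\sum(p_i-\bar p)^2}{nk^2}.
\]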
EXAMPLE 16.12.3-In a sample of size n with population probability p, the true logit
is ln(p/q). The value Y = ln{(g + 1/2)/(n - g + 1/2)} is a relatively unbiased estimate of
ln(p/q) for expectations np and nq as low as 3. The weight W = (g + 1/2)(n - g + 1/2)/(n + 1)
corresponds to a variance

V = 1/W = 1/(g + 1/2) + 1/(n - g + 1/2)

The quantity V is an almost unbiased estimate of the population variance of Y in small
samples. As an illustration the values of the binomial probability P, and of Y and V are
shown below for each value of g when n = 10, p = 0.3.

g        P          Y         V        Y²
0      .0282     -3.046     2.095     9.278
1      .1211     -1.846     0.772     3.408
2      .2335     -1.224     0.518     1.498
3      .2668     -0.762     0.419     0.581
4      .2001     -0.367     0.376     0.135
5      .1029      0.000     0.364     0.000
6      .0368      0.367     0.376     0.135
7      .0090      0.762     0.419     0.581
8      .0014      1.224     0.518     1.498
9      .0001      1.846     0.772     3.408
10     .0000      3.046     2.095     9.278

The true logit is ln(0.3/0.7) = -0.8473. Verify that (i) the mean value of Y is -0.8497,
(ii) the variance of Y is 0.4968, (iii) the mean value of V is 0.5164, about 4% too large.
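A quick numerical check of these three statements (a sketch; math.comb requires Python 3.8 or later):

import math

n, p = 10, 0.3
mean_Y = mean_Y2 = mean_V = 0.0
for g in range(n + 1):
    P = math.comb(n, g) * p**g * (1 - p)**(n - g)   # binomial probability
    Y = math.log((g + 0.5) / (n - g + 0.5))          # adjusted logit
    V = 1 / (g + 0.5) + 1 / (n - g + 0.5)            # its estimated variance
    mean_Y += P * Y
    mean_Y2 += P * Y * Y
    mean_V += P * V

print(round(mean_Y, 4))                  # -0.8497, vs true logit -0.8473
print(round(mean_Y2 - mean_Y**2, 4))     # 0.4968
print(round(mean_V, 4))                  # 0.5164, about 4% too large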

REFERENCES
1. J. W. GOWEN. Amer. J. Hum. Genet., 4:285 (1952).
2. A. E. BRANDT. Ph.D. Thesis, Iowa State College (1932).
3. M. B. WILK and O. KEMPTHORNE. WADC Technical Report 55-244, Vol. II, Office
   of Technical Services, U.S. Dept. of Commerce, Washington, D.C. (1956).
4. N. STRAND and R. J. JESSEN. Iowa Agric. Exp. Sta. Res. Bul. 315 (1943).
5. G. W. SNEDECOR and W. R. BRENEMAN. Iowa State College J. Sci., 19:131 (1945).
6. B. BROWN. Proc. Iowa Acad. Sci., 38:205 (1932).
7. F. YATES. J. Amer. Stat. Ass., 29:51 (1934).
8. W. L. STEVENS. Biometrika, 35:346 (1948).
9. W. T. FEDERER and M. ZELEN. Biometrics, 22:525 (1966).
10. E. R. BECKER and P. R. HALL. Parasitology, 25:397 (1933).
11. W. G. COCHRAN. Ann. Math. Statist., 11:335 (1940).
12. M. S. BARTLETT. J. R. Statist. Soc. Supp., 2:248 (1935).
13. G. A. SILVER. Family Medical Care. Harvard University Press, Cambridge, Mass.,
    Table 59 (1963).
14. J. J. GART and J. R. ZWEIFEL. Biometrika, 54:181 (1967).
15. H. A. KAHN. Nat. Canc. Inst. Monograph, 19:1 (1966).
16. F. YATES. Sampling Methods for Censuses and Surveys. Charles Griffin, London,
    3rd ed., Section 9.7 (1960).
17. G. V. DYKE and H. D. PATTERSON. Biometrics, 8:1 (1952).
18. L. A. GOODMAN. J. Amer. Statist. Ass., 59:319 (1964).
CHAPTER SEVENTEEN

Design and analysis of sampling

17.1-Populations. In the 1908 paper in which he discovered the
t-test, "Student" opened with the following words: "Any experiment may
be regarded as forming an individual of a population of experiments which
might be performed under the same conditions. A series of experiments
is a sample drawn from this population.
"Now any series of experiments is only of value in so far as it enables
us to form a judgment as to the statistical constants of the population to
which the experiments belong."
From the previous chapters in this book, this way of looking at data
should now be familiar. The data obtained in an experiment are subject
to variation, so that an estimate made from the data is also subject to varia-
tion and is, hence, to some degree uncertain. You can visualize, however,
that if you could repeat the experiment many times, putting all the results
together, the estimate would ultimately settle down to some unchanging
value which may be called the true or definitive result of the experiment.
The purpose of the statistical analysis of an experiment is to reveal what
the data can tell about this true result. The tests of significance and
confidence limits which have appeared throughout this book are tools for
making statements about the population of experiments of which your
data are a sample.
In such problems the sample is concrete, but the population may
appear somewhat hypothetical. It is the population of experiments that
might be performed, under the same conditions, if you possessed the
necessary resources, time, and interest.
In this chapter we turn to situations in which the population is con-
crete and definite, and the problem is to obtain some desired information
about it. Examples are as follows:
Population                            Information Wanted
Ears of corn in a field               Average moisture content
Seeds in a large batch                Percentage germination
Water in a reservoir                  Concentration of certain bacteria
Third-grade children in a school      Average weight
If the population is small, it is sometimes convenient to obtain the
information by collecting the data for the whole of the population. More
frequently, time and money can be saved by measuring only a sample
drawn from the population. When the measurement is destructive, sam-
pling is of course unavoidable.
This chapter presents some methods for selecting a sample and for
estimating population characteristics from the data obtained in the
sample. During the past thirty years, sampling has come to be relied upon
by a great variety of agencies, including government bureaus, market
research organizations, and public opinion polls. Concurrently, much has
been learned both about the theory and practice of sampling, and a num-
ber of books devoted to sample survey methods have appeared (2, 3, 4, 5,
13). In this chapter we explain the general principles of sampling and
show how to handle some of the simpler problems that are common in
biological work. For more complex problems, references will be given.
17.2-A simple example. In the early chapters of this book, you drew
samples so as to examine the amount of variation in results from one
sample to another and to verify some important results in statistical
theory. The same method will illustrate modern ideas about the selection
of samples from given populations.
Suppose the population consists of N = 6 members, denoted by the
letters a to f. The six values of the quantity that is being measured are as
follows: a 1; b 2; c 4; d 6; e 7; f 16. The total for this population is 36.
A sample of three members is to be drawn in order to estimate this total.
One procedure already familiar to you is to write the letters a to f on
beans or slips of paper, mix them in some container, and draw out three
letters. In sample survey work, this method of drawing is called simple
random sampling, or sometimes random sampling without replacement (be-
cause we do not put a letter back in the receptacle after it has been drawn).
Obviously, simple random sampling gives every member an equal chance
of being in the sample. It may be shown that the method also gives every
combination of three different letters (e.g., aef or cde) an equal chance of
constituting the sample.
How good an estimate of the population total do we obtain by simple
random sampling? We are not quite ready to answer this question. Al-
though we know how the sample is to be drawn, we have not yet discussed
how the population total is to be estimated from the results of the sample.
Since the sample contains three members and the population contains six
members, the simplest procedure is to multiply the sample total by 2, and
this is the procedure that will be adopted. You should note that any sam-
pling plan contains two parts: a rule for drawing the sample and a rule for
making the estimates from the results of the sample.
We can now write down all possible samples of size 3, make the esti-
mate from each sample, and see how close these estimates lie to the true
value of 36. There are 20 different samples. Their results appear in table
17.2.1, where the successive columns show the composition of the sample,
the sample total, the estimated population total, and the error of estimate
(estimate minus true value).
Some samples, e.g., abf and cde, do very well, while others like abc
give poor estimates. Since we do not know in any individual instance
whether we will be lucky or unlucky in the choice of a sample, we appraise
any sampling plan by looking at its average performance.
TABLE 17.2.1
RESULTS FOR ALL POSSIBLE SIMPLE RANDOM SAMPLES OF SIZE THREE

         Sample  Estimate of  Error of          Sample  Estimate of  Error of
Sample    Total  Pop. Total   Estimate  Sample   Total  Pop. Total   Estimate

abc         7        14         -22     bcd        12       24         -12
abd         9        18         -18     bce        13       26         -10
abe        10        20         -16     bcf        22       44         + 8
abf        19        38         + 2     bde        15       30         - 6
acd        11        22         -14     bdf        24       48         +12
ace        12        24         -12     bef        25       50         +14
acf        21        42         + 6     cde        17       34         - 2
ade        14        28         - 8     cdf        26       52         +16
adf        23        46         +10     cef        27       54         +18
aef        24        48         +12     def        29       58         +22

Average    18        36           0

The average of the errors of estimate, taking account of their signs, is
called the bias of the estimate (or, more generally, of the sampling plan). A
positive bias implies that the sampling plan gives estimates that are on the
whole too high; a negative bias, too low. From table 17.2.1 it is evident
that this plan gives unbiased estimates, since the average of the 20 estimates
is exactly 36 and consequently the errors of estimate add to zero. With
simple random sampling this result holds for any population and any
size of sample. Estimates that are unbiased are a desirable feature of a
sampling plan. On the other hand, a plan that gives a small bias is not
ruled out of consideration if it has other attractive features.
As a measure of the accuracy of the sampling plan we use the mean
square error of the estimates taken about the true population value.
This is

M.S.E. = Σ(Error of estimate)²/20 = 3,504/20 = 175.2

The divisor 20 is used instead of the divisor 19, because the errors are mea-
sured from the true population value. To sum up, this plan gives an esti-
mate of the population total that is unbiased and has a standard error
√175.2 = 13.2. This standard error amounts to 37% of the true popula-
tion total; evidently the plan is not very accurate for this population.
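These figures are easy to verify by enumeration; a sketch in Python (names ours):

from itertools import combinations

values = [1, 2, 4, 6, 7, 16]             # the population; true total 36
estimates = [2 * sum(s) for s in combinations(values, 3)]
errors = [est - 36 for est in estimates]
print(len(errors))                        # 20 possible samples
print(sum(errors) / 20)                   # bias: 0.0
mse = sum(e * e for e in errors) / 20
print(mse, mse ** 0.5)                    # 175.2 and s.e. 13.2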
In simple random sampling the selection of the sample is left to the
luck of the draw. No use is made of any knowledge that we possess about
the members of the population. Given such knowledge, we should be
able to improve upon simple random sampling by using the knowledge to
guide us in the selection of the sample. Much of the research on sample
survey methods has been directed towards taking advantage of available
information about the population to be sampled.
By way of illustration suppose that before planning the sample we
expect that f will give a much higher value than any other member in the
population. How can we use this information? It is clear that the esti-
mate from the sample will depend to a considerable extent on whether f
falls in the sample or not. This statement can be verified from table
17.2.1: every sample containing f gives an overestimate and every sample
without f gives an underestimate.
The best plan is to make sure that f appears in every sample. We
can do this by dividing the population into two parts or strata. Stratum I,
which consists of f alone, is completely measured. In stratum II, contain-
ing a, b, c, d, and e, we take a simple random sample of size 2 in order to
keep the total sample size equal to 3.
Some forethought is needed in deciding how to estimate the popula-
tion total. To use twice the sample total, as was done previously, gives
too much weight to f and, as already pointed out, will always produce an
overestimate of the true total. We can handle this problem by treating
the two strata separately. For stratum I we know the total (16) correctly,
since we always measure f. For stratum II, where 2 members are mea-
sured out of 5, the natural procedure is to multiply the sample total in that
stratum by 5/2, or 2.5. Hence the appropriate estimate of the population
total is

16 + 2.5 (Sample total in stratum II)

These estimates are shown for the 10 possible samples in table 17.2.2.
Again we note that the estimate is unbiased. Its mean square error is

M.S.E. = Σ(Error of estimate)²/10 = 487.50/10 = 48.75

The standard error is 7.0 or 19% of the true total. This is a marked im-
provement over the standard error of 13.2 that was obtained with simple
random sampling.
This sampling plan goes by the name of stratified random sampling
with unequal sampling fractions. The last part of the title denotes the fact
that stratum I is completely sampled, whereas stratum II is sampled at a
rate of 2 units out of 5, or 40%. Stratification allows us to divide the
population into sub-populations or strata that are less variable than the
TABLE 17.2.2
RESULTS FOR ALL POSSIBLE STRATIFIED RANDOM SAMPLES WITH THE UNEQUAL
SAMPLING FRACTIONS DESCRIBED IN TEXT

             Total in            Estimate         Error of
Sample    Stratum II (T2)      16 + 2.5 T2        Estimate

abf             3                  23.5            -12.5
acf             5                  28.5            - 7.5
adf             7                  33.5            - 2.5
aef             8                  36.0              0.0
bcf             6                  31.0            - 5.0
bdf             8                  36.0              0.0
bef             9                  38.5            + 2.5
cdf            10                  41.0            + 5.0
cef            11                  43.5            + 7.5
def            13                  48.5            +12.5

Average                            36.0              0.0

original population, and to sample different parts of the population at dif-


ferent rates when this seems advisable. It is discussed more fully in sec-
tions 17.8 and 17.9.
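The same enumeration, applied to the stratified plan, confirms the figures above and is easily adapted to examples 17.2.1 and 17.2.2 below (a sketch; names ours):

from itertools import combinations

stratum2 = [1, 2, 4, 6, 7]                # a, b, c, d, e; f = 16 always taken
errors = [16 + 2.5 * sum(s) - 36 for s in combinations(stratum2, 2)]
print(sum(errors) / len(errors))           # bias: 0.0
mse = sum(e * e for e in errors) / len(errors)
print(mse, mse ** 0.5)                     # 48.75 and s.e. 7.0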
EXAMPLE 17.2.1-In the preceding example, suppose you expect that both e and f
will give high values. You decide that the sample shall consist of e, f, and one member drawn
at random from a, b, c, d. Show how to obtain an unbiased estimate of the population total
and show that the standard error of this estimate is 7.7. (This sampling plan is not as ac-
curate as the plan in which f alone was placed in a separate stratum, because the actual value
for e is not very high.)
EXAMPLE 17.2.2-If previous information suggests that f will be high, d and e
moderate, and a, b, and c small, we might try stratified sampling with three strata. The
sample consists of f, either d or e, and one chosen from a, b, and c. Work out the unbiased
estimate of the population total for each of the six possible samples and show that its stan-
dard error is 3.9.
17.3-Probability sampling. The preceding examples were intended
to introduce you to probability sampling. This is a general name given
to sampling plans in which
(i) every member of the population has a known probability of being
included in the sample,
(ii) the sample is drawn by some method of random selection con-
sistent with these probabilities,
(iii) we take account of these probabilities of selection in making the
estimates from the sample.
Note that the probability of selection need not be equal for all mem-
bers of the population; it is sufficient that these probabilities be known. In
the first example in the previous section, each member of the population
had an equal chance of being in the sample, and each member of the sample
received an equal weight in estimating the population total. But in the
second example, member f was given a probability 1 of appearing in the
sample, as against 2/5 for the rest of the population. This inequality in
the probabilities of selection was compensated for by assigning a weight
5/2 to these other members when making the estimate. The use of un-
equal probabilities produces a substantial gain in precision for some types
of populations (see section 17.9).
Probability sampling has several advantages. By probability theory
it is possible to study the biases and the standard errors of the estimates
from different sampling plans. In this way much has been learned about
the scope, advantages, and limitations of each plan. This information
helps greatly in selecting a suitable plan for a particular sampling job. As
will be seen later, most probability sampling plans also enable the stan-
dard error of the estimate, and confidence limits for the true population
value, to be computed from the results of the sample. Thus, when a
probability sample has been taken, we have some idea as to how accurate
the estimates are.
Probability sampling is by no means the only way of selecting a sam-
ple. An alternative method is to ask someone who has studied the popu-
lation to point out "average" or "typical" members, and then confine the
sample to these members. When the population is highly variable and
the sample is small, this method often gives more accurate estimates than
probability sampling. Another method is to restrict the sampling to those
members that are conveniently accessible. If bales of goods are stacked
tightly in a warehouse, it is difficult to get at the inside bales of the pile
and one is tempted to confine attention to the outside bales. In many
biological problems it is hard to see how a workable probability sample
can be devised, as in estimating, for instance, the number of house flies
in a town, or of field mice in a wood, or of plankton in the ocean.
One drawback of these alternative methods is that when the sample
has been obtained, there is no way of knowing how accurate the estimate is.
Members of the population picked out as typical by an expert may be
more or less atypical. Outside bales may or may not be similar to interior
bales. Probability sampling formulas for the standard error of the esti-
mate or for confidence limits do not apply to these methods. Conse-
quently, it is wise to use probability sampling unless there is a clear case
that this is not feasible or is prohibitively expensive.
17.4-Listing the population. In order to apply probability sampling,
we must have some way of subdividing the population into units, called
sampling units, which form the basis for the selection of the sample. The
sampling units must be distinct and non-overlapping, and they must to-
gether constitute the whole of the population. Further, in order to make
some kind of random selection of sampling units, we must be able to
number or list all the units. As will be seen, we need not always write
down the complete list but we must be in a position to construct it. Listing
is easily accomplished when the population consists of 5,000 cards neatly
arranged in a file, or 300 ears of corn lying on a bench, or the trees in a
small orchard. But the subdivision of a population into sampling units
that can be listed sometimes presents a difficult practical problem.
Although we have spoken of the population as being concrete and
definite, there may be some vagueness about the population which does
not become apparent until a sampling is being planned. Before we can
come to grips with a population of farms or of nursing homes, we must
define a farm or a nursing home. The definition may require much study
and the final decision may have to be partly arbitrary. Two principles to
keep in mind are that the definition should be appropriate to the purpose
of the sampling and that it should be usable in the field (i.e., the person
collecting the information should be able to tell what is in and what is out
of the population as defined).
Sometimes the available listings of farms, creameries, or nursing
homes are deficient. The list may be out of date, having some members
that no longer belong to our population and omitting some that do belong.
The list may be based on a definition different from that which we wish to
use for our population. These points should be carefully checked before
using any list. It often pays to spend considerable effort in revising a list to
make it complete and satisfactory, since this may be more economical than
constructing a new list. Where a list covers only part of the population,
one procedure is to sample this part by means of the list, and to construct
a separate method of sampling for the unlisted part of the population.
Stratified sampling is useful in this situation: all listed members are as-
signed to one stratum and unlisted members to another.
Preparing a list where none is available may require ingenuity and
hard work. To cite an easy example, suppose that we wish to take a num-
ber of crop samples, each 2 ft. x 2 ft., from a plot 200 ft. x 100 ft. Divide
the length of the plot into 100 sections, each 2 ft., and the breadth into
50 sections, each 2 ft. We thus set up a coordinate system that divides
the whole plot into 100 x 50 or 5,000 quadrats, each 2 ft. x 2 ft. To select
a quadrat by simple random sampling, we draw a random number be-
tween 1 and 100 and another random number between 1 and 50. These
coordinates locate the corner of the quadrat that is farthest from the origin
of our system. However, the problem becomes harder if the plot measures
163 ft. x 100 ft., and much harder if we have an irregularly shaped field.
Further, if we have to select a number of areas each 6 in. x 6 in. from a
large field, giving every area an equal chance of selection, the time spent
in selecting and locating the sample areas becomes substantial. Partly for
this reason, methods of systematic sampling (section 17.7) have come to
be favored in routine soil sampling (8).
Another illustration is a method for sampling (for botanical or chemi-
cal analysis) the produce of a small plot that is already cut and bulked.
The bulk is separated into two parts and a coin is tossed (or a random
number drawn) to decide which part shall contain the sample. This part
is then separated into two, and the process continues until a sample of
about the desired size is obtained. At any stage it is good practice to make
the two parts as alike as possible, provided this is done before the coin is
tossed. A quicker method, of course, is to grab a handful of about the
desired size; this is sometimes satisfactory but sometimes proves to be
biased.
In urban sampling in the United States, the city block is often used as
a sampling unit, a listing of the blocks being made from a map of the town.
For extensive rural sampling, county maps have been divided into areas
with boundaries that can be identified in the field and certain of these
areas are selected to constitute the sample. The name area sampling has
come to be associated with these and other methods in which the sampling
unit is an area of land. Frequently the principal advantage of area sam-
pling, although not the only one, is that it solves the problem of providing
a listing of the population by sampling units.
In many sampling problems there is more than one type or size of
sampling unit into which the population can be divided. For instance, in
soil sampling in which borings are taken, the size and shape of the borer
can be chosen by the sampler. The same is true of the frame used to mark
out the area of land that is cut in crop sampling. In a dental survey of the
fifth-grade school children in a city, we might regard the child as the
sampling unit and select a sample of children from the combined school
registers for the city. It would be administratively simpler, however, to
take the school as the sampling unit, drawing a sample of schools and
examining every fifth-grade child in the selected schools. This approach,
in which the sampling unit consists of some natural group (the school)
formed from the smaller units in which we are interested (the children),
goes by the name of cluster sampling.
If you are faced with a choice between different sampling units, the
guiding rule is to try to select the one that returns the greatest precision
for the available resources. For a fixed size of sample (e.g., 5% of the
population), a large sampling unit usually gives less accurate results than
a small unit, although there are exceptions. To counterbalance this, it is
generally cheaper and easier to take a 5% sample with a large sampling
unit than with a small one. A thorough comparison between two units is
likely to require a special investigation, in which both sampling errors and
costs (or times required) are computed for each unit.
17.5-Simple random sampling. In this and later sections, some of
the best-known methods for selecting a probability sample will be pre-
sented. The goal is to use a sampling plan that gives the highest precision
for the resources to be expended, or, equivalently, that attains a desired
degree of precision with the minimum expenditure of resources. It is
worthwhile to become familiar with the principal plans, since they are
designed to take advantage of any information that you have about the
structure of the population and about the costs of taking the sample.
In section 17.2 you have already been introduced to simple random
sampling. This is a method in which the members of the sample are drawn
independently with equal probabilities. In order to illustrate the use of a
table of random numbers for drawing a random sample, suppose that the
population contains N = 372 members and that a sample of size n = 10
is wanted. Select a three-digit starting number from table A 1, say the
number is 539 in row 11 of columns 80-82. Read down the column and
pick out the first ten three-digit numbers that do not exceed 372. These are
334, 365, 222, 345, 245, 272, 075, 038, 127, and 112. The sample consists of
the sampling units that carry these numbers in your listing of the popula-
tion. If any number appears more than once, ignore it on subsequent
appearances and proceed until ten different numbers have been found.
If the first digit in N is 1, 2, or 3, this method requires you to skip many
numbers in the table because they are too large. (In the above example
we had to cover 27 numbers in order to find ten for the sample.) This does
not matter if there are plenty of random numbers. An alternative is to use
all three-digit numbers up to 2 x 372 = 744. Starting at the same place,
the first ten numbers that do not exceed 744 are 539, 334, 615, 736, 365,
222, 345, 660, 431, and 427. Now subtract 372 from all numbers larger
than 372. This gives, for the sample, 167, 334, 243, 364, 365, 222, 345,
288, 59, and 55. With N = 189, for instance, we can use all numbers up to
5 x 189 = 945 by this device, subtracting 189 or 378 or 567 or 756 as the
case may be.
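A sketch of this selection rule in code, letting Python's generator stand in for a column of table A 1 (the function name is ours):

import random

def simple_random_sample(N, n, multiplier=1):
    """Draw n distinct unit numbers from 1..N, accepting three-digit
    random numbers up to multiplier*N and reducing the larger ones
    modulo N, as described above (a sketch)."""
    chosen = []
    while len(chosen) < n:
        x = random.randint(1, 999)       # a three-digit random number
        if x > multiplier * N:
            continue                     # skip numbers that are too large
        unit = x % N or N                # e.g., subtract 372 when x > 372
        if unit not in chosen:           # ignore repeated appearances
            chosen.append(unit)
    return chosen

print(sorted(simple_random_sample(372, 10, multiplier=2)))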
As mentioned previously, simple random sampling leaves the selec-
tion of the sample entirely to chance. It is often a satisfactory method
when the population is not highly variable and, in particular, when esti-
mating proportions that are likely to lie between 20% and 80%. On the
other hand, if you have any knowledge of the variability in the population,
such as that certain segments of it are likely to give higher responses than
others, one of the methods to be described later may be more precise.
If Y_i (i = 1, 2, ..., N) denotes the variable that is being studied, the
standard deviation, σ, of the population is defined as

σ = √[Σ(Y_i - Ȳ)²/(N - 1)],

where Ȳ is the population mean of the Y_i and the sum Σ is taken over all
sampling units in the population.
Since Ȳ denotes the population mean, we shall use ȳ to denote the
sample mean. In a simple random sample of size n, the standard error
of ȳ is

σ_ȳ = (σ/√n)√(1 - φ),

where φ = n/N is the sampling fraction, i.e., the fraction of the population
that is included in the sample. (The sampling fraction is commonly de-
noted by the symbol f, but φ is used here to avoid confusion with our pre-
vious use of f for degrees of freedom.)
The term σ/√n is already familiar to you; this is the usual formula for
the standard error of a sample mean. The second factor, √(1 - φ), is
known as the finite population correction. It enters because we are sam-
pling from a population of finite size, N, instead of from an infinite popula-
tion as is assumed in the usual theory. Note that this term makes the stan-
dard error zero when n = N, as it should do, since we have then measured
every unit in the population. In practical applications the finite popula-
tion correction is close to 1 and can be omitted when n/N is less than 10%,
i.e., when the sample includes less than 10% of the population.
This result is remarkable. In a large population with a fixed amount
of variability (a given value of σ), the standard error of the mean depends
mainly on the size of sample and only to a minor extent on the fraction of
the population that is sampled. For given σ, the mean of a sample of 100 is
almost as precise when the population size is 200,000 as when the popula-
tion size is 20,000 or 2,000. Intuitively, some people feel that one cannot
possibly get accurate results from a sample of 100 out of a population of
200,000, because only a tiny fraction of the population has been measured.
Actually, whether the sampling plan is accurate or not depends primarily
on the size of σ/√n. This shows why sampling can bring about a great
reduction in the amount of measurement needed.
For the estimated standard error of the sample mean we have

s_ȳ = (s/√n)√(1 - φ),

where s is the standard deviation of the sample, calculated in the usual way.
If the sample is used to estimate the population total of the variable
under study, the estimate is Nȳ and its estimated standard error is

s_Nȳ = (Ns/√n)√(1 - φ)
In simple random sampling for attributes, where every member of the
sample is classified into one of two classes, we take

s_p = √(pq/n) √(1 - φ)

where p is the proportion of the sample that lies in one of the classes. Sup-
pose that 50 families are picked at random from a list of 432 families who
possess telephones and that 10 of the families report that they are listening
to a certain radio program. Then p = 0.2, q = 0.8 and

s_p = √[(0.2)(0.8)/50] √(1 - 50/432) = 0.053
If we ignore the finite population correction, we find s_p = 0.057.
The formula for s_p holds only if each sampling unit is classified as a
whole into one of the two classes. If you are using cluster sampling and are
classifying individual elements within each cluster, a different formula for
s_p must be used. For instance, in estimating the percentage of diseased
plants in a field from a sample of 360 plants, the formula above holds if
the plants were selected independently and at random. To save time in
the field, however, we might have chosen 40 areas, each consisting of 3
plants in each of 3 neighboring rows. With this method the area (a clus-
ter of 9 plants) is the sampling unit. If the distribution of disease in the
field were extremely patchy, it might happen that every area had either
all 9 plants diseased or no plants diseased. In this event the sample of 40
areas would be no more precise than a sample of 40 independently chosen
plants, and we would be deceiving ourselves badly if we thought that we
had a binomial sample of 360 plants.
The correct procedure for computing s_p is simple. Calculate p sepa-
rately for each area (or sampling unit) and apply to these p's the previous
formula for continuous variates. That is, if p_i is the proportion diseased
in the ith area, the sample standard deviation is

s = √[Σ(p_i - p)²/(n - 1)],

where n is now the number of areas (cluster units). Then

s_p = (s/√n)√(1 - φ)

For instance, suppose that the numbers of diseased plants in the 40 areas
were as given in table 17.5.1.

TABLE 17.5.1
NUMBERS OF DISEASED PLANTS (OUT OF 9) IN EACH OF 40 AREAS

2 5 1 1 1 7 0 0 3 2 3 0 0 0 7 0 4 1 2 6
0 0 1 4 5 0 1 4 2 6 0 2 4 1 7 3 5 0 3 6

Grand total = 99

The standard deviation of the numbers of diseased plants in this sample is
2.331. Since the proportions of diseased plants in the 40 areas are found by
dividing the numbers in table 17.5.1 by 9, the standard deviation of the
proportions is

s = 2.331/9 = 0.259

Hence (assuming N large),

s_p = s/√n = 0.259/√40 = 0.041
For comparison, the result given by the binomial formula will be
worked out. From the total in table 17.5.1, p = 99/360 = 0.275. The
binomial formula is

s_p = √(pq/360) = √[(0.275)(0.725)/360] = 0.024,

giving an overly optimistic notion of the precision of p.
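The contrast between the cluster and binomial formulas can be checked directly from table 17.5.1 (a sketch; names ours):

import math

# Diseased plants (out of 9) in each of the 40 areas of table 17.5.1
counts = [2, 5, 1, 1, 1, 7, 0, 0, 3, 2, 3, 0, 0, 0, 7, 0, 4, 1, 2, 6,
          0, 0, 1, 4, 5, 0, 1, 4, 2, 6, 0, 2, 4, 1, 7, 3, 5, 0, 3, 6]
n = len(counts)                                   # 40 cluster units
props = [c / 9 for c in counts]
p = sum(counts) / (9 * n)                         # overall proportion, 0.275
s = math.sqrt(sum((pi - p) ** 2 for pi in props) / (n - 1))
print(round(s, 3))                                # 0.259
print(round(s / math.sqrt(n), 3))                 # cluster s.e., 0.041

# Binomial formula, wrongly treating the 360 plants as independent:
print(round(math.sqrt(p * (1 - p) / 360), 3))     # 0.024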


Frequently, the clusters are not all of the same size. This happens
when the sampling units are areas of land that contain different numbers
of the plants that are being classified. Let mi be the number of elements
that are classified in the ith unit, and ai the number that fall into a speci-
fied class, so that Pi = admi. Then P, the overall proportion in the sam-
ple is (l:ai)/(l:mi)' where each sum is taken over the n cluster ·units.
The formula for s, the standard deviation of the individual propor-
tions Pi uses a weighted mean square of the deviations (Pi - p), as follows:

where m= l:mdn is the average size of cluster in the sample. This formula
is an approximation. no correct expression for s being known in usable
form. As before. we have
s
s = - 1(1 - <1»
P ,jn Y

For computing purposes, s is better expressed as

s = (1/m̄)√[(Σa_i² - 2pΣa_i m_i + p²Σm_i²)/(n - 1)]

The sums of squares Σa_i², Σm_i² and the sum of products Σa_i m_i are cal-
culated without the usual corrections for the mean. The same value of s
is obtained whether the corrections for the mean are applied or not, but
it saves time not to apply them.
EXAMPLE 17.5.1-If a sample of 4 from the 16 townships of a county has a standard
deviation 45, show that the standard error of the mean is 19.5.
EXAMPLE 17.5.2-In the example presented in section 17.2 we had N = 6, n = 3,
and the values for the 6 members of the population were 1, 2, 4, 6, 7, and 16. The formula
for the true standard error of the estimated population total is

(Nσ/√n)√(1 - φ)

Verify that this formula agrees with the result, 13.2, which we found by writing down all
possible samples.
EXAMPLE 17.5.3-A simple random sample of size 100 is taken in order to estimate
some proportion (e.g., the proportion of males) whose value in the population is close to 1/2.
Work out the standard error of the sample proportion p when the size of the population is
(i) 200, (ii) 500, (iii) 1,000, (iv) 10,000, (v) 100,000. Note how little the standard error changes
for N greater than 1,000.
EXAMPLE 17.5.4-Show that the coefficient of variation of the sample mean is the
same as that of the estimated population total.
EXAMPLE 17.5.5-In simple random sampling for attributes, show that the standard
error of p, for given N and n, is greatest when p is 50%, but that the coefficient of variation of
p is largest when p is very small.

17.6-Size of sample. At an early stage in the design of a sample, the
question "How large a sample do I need?" must be considered. Although
a precise answer may not be easy to find, for reasons that will appear,
there is a rational method of attack on the problem.
Clearly, we want to avoid making the sample so small that the esti-
mate is too inaccurate to be useful. Equally, we want to avoid taking a
sample that is too large, in that the estimate is more accurate than we re-
quire. Consequently, the first step is to decide how large an error we
can tolerate in the estimate. This demands careful thinking about the
use to be made of the estimate and about the consequences of a sizeable
error. The figure finally reached may be to some extent arbitrary, yet
after some thought samplers often find themselves less hesitant about
naming a figure than they expected to be.
The next step is to express the allowable error in terms of confidence
limits. Suppose that L is the allowable error in the sample mean, and
that we are willing to take a 5% chance that the error will exceed L. In
other words, we want to be reasonably certain that the error will not ex-
ceed L. Remembering that the 95% confidence limits computed from a
sample mean, assumed approximately normally distributed, are

ȳ ± 2σ/√n,

where we have ignored the finite population correction, we put

L = 2σ/√n

This gives, for the required sample size,

n = 4σ²/L²
In order to use this relation, we must have an estimate of the popula-
tion standard deviation, 0'. Often a good guess can be made from the
results of previous samplings of this population or of other similar popula-
tions. For example, an experimental sample was taken in 1938 to estimate
the yield per acre of wheat in certain districts of North Dakota (7). For a
sample of 222 fields, the variance of the yield per acre from field to field
was s² = 90.3 (in bushels²). How many fields are indicated if we wish to
estimate the true mean yield within ±1 bushel, with a 5% risk that the
error will exceed 1 bushel? Then

n = 4σ²/L² = 4(90.3)/(1)² = 361 fields

If this estimate were being used to plan a sample in some later year, it
would be regarded as tentative, since the variance between fields might
change from year to year.
In default of previous estimates, Deming (3) has pointed out that σ
can be estimated from a knowledge of the highest and lowest values in the
population and a rough idea of the shape of the distribution. If h = (high-
est - lowest), then σ = 0.29h for a uniform (rectangular) distribution,
σ = 0.24h for a symmetrical distribution shaped like an isosceles triangle,
and σ = 0.21h for a skew distribution shaped like a right triangle.
If the quantity to be estimated is a binomial proportion, the allowable
error, L, for 95% confidence probability is

L = 2√(pq/n)

The sample size required to attain a given limit of error, L, is therefore

n = 4pq/L²     (17.6.1)
In this formula, p, q, and L may be expressed either as proportions or as
percentages, provided they are all expressed in the same units. The
result necessitates an advance estimate of p. If p is likely to lie between
35% and 65%, the advance estimate can be quite rough, since the product
pq varies little for p lying between these limits. If, however, p is near zero
or 100%, accurate determination of n requires a close guess about the
value of p.
We have ignored the finite population correction in the formulas pre-
sented in this section. This is satisfactory for the majority of applications.
If the computed value of n is found to be more than 10% of the population
size, N, a revised value n' which takes proper account of the correction
is obtained from the relation

n' = n/(1 + φ)
For example, casual inspection of a batch of 480 seedlings indicates that
about 15% are diseased. Suppose we wish to know the size of sample
needed to determine p, the per cent diseased, to within ±5%, apart from
a 1-in-20 chance. Formula 17.6.1 gives
n = 4(15)(85)/25 = 204 seedlings
At this point we might decide that it would be as quick to classify every
seedling as to plan a sample that is a substantial part of the whole batch.
If we decide on sampling, we make a revised estimate, n', as

n' = n/(1 + φ) = 204/(1 + 204/480) = 143
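The same computation, with the optional correction, can be sketched in
Python as follows (the function name and argument layout are ours):

```python
def binomial_sample_size(p, L, N=None):
    # Formula 17.6.1: n = 4*p*q / L^2, with p, q, L all in percent.
    # If the population size N is supplied, apply n' = n / (1 + n/N).
    q = 100.0 - p
    n = 4.0 * p * q / L**2
    if N is not None:
        n = n / (1.0 + n / N)
    return n

print(binomial_sample_size(15, 5))         # 204.0 seedlings, uncorrected
print(binomial_sample_size(15, 5, N=480))  # 143.2, about 143 after correction
```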
The formulas presented in this section are appropriate for simple ran-
dom sampling. If some other sampling method is to be used, the general
principles for the determination of n remain the same, but the formula for
the confidence limits, and hence the formula connecting L with n, will
change. Formulas applicable to more complex methods of sampling can
be obtained in books devoted to the subject, e.g., (2, 4). In practice, the
formulas in this section are frequently used to provide a preliminary
notion of the value of n, even if simple random sampling is not intended
to be used. The values of n are revised later if the proposed method of
sampling is markedly different in precision from simple random sampling.
When more than one variable is to be studied, the value of n is first
estimated separately for each of the most important variables. If these
values do not differ by much, it may be feasible to use the largest of the
n's. If the n's differ greatly, one method is to use the largest n, but to
measure certain items on only a sub-sample of the original sample, e.g., on
200 sampling units out of 1,000. In other situations, great disparity in
the n's is an indication that the investigation must be split into two or more
separate surveys.

EXAMPLE 17.6.1--A simple random sample of houses is to be taken to estimate the
percentage of houses that are unoccupied. The estimate is desired to be correct to within
±1%, with 95% confidence. One advance estimate is that the percentage of unoccupied
houses will be about 6%, another is that it will be about 4%. What sizes of sample are re-
quired on these two forecasts? What size would you recommend?
EXAMPLE 17.6.2--The total number of rats in the residential part of a large city is to
be estimated with an error of not more than 20%, apart from a 1-in-20 chance. In a previous
survey, the mean number of rats per city block was nine and the sample standard deviation
was 19 (the distribution is extremely skew). Show that a simple random sample of around
450 blocks should suffice.
EXAMPLE 17.6.3--West (12) quotes the following data for 556 full-time farms in
Seneca County, New York:

                           Mean Per Farm    Standard Deviation Per Farm

Acres in corn                   8.8                    9.0
Acres in small grains          41.0                   39.5
Acres in hay                   27.9                   16.9
If a coefficient of variation of up to 5% can be tolerated, show that a random sample
of about 240 farms is required to estimate the total acreage of each crop in the 556 farms with
this degree of precision. (Note that the finite population correction must be used.) This
example illustrates a result that has been reached by several different investigators: with small
farm populations such as counties, a substantial part of the whole population must be
sampled in order to obtain accurate estimates.

17.7-Systematic sampling. In order to draw a 10% sample from a
list of 730 cards, we might select a random number between 1 and 10, say
3, and pick every 10th card thereafter; i.e., the cards numbered 3, 13, 23,
and so on, ending with the card numbered 723. A sample of this kind
is known as a systematic sample, since the choice of its first member, 3,
determines the whole sample.
Systematic sampling has two advantages over simple random sam-
pling. It is easier to draw, since only one random number is required, and
it distributes the sample more evenly over the listed population. For this
reason systematic sampling often gives more accurate results than simple
random sampling. Sometimes the increase in accuracy is large. In
routine sampling, systematic selection has become a popular technique.
There are two potential disadvantages. If the population contains
a periodic type of variation, and if the interval between successive units
in the systematic sample happens to coincide with the wave length (or
a multiple of it), we may obtain a sample that is badly biased. To cite
extreme instances, a systematic sample of the houses in a city might con-
tain far too many, or too few, corner houses; a systematic sample from a
book of names might contain too many, or too few, names listed first on a
page, who might be predominantly males, or heads of households, or
persons of importance. A systematic sample of the plants in a field might
have the selected plants at the same positions along every row. These
situations can be avoided by being on the lookout for them and either
using some other method of sampling or selecting a new random number
frequently. In field sampling, we could select a new random number in
each row. Consequently, it is well to know something about the nature
of the variability in the population before deciding to use systematic
sampling.
The second disadvantage is that from the results of a systematic sam-
ple there is no reliable method of estimating the standard error of the sam-
ple mean. Textbooks on sampling give various formulas for s_ȳ that may be
tried; each formula is valid for a certain type of population, but a formula
can be used with confidence only if we have evidence that the population
is of the type to which the formula applies. However, systematic sampling
often is a part of a more complex sampling plan in which it is possible to
obtain unbiased estimates of the sampling errors.

EXAMPLE 17.7.1--The purpose of this example is to compare simple random sam-
pling and systematic sampling of a small population. The following data are the weights of
maize (in 10-gm. units) for 40 successive hills lying in a single row: 104, 38, 105, 86, 63, 31,
47, 0, 80, 42, 37, 48, 85, 66, 110, 0, 73, 65, 101, 47, 0, 36, 16, 13, 22, 32, 31, 0, 85, 82, 31, 45, 30,
76, 45, 70, 70, 63, 83, 14. To save you time, the population standard deviation is given as 30.1.
Compute the standard deviation of the mean of a simple random sample of 4 hills. A sys-
tematic sample of 4 hills can be taken by choosing a random number between 1 and 10 and
taking every 10th hill thereafter. Find the mean ȳ_sy for each of the 10 possible systematic
samples and compute the standard deviation of these means about the true mean Ȳ of the
population. Note that the formula for the standard deviation is

σ(ȳ_sy) = √[Σ(ȳ_sy - Ȳ)²/10]

Verify that the standard deviation of the estimate is about 5% lower with systematic sam-
pling. To what do you think this difference is due?

17.8-Stratified sampling. There are three steps in stratified sam-
pling:
(1) The population is divided into a number of parts, called strata.
(2) A sample is drawn independently in each part.
(3) As an estimate of the population mean, we use

ȳ_st = ΣN_h ȳ_h / N,

where N_h is the total number of sampling units in the hth stratum, ȳ_h is
the sample mean in the hth stratum, and N = ΣN_h is the size of the popu-
lation. Note that we must know the values of the N_h (i.e., the sizes of the
strata) in order to compute this estimate.
Stratification is commonly employed in sampling plans for several
reasons. It can be shown that differences between the strata means in
the population do not contribute to the sampling error of the estimate
ȳ_st. In other words, the sampling error of ȳ_st arises solely from variations
among sampling units that are in the same stratum. If we can form strata
so that a heterogeneous population is divided into parts each of which is
fairly homogeneous, we may expect a gain in precision over simple random
sampling. In taking 24 soil or crop samples from a rectangular field, we
might divide the field into 12 compact plots and draw 2 samples at random
from each plot. Since a small piece of land is usually more homogeneous
than a large piece, this stratification will probably bring about an increase
in precision, although experience indicates that in this application the
increase will be modest rather than spectacular. To estimate total wheat
acreage from a sample of farms, we might stratify by size of farm, using
any information available for this purpose. In this type of application the
gain in precision is frequently large.
In stratified sampling, we can choose the size of sample that is to be
taken from any stratum. This freedom of choice gives us scope to do an
efficient job of allocating resources to the sampling within strata. In some
applications, this is the principal reason for the gain in precision from
stratification. Further, when different parts of the population present
different problems of listing and sampling, stratification enables these
problems to be handled separately. For this reason, hotels and large
apartment houses are frequently placed in a separate stratum in a sample
of the inhabitants of a city.
We now consider the estimate from stratified sampling and its stan-
dard error. For the population mean, the estimate given previously may
be written

ȳ_st = (1/N) ΣN_h ȳ_h = ΣW_h ȳ_h,

where W_h = N_h/N is the relative weight attached to the stratum. Note
that the sample means, ȳ_h, in the respective strata are weighted by the
sizes, N_h, of the strata. The arithmetic mean of the sample observations
is no longer the estimate except in one important special case. This occurs
with proportional allocation, when we sample the same fraction from every
stratum. With proportional allocation,

n_h/N_h = n/N

It follows that

W_h = N_h/N = n_h/n

Hence,

ȳ_st = ΣW_h ȳ_h = Σn_h ȳ_h / n = ȳ,

since Σn_h ȳ_h is the total of all observations in the sample. With propor-
tional allocation, we are saved the trouble of computing a weighted mean:
the sample is self-weighting.
In order to avoid two levels of subscripts, we use the symbol s(ȳ_st) to
denote the estimated standard error of ȳ_st. Its value is

s(ȳ_st) = √(ΣW_h² s_h²/n_h),

where s_h² is the sample variance in the hth stratum, i.e.,

s_h² = Σ(Y_hi - ȳ_h)²/(n_h - 1),

where Y_hi is the ith member of the sample from the hth stratum. This
formula for the standard error of ȳ_st assumes that simple random sampling
is used within each stratum and does not include the finite population
correction. If the sampling fractions φ_h exceed 10% in some of the strata,
we use the more general formula

s(ȳ_st) = √[ΣW_h² s_h²(1 - φ_h)/n_h]     (17.8.1)

With proportional allocation the sampling fractions φ_h are all equal and
the general formula simplifies to

s(ȳ_st) = √(ΣW_h s_h²/n) √(1 - φ)

If, further, the population variances are the same in all strata (a reason-
able assumption in some applications), we obtain an additional simplifica-
tion to

s(ȳ_st) = (s_w/√n) √(1 - φ)

This result is the same as that for the standard error of the mean with
simple random sampling, except that s_w, the pooled standard deviation
within strata, appears in place of the sample standard deviation, s. In
practice, s_w is computed from an analysis of variance of the data.
As an example of proportional allocation, the data in table 17.8.1
come from an early investigation by Clapham (1) of the feasibility of
sampling for estimating the yields of small cereal plots. A rectangular plot
of wheat was divided transversely into three equal strata. Ten samples,
each a meter length of a single row, were chosen by simple random sam-
pling from each stratum. The problem is to compute the standard error
of the estimated mean yield per meter of row.
TABLE 17.8.1
ANALYSIS OF VARIANCE OF A STRATIFIED RANDOM SAMPLE
(Wheat grain yields - gm. per meter)

Source of Variation    Degrees of Freedom    Sum of Squares    Mean Square

Total                          29                 8,564            295.3
Between strata                  2                 2,073          1,036.5
Within strata                  27                 6,491            240.4

In this example, s_w = √240.4 = 15.5, and n = 30. Since the sample
is only a negligible part of the whole plot, n/N is negligible, and

s(ȳ_st) = s_w/√n = 15.5/√30 = 2.83 gm.
How effective was the stratification? From the analysis of variance
it is seen that the mean square between strata is over four times as large
as that within strata. This is an indication of real differences in level of
yield from stratum to stratum. It is possible to go further, and estimate
what the standard error of the mean would have been if simple random
sampling had been used without any stratification. With simple random
sampling, the corresponding formula for the standard error of the mean is

s_ȳ = s/√n,

where s is the ordinary sample standard deviation. In the sample under
discussion, s is √295.3 (from the total mean square in table 17.8.1). Hence,
as an estimate of the standard error of the mean under simple random
sampling, we might take

s_ȳ = √295.3/√30 = 3.14 gm.,

as compared with 2.83 gm. for stratified random sampling. Stratification
has reduced the standard error by about 10%.
This comparison is not quite correct, for the rather subtle reason that
the value of s was calculated from the results of a stratified sample and not,
as it should have been, from the results of a simple random sample. Valid
methods of making the comparison are described for all types of stratified
sampling in (2). The approximate method which we used is close enough
when the stratification is proportional and at least ten sampling units are
drawn from every stratum.
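The whole calculation, including the rough comparison with simple
random sampling, reduces to a few lines. A minimal Python sketch using
the mean squares of table 17.8.1:

```python
from math import sqrt

n = 30                               # 10 samples from each of 3 strata
ms_within, ms_total = 240.4, 295.3   # from table 17.8.1

se_stratified = sqrt(ms_within / n)  # s_w / sqrt(n) = 2.83 gm.
se_simple = sqrt(ms_total / n)       # approximate SRS comparison = 3.14 gm.
print(se_stratified, se_simple)
```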
EXAMPLE 17.8.1--In the example of stratified sampling given in section 17.2, recall
that the estimate which we used for the population total was ΣN_h ȳ_h. From formula 17.8.1 for
the standard error of ȳ_st, verify that the variance of the estimated population total is 48.75, as
found directly in section 17.2. (Note that stratum 1 makes no contribution to this variance
because n_h = N_h in that stratum.)
17.9-Choice of sample sizes in the individual strata. It is some-
times thought that in stratified sampling we should sample the same frac-
tion from every stratum; i.e., we should make n_h/N_h the same in all strata,
using proportional allocation. A more thorough analysis of the problem
shows, however, that the optimum allocation is to take n_h proportional to
N_h σ_h/√c_h, where σ_h is the standard deviation of the sampling units in the
hth stratum, and c_h is the cost of sampling per unit in the hth stratum. This
method of allocation gives the smallest standard error of the estimated
mean ȳ_st for a given total cost of taking the sample. The rule tells us to
take a larger sample, as compared with proportional allocation, in a
stratum that is unusually variable (σ_h large), and a smaller sample in a
stratum where sampling is unusually expensive (c_h large). Looked at in
this way, the rule is consistent with common sense, as statistical rules
always are if we think about them carefully. The rule reduces to pro-
portional allocation when the standard deviation and the cost per unit
are the same in all strata.
In order to apply the rule, advance estimates are needed both of the
relative standard deviations and of the relative costs in different strata.
These estimates need not be highly accurate: rough estimates often give
results satisfactorily near to the optimum allocation. When a population
is sampled repeatedly, the estimates can be obtained from the results of
previous samplings. Even when a population is sampled for the first
time, it is sometimes obvious that some strata are more accessible to
sampling than others. In this event it pays to hazard a guess about the
differences in costs. In other situations, we are unable to predict with any
confidence which strata will be more variable or more costly, or we think
that any such differences will be small. Proportional allocation is then
used.
There is one common situation in which disproportionate sampling
pays large dividends. This occurs when the principal variable that is
being measured has a highly skewed or asymmetrical distribution. Usual-
ly, such populations contain a few sampling units that have large values
for this variable and many units that have small values. Variables that
are related to the sizes of economic institutions are often of this type, for
instance, the total sales of grocery stores, the number of patients per hos-
pital, the amounts of butter produced by creameries, family incomes, and
prices of houses.
With populations of this type, stratification by size of institution is
highly effective, and the optimum allocation is likely to be much better
than proportional allocation. As an illustration, table 17.9.1 shows
data for the number of students per institution in a population consisting
of the 1,019 senior colleges and universities in the United States. The
data, which apply mostly to the 1952-1953 academic year, might be used

TABLE 17.9.1
DATA FOR TOTAL REGISTRATIONS PER SENIOR COLLEGE OR UNIVERSITY,
ARRANGED IN FOUR STRATA

Stratum:              Number of       Total Registration    Mean Per       Standard Deviation
Number of Students    Institutions    for the Stratum       Institution    Per Institution
Per Institution       N_h                                   Ȳ_h            σ_h

Less than 1,000           661              292,671                443               236
1,000-3,000               205              345,302              1,684               625
3,000-10,000              122              672,728              5,514             2,008
Over 10,000                31              573,693             18,506            10,023

Total                   1,019            1,884,394


as background information for planning a sample designed to give a quick
estimate of total registration in some future year. The institutions are
arranged in four strata according to size.
Note that the 31 largest universities, about 3% in number, have 30%
of the students, while the smallest group, which contains 65% of the in-
stitutions, contributes only 15% of the students. Note also that the
within-stratum standard deviation, σ_h, increases rapidly with increasing
size of institution.
Table 17.9.2 shows the calculations needed for choosing the optimum
sample sizes within strata. We are assuming equal costs per unit within
all strata. The products, N_h σ_h, are formed and added over all strata.
Then the relative sample sizes, N_h σ_h/ΣN_h σ_h, are computed. These ratios,
when multiplied by the intended sample size n, give the sample sizes in
the individual strata.
TABLE 17.9.2
CALCULATIONS FOR OBTAINING THE OPTIMUM SAMPLE SIZES IN INDIVIDUAL STRATA

Stratum:              Number of                       Relative Sample Sizes    Actual Sample    Sampling
Number of Students    Institutions N_h    N_h σ_h     N_h σ_h/ΣN_h σ_h         Sizes            Rate (%)

Less than 1,000           661             155,996            .1857                  65              10
1,000-3,000               205             128,125            .1526                  53              26
3,000-10,000              122             244,976            .2917                 101              83
Over 10,000                31             310,713            .3700                  31             100

Total                   1,019             839,810           1.0000                 250

As a consequence of the large standard deviation in the stratum with
the largest universities, the rule requires 37% of the sample to be taken
from this stratum. Suppose we are aiming at a total sample size of 250.
The rule then calls for (0.37)(250) or 92 universities from this stratum,
although the stratum contains only 31 universities in all. With highly
skewed populations, as here, the optimum allocation may demand 100%
sampling, or even more than this, of the largest institutions. When this
situation occurs, the best procedure is to take 100% of the "large" stratum,
and employ the rule to distribute the remainder of the sample over the
other strata. Following this procedure, we include in the sample all 31
largest institutions, leaving 219 to be distributed among the first three
strata. In the first stratum, the size of sample is

219 {0.1857/(0.1857 + 0.1526 + 0.2917)} = 65
The allocations, shown in the second column from the right of table
17.9.2, call for over 80% sampling in the second largest group of institu-
tions (101 out of 122), but only a 10% sample of the small colleges. In
practice we might decide, for administrative convenience, to take a 100%
sample in the second largest group as well as in the largest.
It is worthwhile to ask: Is the optimum allocation much superior to
proportional allocation? If not, there is little value in going to the extra
trouble of calculating and using the optimum allocation. We cannot, of
course, answer this question for a future sample that is not yet taken, but
we can compare the two methods of allocation for the 1952-1953 registra-
tions. To do this, we use the data in tables 17.9.1 and 17.9.2 and the
standard error formulas in section 17.8 to compute the standard errors of
the estimated population totals by the two methods. These standard
errors are found to be 26,000 for the optimum allocation, as against
107,000 for proportional allocation. If simple random sampling had been
used, with no stratification, a similar calculation shows that the corre-
sponding standard error would have been 216,000. The reduction in the
standard error due to stratification, and the additional reduction due to
the optimum allocation, are both striking. In an actual future sampling
based on this stratification, the gains in precision would presumably be
slightly less than these figures indicate.
EXAMPLE 17.9.1--For the population of colleges and universities discussed in this
section it was stated that a stratified sample of 250 institutions, with proportional alloca-
tion, would have a standard error of 107,000 for the estimated total registration in all 1,019
institutions. Verify this statement from the data in table 17.9.1. Note that the standard
error of the estimated population total, with proportional allocation, is

N √(ΣW_h σ_h²/n) √(1 - n/N)

17.10-Stratified sampling for attributes. If an attribute is being
sampled, the estimate appropriate to stratified sampling is

p_st = ΣW_h p_h,

where p_h is the sample proportion in stratum h and W_h = N_h/N is the strat-
um weight. To find the standard error of p_st we substitute p_h q_h for s_h² in
the formulas previously given in section 17.8.
As an example, consider a sample of 692 families in Iowa to deter-
mine, among other things, how many had vegetable gardens in 1943.
The families were arranged in three strata (urban, rural non-farm, and
farm) because it was anticipated that the three groups might show differ-
ences in the frequency and size of vegetable gardens. The data are given
in table 17.10.1.
The numbers of families were taken from the 1940 census. The
sample was allotted roughly in proportion to the number of families per
stratum, a sample of 1 per 1,000 being aimed at.
The weighted mean percentage of Iowa families having gardens was
estimated as

ΣW_h p_h = (0.445)(72.7) + (0.230)(94.8) + (0.325)(96.6) = 85.6%
TABLE 17.10.1
NUMBERS OF VEGETABLE GARDENS AMONG IOWA FAMILIES, ARRANGED IN THREE STRATA

                    Number of                      Number in     Number With    Percentage With
Stratum             Families N_h     Weight W_h    Sample n_h    Gardens        Gardens p_h

Urban                 312,393          0.445          300            218             72.7
Rural non-farm        161,077          0.230          155            147             94.8
Farm                  228,354          0.325          237            229             96.6

Total                 701,824          1.000          692            594

This is practically the same as the sample mean percentage, 594/692
or 85.8%, because allocation was so close to proportional.
For the estimated variance of the estimated mean, we have

ΣW_h² p_h q_h/n_h = (0.445)²(72.7)(27.3)/300 + etc. = 1.62

The standard error, then, is 1.27%.
With a sample of this size, the estimated mean will be approximately
normally distributed: the confidence limits may be set as

85.6 ± (2)(1.27): 83.1% and 88.1%
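The estimate and its standard error can be verified with a short Python
sketch built directly on the formulas above (names are ours; data from
table 17.10.1):

```python
from math import sqrt

W = [0.445, 0.230, 0.325]   # stratum weights N_h / N
p = [72.7, 94.8, 96.6]      # sample percentages with gardens
n = [300, 155, 237]         # stratum sample sizes

p_st = sum(Wh * ph for Wh, ph in zip(W, p))
var = sum(Wh**2 * ph * (100.0 - ph) / nh for Wh, ph, nh in zip(W, p, n))
print(p_st, sqrt(var))      # 85.6 and 1.27 (percent)
```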
For the optimum choice of the sample sizes within strata, we should
take n_h proportional to N_h √(p_h q_h/c_h). If the cost of sampling is about the
same in all strata, as is true in many surveys, this implies that the fraction
sampled, n_h/N_h, should be proportional to √(p_h q_h). Now the quantity
√(pq) changes little as p ranges from 25% to 75%. Consequently, propor-
tional allocation is often highly efficient in stratified sampling for attri-
butes. The optimum allocation produces a substantial reduction in the
standard error, as compared with proportional allocation, only when some
of the p_h are close to zero or 100%, or when there are differential costs.
The example on vegetable gardens departs from the strict principles
of stratified sampling in that the strata sizes and weights were not known
exactly, being obtained from census data three years previously. Errors
in the strata weights reduce the gain in precision from stratification and
make the standard formulas inapplicable. It is believed that in this
example these disturbances are of negligible importance. Discussions of
stratification when errors in the weights are present are given in (2) and
(10).

EXAMPLE 17.10.1--In stratified sampling for attributes, the optimum sample distribu-
tion, with equal costs per unit in all strata, follows from taking n_h proportional to N_h √(p_h q_h).
It follows that the actual value of n_h is

n_h = n N_h √(p_h q_h) / ΣN_h √(p_h q_h)

In the Iowa vegetable garden survey, suppose that the p_h values found in the sample can be
assumed to be the same as those in the population. Show that the optimum sample distribu-
tion gives sample sizes of 445, 115, and 132 in the respective strata, and that the standard
error of the estimated percentage with gardens would then be 1.17%, as compared with
1.27% in the sample itself.

17.11-Sampling in two stages. Consider the following miscellaneous
group of sampling problems: (1) a study of the vitamin A content of
butter produced by creameries, (2) a study of the protein content of wheat
in the wheat fields in an area, (3) a study of red blood cell counts in a
population of men aged 20-30, (4) a study of insect infestation of the leaves
of the trees in an orchard, and (5) a study of the number of defective
teeth in third-grade children in the schools of a large city. What do these
investigations have in common? First, in each study an appropriate sam-
pling unit suggests itself naturally: the creamery, the field of wheat, the
individual man, the tree, and the school. Secondly, and this is the im-
portant point, in each study the chosen sampling units can be sub-sampled
instead of being measured completely. Indeed, sub-sampling is essential
in the first three studies. No one is going to allow us to take all the butter
produced by a creamery in order to determine vitamin A content, or
all the wheat in a field for the protein determination, or all the blood in a
man in order to make a complete count of his red cells. In the insect
infestation study, it might be feasible, although tedious, to examine all
leaves on any selected tree. If the insect distribution is spotty, however,
we would probably decide to take only a small sample of leaves from any
selected tree in order to include more trees. In the dental study we could
take all the third-grade children in any selected school or we could cover
a larger sample of schools by examining only a sample of children from
the third grade in each selected school.
This type of sampling is called sampling in two stages, or sometimes
sub-sampling. The first stage is the selection of a sample of primary sam-
pling units: the creameries, wheat fields, and so on. The second stage is
the taking of a sub-sample of second-stage units, or sub-units, from each
selected primary unit.
As illustrated by these examples, the two-stage method is sometimes
the only practicable way in which the sampling can be done. Even when
there is a choice between sub-sampling the units and measuring them com-
pletely, two-stage sampling gives the sampler greater scope, since he can
choose both the size of the sample of primary units and the size of the sam-
ple that is taken from a primary unit. In some applications an important
advantage of two-stage sampling is that it facilitates the problem of listing
the population. Often it is relatively easy to obtain a list of the primary
units, but difficult or expensive to list all the sub-units. To list the trees
in an orchard and draw a sample of them is usually simple, but the prob-
lem of making a random selection of the leaves on a tree may be very
troublesome. With two-stage sampling this problem is faced only for
those trees that are in the sample. No complete listing of all leaves in the
orchard is required.
In the discussion of two-stage sampling we assume at first that the
primary units are of approximately the same size. A simple random sam-
ple of n₁ primary units is drawn, and the same number n₂ of sub-units is se-
lected from each primary unit in the sample. The estimated standard
error of the sample mean ȳ per sub-unit is then given by the formula

s_ȳ = (1/√n₁) √[Σ(ȳ_i - ȳ)²/(n₁ - 1)],

where ȳ_i is the mean per sub-unit in the ith primary unit. This formula
does not include the finite population correction, but is reliable enough
provided that the sample contains less than 10% of all primary units.
Note that the formula makes no use of the individual observations on the
sub-units, but only of the primary unit means ȳ_i. If the sub-samples are
taken for a chemical analysis, a common practice is to composite the
sub-sample and make one chemical determination for each primary unit.
With data of this kind we can still calculate s_ȳ.
In section 10.13 you learned about the "components of variance"
technique, and applied it to a problem in two-stage sampling. The data
were concentrations of calcium in turnip greens, four determinations be-
ing made for each of three leaves. The leaf can be regarded as the primary
sampling unit, and the individual determination as the sub-unit. By apply-
ing the components of variance technique, you were able to see how the
variance of the sample mean was affected by variation between determina-
tions on the same leaf and by variation from leaf to leaf. You could also
predict how the variance of the sample mean would change with different
numbers of leaves and of determinations per leaf in the experiment.
Since this technique is of wide utility in two-stage sampling, we shall
repeat some of the results. The observation on any sub-unit is considered
to be the sum of two independent terms. One term, associated with the
primary unit, has the same value for all second-stage units in the primary
unit, and varies from one primary unit to another with variance σ₁². The
second term, which serves to measure differences between second-stage
units, varies independently from one sub-unit to another with variance
σ₂². Suppose that a sample consists of n₁ primary units, from each of
which n₂ sub-units are drawn. Then the sample as a whole contains n₁
independent values of the first term, whereas it contains n₁n₂ independent
values of the second term. Hence the variance of the sample mean ȳ per
sub-unit is

σ_ȳ² = σ₁²/n₁ + σ₂²/(n₁n₂)
The two components of variance, σ₁² and σ₂², can be estimated from
the analysis of variance of a two-stage sample that has been taken. Table
17.11.1 gives the analysis of variance for a study by Immer (6), whose
object was to develop a sampling technique for the determination of the
sugar percentage in field experiments on sugar beets. Ten beets were
chosen from each of 100 plots in a uniformity trial, the plots being the
primary units. The sugar percentage was obtained separately for each
beet. In order to simulate conditions in field experiments, the Between
plots mean square was computed as the mean square between plots within
blocks of 5 plots. This mean square gives the experimental error variance
that would apply in a randomized blocks experiment with 5 treatments.
TABLE 17.11.1
ANALYSIS OF VARIANCE OF SUGAR PERCENTAGE OF BEETS (ON A SINGLE-BEET BASIS)

Source of Variation                       Degrees of Freedom    Mean Square    Parameters Estimated

Between plots (primary units)                     80               2.9254        σ₂² + 10σ₁²
Between beets (sub-units) within plots           900               2.1374        σ₂²

The estimate of σ₁², the Between plots component of variance, is

s₁² = (2.9254 - 2.1374)/10 = 0.0788,

the divisor 10 being the number of beets (sub-units) taken per plot. As an
estimate of σ₂², the within-plots component, we have

s₂² = 2.1374

Hence, if a new experiment is to consist of n₁ replications, with n₂ beets
sampled from each plot, the predicted variance of a treatment mean is

s_ȳ² = 0.0788/n₁ + 2.1374/(n₁n₂)

We shall illustrate two of the questions that can be answered from
these data. How accurate are the treatment means in an experiment with
6 replications and 5 beets per plot? For this experiment we would expect

s_ȳ = √(0.0788/6 + 2.1374/30) = 0.29%

The sugar percentage figure for a treatment mean would be correct to
within ±(2)(0.29) or 0.58%, with 95% confidence, assuming ȳ approxi-
mately normally distributed.
If the standard error of a treatment mean is not to exceed 0.2%, what
combinations of n₁ and n₂ are allowable? We must have

0.0788/n₁ + 2.1374/(n₁n₂) = (0.2)² = 0.04

Since n₁ and n₂ are whole numbers, they will not satisfy this equation
exactly: we must make sure that the left side of the equation does not
exceed 0.04. You can verify that with 4 replications (n₁ = 4), there must
be 27 beets per plot; with 8 replications, 9 beets per plot are sufficient;
and with 10 replications, 7 beets per plot. As one would expect, the in-
tensity of sub-sampling decreases as the intensity of sampling is increased.
The total size of sample also decreases from 108 beets when n₁ = 4 to 70
beets when n₁ = 10.
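Searching for the allowable combinations by hand is tedious; below is a
small Python sketch of the search (function and variable names are ours):

```python
def var_mean(n1, n2, s1_sq=0.0788, s2_sq=2.1374):
    # Predicted variance of a treatment mean: s1^2/n1 + s2^2/(n1*n2).
    return s1_sq / n1 + s2_sq / (n1 * n2)

# Smallest n2 that keeps the variance at or below (0.2)^2 = 0.04.
for n1 in (4, 6, 8, 10):
    n2 = 1
    while var_mean(n1, n2) > 0.04:
        n2 += 1
    print(n1, n2, n1 * n2)   # n1 = 4, 8, 10 give 27, 9, 7 beets per plot
```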

17.12-The allocation of resources in two-stage sampling. The last
example illustrates a general property of two-stage samples. The same
standard error can be attained for the sample mean by using various
combinations of values of n₁ and n₂. Which of these choices is the best?
The answer depends, naturally, on the cost of adding an extra primary
unit to the sample (in this case an extra replication) relative to that of
adding an extra sub-unit in each primary unit (in this case an extra beet
in each plot). Similarly, in the turnip greens example (section 10.13, page
280) the best sampling plan depends on the relative costs of taking an
extra leaf and of making an extra determination per leaf. Obviously, if
it is cheap to add primary units to the sample but expensive to add sub-
units, the most economical plan will be to have many primary units and
few (perhaps only one) sub-units per primary unit. For a general solution
to this problem, however, we require a more exact formulation of the
costs of various alternative plans.
In many sub-sampling studies the cost of the sample (apart from
fixed overhead costs) can be approximated by a relation of the form

C = c₁n₁ + c₂n₁n₂

The factor c₁ is the average cost per primary unit of those elements of
cost that depend solely on the number of primary units and not on the
amount of sub-sampling. The factor c₂, on the other hand, is the average
cost per sub-unit of those constituents of cost that are directly proportional
to the total number of sub-units.
If advance estimates of these constituents of cost are made from a
preliminary study, an efficient job of selecting the best amounts of sam-
pling and sub-sampling can be done. The problem may be posed in two
different ways. In some studies we specify the desired variance V for the
sample mean, and would like to attain this as cheaply as possible. In
other applications the total cost C that must not be exceeded is imposed
upon us, and we want to get as small a value of V as we can for this outlay.
These two problems have basically the same solution. In either case we
want to minimize the product

VC = (s₁²/n₁ + s₂²/(n₁n₂))(c₁n₁ + c₂n₁n₂)

Upon expansion, this becomes

VC = (s₁²c₁ + s₂²c₂) + n₂s₁²c₂ + s₂²c₁/n₂
It can be shown that this expression has its smallest value when

n₂ = √(s₂²c₁/(s₁²c₂))

This result gives an estimate of the best number of sub-units (beets) per
primary unit (plot). The value of n₁ is found by solving either the cost
equation or the variance equation for n₁, depending on whether cost or
variance has been preassigned.
In the sugar beet example we had s₁² = 0.0788, s₂² = 2.1374, from
which

n₂ = √(2.1374 c₁/(0.0788 c₂)) = 5.2 √(c₁/c₂)
In this study, cost data were not reported. If c₁ were to include the
cost of the land and the field operations required to produce one plot, it
would be much greater than c₂. Evidently a fairly large number of beets
per plot would be advisable. In practice, factors other than the sugar
percentage determinations must also be taken into account in deciding
on costs and number of replications in sugar beet experiments.
In the turnip greens example (section 10.13, page 280), n₁ is the num-
ber of leaves and n₂ the number of determinations of calcium concentra-
tion per leaf. Also, in the present notation,

s₁² = s_L² = 0.0724
s₂² = s² = 0.0066

Hence, the most economical number of determinations per leaf is estimated
to be

n₂ = √(c₁s₂²/(c₂s₁²)) = √(0.0066/0.0724) √(c₁/c₂) = 0.30 √(c₁/c₂)

In practice, n₂ must be a whole number, and the smallest value it can have
is 1. This equation shows that n₂ = 1, i.e., one determination per leaf,
unless c₁ is at least 25 times c₂. Actually, since c₂ includes the cost of
the chemical determinations, it is likely to be greater than c₁. The
relatively large variation among leaves and the cost considerations both
point to the choice of one determination per leaf.
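Since the same square-root rule settles n₂ in both examples, it is worth a
tiny function. A Python sketch (the rounding policy is ours):

```python
from math import sqrt

def optimum_n2(s1_sq, s2_sq, c1_over_c2):
    # n2 = sqrt(s2^2*c1 / (s1^2*c2)), forced to be at least 1.
    return max(1, round(sqrt(s2_sq / s1_sq * c1_over_c2)))

print(optimum_n2(0.0788, 2.1374, 1))    # sugar beets, c1 = c2: 5 per plot
print(optimum_n2(0.0724, 0.0066, 1))    # turnip greens, c1 = c2: 1 per leaf
print(optimum_n2(0.0724, 0.0066, 25))   # reaches 2 only when c1 = 25*c2
```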
This example also illustrates that a choice of n₂ can often be made
from the equation even when information about relative costs is not too
definite. This is because the equation often leads to the same value of n₂
for a wide range of ratios of c₁ to c₂. Brooks (14) gives helpful tables for
this situation. The values of n₂ are subject to sampling errors; for a
discussion, see (2).
In section 10.14 you studied an example of three-stage sampling of
turnip green plants. The first stage was represented by plants, the second
by leaves within plants, and the third by determinations within a leaf. In
the notation of this section, the estimated variance of the sample mean is

s_ȳ² = s₁²/n₁ + s₂²/(n₁n₂) + s₃²/(n₁n₂n₃)

Copying the equation given in section 10.14, we have

s_ȳ² = 0.3652/n₁ + 0.1610/(n₁n₂) + 0.0067/(n₁n₂n₃)
To find the most economical values of n₁, n₂, and n₃, we set up a cost equa-
tion of the form

C = c₁n₁ + c₂n₁n₂ + c₃n₁n₂n₃

and proceed to minimize the product of the variance and the cost as before.
The solutions are

n₂ = √(c₁s₂²/(c₂s₁²)),    n₃ = √(c₂s₃²/(c₃s₂²)),

while n₁ is found by solving either the cost or the variance equation. Note
that the formula for n₂ is the same in three-stage as in two-stage sampling,
and that the formula for n₃ is the natural extension of that for n₂. Putting
in the numerical values of the variance components, we obtain
in the numerical values of the variance components, we obtain

n, = J C
,(0.1610) = 0.66JE.!.,
c,(0.3652) c,
c,(0.0067)
c,(0.161O)
= 0.20Jc,
c,

Since the computed value of n, would be less than I for any likely value
of c,/c" more than one determination" per leaf is uneconomical. The
optimum number n, of leaves per plant depends on the ratio c,/c,. This
will vary with the conditions of experimentation. If many plants 'are
being grown for some other purpose, so that ample numbers are available
for sampling, c, includes only the extra costs involved in collecting a
sample from many plants instead of a few plants. In this event the opti-
mum n, might also turn out to be I. If the cost of growing extra plants is
to be included inc" the optimum n, might be higher than 1.
EXAMPLE 17.12.1--This is the analysis of variance, on a single sub-sample basis, for
wheat yield and percentage of protein from data collected in a wheat sampling survey in
Kansas in 1939 (25).

                               Yield (Bushels Per Acre)          Protein (%)
Source of Variation          Degrees of      Mean            Degrees of      Mean
                             Freedom         Square          Freedom         Square

Fields                          659           434.52            659           21.388
Samples within fields           660            67.54            609            2.870

Two sub-samples were taken at random from each of 660 fields. Calculate the com-
ponents of variance for yield. Ans. s₁² = 183.49, s₂² = 67.54. Note: Some of the protein
figures were evidently not recorded separately for each sub-sample, since there are only
609 d.f. within fields.
EXAMPLE 17.12.2--For yield, estimate the variance of the sample mean for samples
consisting of (i) 1 sub-sample from each of 800 fields, (ii) 2 sub-samples from each of 400
fields, (iii) 8 samples from each of 100 fields. Ans. (i) 0.313, (ii) 0.543, (iii) 1.919.
EXAMPLE 17.12.3--With 2 sub-samples per field, it is desired to take enough fields so
that the standard error of the mean yield will be not more than 1/2 bushel, and at the same
time the standard error of the mean protein percentage will be not more than 1/8%. How
many fields are required? Ans. about 870.
EXAMPLE 17.12.4--Suppose that it takes on the average 1 man-hour to locate and
pace a field that is to be sampled. A single protein determination is to be made on the bulked
sub-samples from any field. The cost of a determination is equivalent to 1 man-hour. It
takes 15 minutes to locate, cut, and tie a sub-sample. From these data and the analysis of
variance for protein percentage (example 17.12.1), compute the variance-cost product, VC,
for each value of n₂ from 1 to 5. What is the most economical number of sub-samples per
field? Ans. 2. How much more does it cost, for the same V, if 4 sub-samples per field are
used? Ans. 12%.
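A sketch of the variance-cost products for example 17.12.4 in Python.
Our reading of the costs, in man-hours per field, is c₁ = 1 (locating the
field) + 1 (the single bulked determination) = 2, and c₂ = 0.25 per
sub-sample:

```python
# Protein components from example 17.12.1: s1^2 = (21.388 - 2.870)/2.
s1_sq, s2_sq = (21.388 - 2.870) / 2, 2.870
c1, c2 = 2.0, 0.25   # assumed cost breakdown, in man-hours

for n2 in range(1, 6):
    # VC = (s1^2 + s2^2/n2) * (c1 + c2*n2); n1 cancels out of the product.
    vc = (s1_sq + s2_sq / n2) * (c1 + c2 * n2)
    print(n2, round(vc, 2))   # smallest at n2 = 2; n2 = 4 costs about 12% more
```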

17.13-Selection with probability proportional to size. In many im-
portant sampling problems, the natural primary sampling units vary in
size. In national surveys conducted to obtain information about the
characteristics of the population, the primary unit is often an adminis-
trative area (e.g., similar to a county). A relatively large unit of this type
cuts down travel costs and makes supervision and control of the field
work more manageable. Such units often vary substantially in the num-
ber of people they contain. A sample of the houses in a town may use
blocks as first-stage units, the number of houses per block ranging from
0 to 40. Similarly, schools, hospitals, and factories all contain different
numbers of individuals.
With primary units like this, the between-primary-unit variances of
the principal measurements may be large; for example, some counties
are relatively wealthy and some are poor. In these circumstances,
Hansen and Hurwitz (15) pointed out the advantages of selecting primary
units with probabilities proportional to their sizes. To illustrate, con-
sider a population of three schools, having 600, 300, and 100 children.
The objective is to estimate the population mean per child for some char-
acteristic. The means per child in the three schools are Ȳ₁ = 2, Ȳ₂ = 4,
Ȳ₃ = 1. Hence, the population mean per child is

Ȳ = [(600)(2) + (300)(4) + (100)(1)]/1,000 = 2.5
To simplify things further, suppose that only one school is to be
chosen, and that the variation in Y between children within the same
school is negligible. It follows that we need not specify how the second-
stage sample of children from a school is to be drawn, since any sample
gives the correct mean for the chosen school.
In selecting the school with probability proportional to size (pps),
the three schools receive probabilities 0.6, 0.3, and 0.1, respectively, of
being drawn. We shall compare the mean square error of the estimate
given by this method with that given by selecting the schools with equal
probabilities. Table 17.13.1 contains the calculations.

TABLE 17.13.1
SELECTION OF A SCHOOL WITH PROBABILITY PROPORTIONAL TO SIZE

              No. of      Probability     Mean per     Error of
School        Children    of Selection    Child Ȳ_i    Estimate Ȳ_i - Ȳ    (Ȳ_i - Ȳ)²

1                600          0.6             2              -0.5              0.25
2                300          0.3             4              +1.5              2.25
3                100          0.1             1              -1.5              2.25

Population     1,000          1.0           2.5

If the first school is selected, its estimate is in error by (2.0 - 2.5)
= -0.5, and so on. These errors and their squares appear in the two right-
hand columns of table 17.13.1. In repeated sampling with probability
proportional to size, the first school is drawn 60% of the time, the second
school 30%, and the third school 10%. The mean square error is therefore

M.S.E._pps = (0.6)(0.25) + (0.3)(2.25) + (0.1)(2.25) = 1.05

If, alternatively, the schools are drawn with equal probability, the M.S.E.
is

M.S.E._eq = (1/3)[(0.25) + (2.25) + (2.25)] = 1.58

This M.S.E. is about 50% higher than that given by pps selection.
You may ask: Does this result depend on the choice or the order of
the means, 2, 4, 1, assigned to schools 1, 2, and 3? The answer is yes.
With means 4, 2, 1, you will find M.S.E._pps = 1.29, M.S.E._eq = 2.14, the
latter being 66% higher. Over the six possible orders of the numbers
1, 2, 4, the ratio M.S.E._eq/M.S.E._pps varies from 0.93 to 2.52. However,
the ratio of the averages, M.S.E._eq/M.S.E._pps, taken over all six possible
orders, does not depend on the numbers 1, 2, 4. With N primary units in
the population, this ratio is

M.S.E._eq/M.S.E._pps = [(N - 1) + N Σ(π_i - π̄)²] / [(N - 1) - N Σ(π_i - π̄)²],

where π_i is the probability of selection (relative size) of the ith school.
Clearly, this ratio exceeds one unless all π_i are equal; that is, all schools are
the same size.
The reason why it usually pays to select large units with higher prob-
abilities is that the population mean depends more on the means of the
large units than on those of the small units. The large units are therefore
likely to give better estimates.
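The two mean square errors in table 17.13.1 can be reproduced with a
few lines of Python (names are ours):

```python
sizes = [600, 300, 100]   # children per school
means = [2.0, 4.0, 1.0]   # mean per child in each school

N = sum(sizes)
Y_bar = sum(s * m for s, m in zip(sizes, means)) / N   # 2.5
sq_err = [(m - Y_bar)**2 for m in means]

mse_pps = sum((s / N) * e for s, e in zip(sizes, sq_err))   # 1.05
mse_equal = sum(sq_err) / len(sizes)                        # 1.58
print(mse_pps, mse_equal)
```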
With two-stage sampling, a simple method is to select n primary units
with pps and take an equal number of sub-units (e.g., children) in every
selected primary unit. This method gives every sub-unit in the population
the same chance of being in the sample. The sample mean per sub-unit ȳ
is an unbiased estimate of the corresponding population mean, and its
estimated variance is obtained by the simple formula

s_ȳ² = Σ(ȳ_i - ȳ)²/n(n - 1),     (17.13.1)

where ȳ_i is the mean of the sample from the ith primary unit.
We have illustrated only the simplest case. Formula 17.13.1 as-
sumes that the n units are selected with replacement (i.e., that a unit can
be chosen more than once). Some complications arise when we select
units without replacement. Often, the sizes of the units are not known
exactly, and have to be estimated in advance. Considerations of cost or of
the structure of variability in the population may lead to the selection of
units with probabilities that are unequal, but are proportional to some
quantity other than the sizes. For details, see the references. In extensive
surveys, multistage sampling with unequal probabilities of selection of
primary units is the commonest method in current practice.

17.14-Ratio and regression estimates. The ratio estimate is a differ-
ent way of estimating population totals (or means) that is useful in many
sampling problems. Suppose that you have taken a sample in order to
estimate the population total of a variable, Y, and that a complete count
of the population was made on some previous occasion. Let X denote the
value of the variable on the previous occasion. You might then compute
the ratio

R = ΣY/ΣX,

where the sums are taken over the sample. This ratio is an estimate of the
present level of the variate relative to that on the previous occasion.
On multiplying the ratio by the known population total on the previous
occasion (i.e., by the population total of X), you obtain the ratio estimate
of the population total of Y. Clearly, if the relative change is about the
same on all sampling units, the ratio R will be accurate and the estimate
of the population total will be a good one.
The ratio estimate can also be used when X is some other kind of sup-
plementary variable. The conditions for a successful application of this
estimate are that the ratio Y/X should be relatively constant over the pop-
ulation and that the population total of X should be known. Consider
an estimate of the total amount of a crop, just after harvest, made from a
sample of farms in some region. For each farm in the sample we record the
total yield, Y, and the total acreage, X, of that crop. In this case the ratio,
R = ΣY/ΣX, is the sample estimate of the mean yield per acre. This
is multiplied by the total acreage of the crop in the region, which would
have to be known accurately from some other source. This estimate will
be precise if the mean yield per acre varies little from farm to farm.
The estimated standard error of the ratio estimate Ŷ_R of the popula-
tion total from a simple random sample of size n is, approximately,

s(Ŷ_R) = N √[Σ(Y - RX)²/n(n - 1)]
The ratio estimate is not always more precise than the simpler esti-
mate Nȳ (number of units in population × sample mean). It has been
shown that the ratio estimate is more precise only if ρ, the correlation
coefficient between Y and X, exceeds C_X/2C_Y, where the C's are the co-
efficients of variation. Consequently, ratio estimates must not be used
indiscriminately, although in appropriate circumstances they produce
large gains in precision.
Sometimes the purpose of the sampling is to estimate a ratio, e.g.,
ratio of dry weight to total weight or ratio of clean wool to total wool. The
estimated standard error of the estimate is then

s(R) = (1/x̄) √[Σ(Y - RX)²/n(n - 1)]

This formula has already been given (in a different notation) at the end of
section 17.5, where the estimation of proportions from cluster sampling
was discussed.
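A minimal Python sketch of the ratio estimate and the two standard
errors above. The five sample farms are hypothetical numbers of our own,
used only to show the call:

```python
from math import sqrt

def ratio_estimates(y, x, X_total, N):
    # R = sum(y)/sum(x); ratio estimate of the total of Y is R*X_total.
    n = len(y)
    R = sum(y) / sum(x)
    root = sqrt(sum((yi - R * xi)**2 for yi, xi in zip(y, x)) / (n * (n - 1)))
    se_R = root / (sum(x) / n)   # s(R) = root / x-bar
    se_total = N * root          # approximate s.e. of the estimated total
    return R, R * X_total, se_R, se_total

# Hypothetical: yields y, acreages x, known total acreage 1,200, N = 100 farms.
print(ratio_estimates([30, 44, 24, 52, 50], [10, 15, 8, 18, 16], 1200, 100))
```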
In chapter 6 the linear regression of Y on X and its sample estimate,

Ŷ = ȳ + bx,

were discussed. With an auxiliary variable, X, you may find that when
you plot Y against X from the sample data, the points appear to lie close
to a straight line, but the line does not go through the origin. This implies
that the ratio Y/X is not constant over the sample. As pointed out in
section 6.19, it is then advisable to use a linear regression estimate instead
of the ratio estimate. For the population total of Y, the linear regression
estimate is

NŶ = N{ȳ + b(X̄ - x̄)},

where X̄ is the population mean of X. The term inside the brackets is
the sample mean, ȳ, adjusted for regression. To see this, suppose that you
have taken a sample in which ȳ = 2.35, x̄ = 1.70, X̄ = 1.92, b = +0.4.
Your first estimate of the population mean would be ȳ = 2.35. But in
the sample the mean value of X is too low by an amount (1.92 - 1.70)
= 0.22. Further, the value of b tells you that unit increase in X is accom-
panied, on the average, by +0.4 unit increase in Y. Hence, to correct
for the low value of the mean of X, you increase the sample mean by the
amount (+0.4)(0.22). Thus the adjusted value of ȳ is

2.35 + (+0.4)(0.22) = 2.44 = ȳ + b(X̄ - x̄)

To estimate the population total, this value is multiplied by N, the number
of sampling units in the population.
The standard error of the estimated population total is, approxi-
mately,

N s_y·x/√n,

where s_y·x is the standard deviation of the deviations from the fitted
regression. If a finite population correction is required in the standard
error formulas presented in this section, insert the factor √(1 - φ). In
finite populations the ratio and regression estimates are both slightly
biased, but the bias is seldom important in practice.
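The adjustment is a one-line computation. A Python sketch using the
worked numbers above (N = 100 is a hypothetical population size):

```python
def regression_estimate_total(y_bar, x_bar, X_bar, b, N):
    # N * (sample mean adjusted for regression on X).
    return N * (y_bar + b * (X_bar - x_bar))

print(regression_estimate_total(2.35, 1.70, 1.92, 0.4, 100))
# 243.8, i.e., 100 times the adjusted mean 2.44 (before rounding)
```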
17.15-Further reading. The general books on sample surveys that
have become standard, (2), (3), (4), (5), (13), involve roughly the same level
of mathematical difficulty and knowledge of statistics. Reference (3) is
oriented towards applications in business, and reference (13) towards
those in agriculture. Another good book for agricultural applications, at
a lower mathematical level, is (16).
Useful short books are (17), an informal, popular account of some
of the interesting applications of survey methods, (18), which conducts the
reader painlessly through the principal results in probability sampling
at about the mathematical level of this chapter, and (19), which discusses
the technique of constructing interview questions.
Books and papers have also begun to appear on some of the common
specific types of application. For sampling a town under U.S. conditions,
with the block as primary sampling unit, references (20) and (21) are rec-
ommended. Reference (22), intended primarily for surveys by health
agencies to check on the immunization status of children, gives instruc-
tions for the sampling of attributes in local areas, while (24) deals with the
sampling of hospitals and patients. Much helpful advice on the use of
sampling in agricultural censuses is found in (23). Sampling techniques
for estimating the volume of timber of the principal types and age-classes
in forestry are summarized in (11), while (9) reviews the difficult problem
of estimating wildlife populations.
REFERENCES
1. A. R. CLAPHAM. J. Agric. Sci., 19:214 (1929).
2. W. G. COCHRAN. Sampling Techniques, 2nd ed. Wiley, New York (1963).
3. W. EDWARDS DEMING. Sample Design in Business Research. Wiley, New York (1960).
4. M. H. HANSEN, W. N. HURWITZ, and W. G. MADOW. Sample Survey Methods and
   Theory. Wiley, New York (1953).
5. L. KISH. Survey Sampling. Wiley, New York (1965).
6. F. R. IMMER. J. Agric. Res., 44:633 (1932).
7. A. J. KING, D. E. MCCARTY, and M. MCPEAK. USDA Tech. Bull. 814 (1942).
8. J. A. RIGNEY and J. FIELDING REED. J. Amer. Soc. Agron., 39:26 (1947).
9. L. W. SCATTERGOOD. Chap. 20 in Statistics and Mathematics in Biology. Iowa State
   College Press (1954).
10. F. F. STEPHAN. J. Marketing, 6:38 (1941).
11. A. A. HASEL. Chap. 19 in Statistics and Mathematics in Biology. Iowa State College
    Press (1954).
12. Q. M. WEST. Mimeographed Report. Cornell Univ. Agric. Exp. Sta. (1951).
13. F. YATES. Sampling Methods for Censuses and Surveys, 3rd ed. Charles Griffin,
    London (1960).
14. S. BROOKS. J. Amer. Statist. Ass., 50:398 (1955).
15. M. H. HANSEN and W. N. HURWITZ. Ann. Math. Statist., 14:333 (1943).
16. M. R. SAMPFORD. An Introduction to Sampling Theory. Oliver and Boyd, Edinburgh
    (1962).
17. M. J. SLONIM. Sampling in a Nutshell. Simon and Schuster, New York (1960).
18. A. STUART. Basic Ideas of Scientific Sampling. Charles Griffin, London (1962).
19. S. L. PAYNE. The Art of Asking Questions. Princeton University Press (1951).
20. T. D. WOOLSEY. "Sampling Methods for a Small Household Survey," Public Health
    Monographs, No. 40 (1956).
21. L. KISH. Amer. Soc. Rev., 17:761 (1952).
22. R. E. SERFLING and I. L. SHERMAN. Attribute Sampling Methods. U.S. Govt. Printing
    Office, Washington, D.C. (1965).
23. S. S. ZARKOVICH. Sampling Methods and Census. FAO, Rome (1965).
24. J. HESS, D. C. RIEDEL, and T. B. FITZPATRICK. Probability Sampling of Hospitals and
    Patients. University of Michigan, Ann Arbor, Mich. (1961).
25. A. J. KING and D. E. MCCARTY. J. Marketing, 6:462 (1941).
APPENDIX TABLES

List of Appendix Tables

A 1      Random digits                                                543
A 2      Normal distribution, ordinates                               547
A 3      Normal distribution, cumulative frequency                    548
A 4      Student's t, percentage points                               549
A 5      Chi-square, χ², percentage points                            550
A 6 (i)  Test for skewness, 5% and 1% points of g₁                    552
A 6 (ii) Test for kurtosis, 5% and 1% points of g₂                    552
A 7 (i)  Range analog of t, 10%, 5%, 2%, and 1% points                553
A 7 (ii) Two-sample range analog of t, 10%, 5%, 2%, and 1% points     554
A 8      Sign test, 10%, 5%, and 1% points                            554
A 9      Signed rank test, 5% and 1% points                           555
A 10     Two-sample signed rank test, 5% and 1% points                555
A 11     Correlation coefficient, r, 5% and 1% points                 557
A 12     Transformed correlations, z in terms of r                    558
A 13     Transformed correlations, r in terms of z                    559
A 14 (i)  F, variance ratio, 5% and 1% points                         560
A 14 (ii) F, variance ratio, 25%, 10%, 2.5%, and 0.5% points          564
A 15     Studentized range, Q, 5% points                              568
A 16     Angular transformation, angle = arcsin √percentage           569
A 17     Orthogonal polynomial values                                 572
A 18     Square roots                                                 573

Notes
Interpolation. In analyses of data and in working the examples in this
book, use of the nearest entry in any Appendix table is accurate enough
in most cases. The following examples illustrate linear interpolation,
which will sometimes be needed.
1. Find the 5% significance level of χ² for 34 degrees of freedom.
For P = 0.050, table A 5 gives

d.f.      30        34        40
χ²      43.77        ?      55.76

Calculate (34 - 30)/(40 - 30) = 0.4. Since

34 = 30 + 0.4(40 - 30),

the required value of χ² is

43.77 + 0.4(55.76 - 43.77) = 43.77 + 0.4(11.99) = 48.57

Alternatively, this value can be computed as

(0.4)χ²₄₀ + (0.6)χ²₃₀ = (0.4)(55.76) + (0.6)(43.77) = 48.57
Note that 0.4 multiplies χ²₄₀, not χ²₃₀.
2. An analysis gave an F value of 2.04 for 3 and 18 d.f. Find the
significance probability. For 3 and 18 d.f., table A 14, part II, gives the
following entries:

P       0.25       ?       0.10
F       1.49      2.04     2.42

Calculate (2.04 - 1.49)/(2.42 - 1.49) = 0.55/0.93 = 0.59. By the alterna-
tive method in the preceding example,

P = (0.59)(0.10) + (0.41)(0.25) = 0.16
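Both interpolations follow the same rule, which can be written once in
Python (the function is ours):

```python
def interpolate(x, x0, y0, x1, y1):
    # Linear interpolation between table entries (x0, y0) and (x1, y1).
    w = (x - x0) / (x1 - x0)
    return y0 + w * (y1 - y0)

print(interpolate(34, 30, 43.77, 40, 55.76))      # 48.57 (chi-square example)
print(interpolate(2.04, 1.49, 0.25, 2.42, 0.10))  # 0.16  (F-table example)
```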
Finding Square Roots. Table A 18 is a table of square roots. To save
space the entries jump by 0.02 instead of 0.01, but interpolation will rarely
be necessary. With very large or very small numbers, mistakes in finding
square roots are common. The following examples should clarify the
procedure.

Step        (1)            (2)            (3)                     (4)
          Number         Mark Off     Column Read    Reading   Square Root

         6,028.0         60,28.0        √10n           7.76        77.6
           397.2          3,97.2        √n             1.99        19.9
            46.38          46.38        √10n           6.81         6.81
             0.194          0.19,4      √10n           4.40        0.440
             0.000893       0.00,08,93  √n             2.99        0.0299

In step (I). mark olfthedigits in/wos to the right or left oftheclecimal


point. Step (2) tells which column of the square root table is to be read.
Witl'. >.97.2 and 0.00,0\\,93 read tM ..jn cn\umn, because tl'.ere is a singk
digit (3 or 8) to the left of the first comma that has any non-zero digits to .
its left. If there are (lVo digits to the left of the first comma, as in 6<l,28.0,
read the "lIOn column. Step 0) gives the reading. taken directly from the
nearest entry in the table.
The final step (4) finds the actual square roots. Consider, first. num-
hers greater than I. If column (I) has no comma to the left of the decimal,
as with 46.38. the square root has one digit to the left of the decimal. If
column (I) has one comma to the left of the decimal, as with 60,28.0 and
3.97.2. the square root has two digits to the left of the decimal. and so on.
With numbers smaller than I. replace any pair 00 to Ihe right of the deci-
mal by a single O. Thus, the square root of 0.00,08.93 is 0.0299 as shown.
The square root of 0.00,00,08,93 is 0.00299.
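The mark-off rule amounts to writing the number as m × 100^k with m between
1 and 100, since shifting the decimal point by one pair of digits changes the
square root by a factor of 10. A minimal sketch (Python, added for illustration
and not part of the original text):

    import math

    def table_square_root(x):
        # Shift the decimal point an even number of places so that the
        # marked-off leading part m lies between 1 and 100; then
        # sqrt(x) = sqrt(m) * 10**k, where k is the number of pairs shifted.
        k = math.floor(math.log10(x) / 2)
        m = x / 100.0 ** k
        return math.sqrt(m) * 10.0 ** k

    for x in [6028.0, 397.2, 46.38, 0.194, 0.000893]:
        print(x, round(table_square_root(x), 4))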
TABLE A 1
TEN THOUSAND RANDOMLY ASSORTED DIGITS

00-04 05-09 10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49

00 544i>3 22662 65905 70639 79365 6738! 29085 69831 47058 08186
01 15389 85205 18850 39226 42249 90669 96325 23248 60933 26927
02 85941 40756 82414 020\5 \3858 18030 \6269 65978 on85 \5345
03 6\149 69440 11286 882\8 58925 03638 52862 62733 3345\ 71455
04 05219 81619 10651 67079 92511 59888 84502 72095 83463 75577

05 41417 98326 87719 92294 46614 50948 64886 20002 97365 30976
06 28357 94070 20652 35774 16249 75019 21145 05217 47286 76305
07 17783 00015 10806 83091 91530 36466 39981 62481 49177 75779
08 40950 84820 29881 85966 62800 70326 84740 62660 77379 90279
09 82995 64157 66164 41180 10089 41157 78258 96488 88629 37231

10 96754 17616 55659 44105 47361 34833 86679 23930 53249 27083
II 34357 88040 53364 71726 45690 66334 60332 22554 90600 711D
12 06318 37403 49927 57715 50423 67371 63116 48888 21505 80182
13 62111 52820 07243 79931 89292 84767 85693 73947 22278 11551
14 47534 09243 67879 00544 23410 12740 02540 54440 32949 13491

15 98614 75993 84460 62846 59844 14922 48730 73443 48167 34770
16 24856 03648 44898 0935\ 98795 18644 39765 71058 90368 44\04
\7 96887 12479 8062\ 66223 86085 78285 02432 53342 42846 94171
18 9080\ 21472 428\5 77408 37390 76766 52615 32141 30268 18106
19 55165 773\2 83666 36028 28420 70219 81369 41943 47366 41067
20 75884 12952 84318 95108 72305 64620 913\8 89872 45375 85436
21 16777 37116 58550 42958 21460 439\0 01175 87894 81378 10620
22 46230 43877 80207 88877 89380 32992 91380 03164 98656 59337
23 42902 66892 46134 01432 94710 23474 20423 60137 60609 13119
24 ~1007 00333 39693 28039 10\54 95425 39220 \9714 31782 49037

25 68089 01122 511 II 72373 06902 74373 96199 97017 41273 21546
26 204\ I 67081 89950 16944 93054 87687 96693 87236 77054 33848
27 58212 13\60 06468 \5718 82627 76999 05999 58680 96739 63700
28 70577 42866 24969 61210 16046 67699 42054 \2696 93758 03283
29 94522 74358 71659 62038 79643 79169 44741 05437 39038 13163
30 42626 BM\9 SS6S\ 88678 \740\ 0)252 99547 32404 \7918 62880
31 \605\ 33763 5'1194 16752 54450 1903\ 58580. 47629 54132 60631
3l 08244 27647 33851 44705 94211 46716 11738 55784 95374 72655
33 59497 04392 09419 89964 51211 04894 72882 17805 21896 83864
34 97155 13428 40293 09985 58434 01412 69124 82171 59058 82859
35 98409 661.,2 95763 47420 20792 61527 ' 20441 39435 11859 41567
36 45476 84882 65109 96597 25930 66790 65706 61203 53634 22557
37 89300 69700 50741 30329 11658 23166 0S400 66669 48708 03887
38 50051 95137 91631 66315 91428 12275 24816 68091 71110 33258
39 3175) 85178 31310 89642 98364 02306 246\7 09609 83942 22716

40 79152 53829 77250 20190 56535 18760 69942 /"7448 33278 48805
4\ 44560 38750 83635 56540 64900 42912 13953 79149 18710 68618
42 68328 83378 63369 71381 39564 05615 42451 64559 97501 65747
43 46939 38689 58625 08342 30459 85863 20781 09284 26333 91717
44 83544 86141 15707 96256 23068 13782 08467 89469 93842 55349
45 91621 00881 Q4900 54224 46177 55309 17852 27491 89415 23466
46 9\896 67126 04151 03795 59077 1\848 12630 98375 52068 60\42
47 55751 62515 21108 80830 02263 29303 37204 96926 30506 09808
48 85156 87689 95493 88842 00664 55017 55539 17771 69448 87530
49 07521 56898 12236 60277 39102 62315 12239 07105 11844 01117
TABLE A 1 (Continued)

50-54 55-59 60-64 65-69 70-74 75-79 80-84 85-89 90-94 95-99
00 59391 58030 52098 82718 87024 82848 04190 96574 90464 29065
01 99567 76364 77204 04615 27062 96621 43918 01896 83991 51141
02 10363 97518 51400 25670 98342 61891 27101 37855 06235 33316
03 86859 19558 64432 16706 99612 59798 32803 67708 15297 28612
04 11258 24591 36863 55368 31721 94335 34936 02566 80972 08188
05 95068 88628 35911 14530 33020 80428 39930 31855 34334 64865
06 54463 47237 73800 91017 36239 71824 83671 39892 60518 37092
07 16874 62677 57412 13215 31389 62233 80827 73917 82802 84420
08 92494 63157 76593 91316 03505 72389 96363 52887 01087 66091
09 15669 . 56689 35682 40844 53256 81872 35213 09840 34471 74441
10 99116 75486 84989 23476 52967 67104 39495 39100 17217 74073
11 15696 10703 65178 90637 63110 17622 53988 71087 84148 11670
12 97720 15369 51269 69620 03388 13699 33423 67453 43269 56720
!l 11666 13841 71681 98000 35979 39719 81899 07449 47985 46967
14 71628 73130 78783 75691 41632 09847 61547 18707 85489 69944
15 40501 51089 99943 91843 41995 88931 73631 69361 05375 15417
16 22518 55576 98215 82068 10798 86211 36584 67466 6937) 40054
17 75112 30485 62173 02132 14878 92879 22281 16783 86352 00077
18 80327 02671 98191 84342 90813 49268 95441 15496 20168 09271
19 60251 45548 02146 05597 48228 81366 34598 72856 66762 17002
20 57430 82270 10421 00540 43648 75888 66049 21511 47676 33444
2i 73528 39559 34434 88596 54086 71693 43132 14414 ]9949 85193
22 25991 65959 70769 64721 86413 33475 42740 06175 82758 66248
23 78388 16638 09134 59980 63806 48472 39)18 35434 24057 74739
24 12477 09965 96657 57994 59439 76330 24596 77515 09577 91871
25 83266 32883 42451 15579 )8155 29793 40914 65990 16255 17777
26 76970 80876 10237 39515 79152 74798 39)57 09054 73579 92359
27 37074 65198 44785 68624 98336 84481 97610 78735 46703 98265
28 83712- 06514 30101 78295 54656 85417 43189 60048 72781 72606
29 20287 56862 69727 94443 64936 08366 27227 05158 50326 59566
30 74261 32592 86538 27041 6Si72 85532 07571 80609 39285 65340
31 64081 49863 08478 96001 18888 14810 70545 89755 59064 07210
32 05617 75818 47750 67ti14 29575 10526 66192 44464 27058 40467
J3 26793 74951 95466 74307 13330 42664 85515 20632 05497 33625
34 65988 72850 48737 54719 52056 01596 03845 35067 03134 70322
35 27366 42271 44300 73399 21105 03280 73457 43093 05192 48657
36 56760 10909 98147 34736 33863 95256 12731 66598 50771 83665
37 72880 43338 93043 58904 59543 23943 11231 83268 65938 81581
38 77888 38100 03062 58103 47961 83841 25878 23746 55903 44115
39 28440 07819 21580 51459 47971 29882 13990 29226 23608 15873
40 63525 94441 77033 12147 51054 49955 58312 76923 96071 05813
41 47606 93410 16359 89033 8%96 47231 64498 31776 05383 39902
42 52669 45030 96279 14709 52372 87832 02735 50803 72744 88208
43 16738 60159 07425 62369 07515 82721 37875 71153 21315 00132
44 59348 11695 45751 15865 74739 05572 32688 20271 65128 14551
45 12900 71775 29845 60774 94n4 2181D 38636 33717 67598 82521
46 75086 23537 49939 33595 13484 97588 28617 17979 70749 35234
47 99495 51434 29181 09993 38190 42553 68922 5::!125 91077 40197
48 26075 31671 45386 36583 934S9 48599 520:!::! 41330 60651 91321
49 11636 93596 23377 51133 95126 61496 42474 45141 46660 42338
TABLE A 1 (Continued)

00-04 05-09 10-14 15-19 20-24 25-29 30-34 35-39 40-44 45-49

50 64249 63664 3%52 40646 97306 31741 07294 84149 46797 82487
51 26538 44249 04050 48174 65570 44072 40192 51153 11397 58212
52 05845 00512 78630 55328 18116 69296 91705 86224 29503 57071
53 74897 68373 67359 51014 33510 83048 17056 72506 82949 54600
54 20872 54570 35017 88132 25730 22626 86723 91691 13191 77212

55 31432 96156 89177 75541 81355 24480 77243 76690 42507 84362
56 66890 61505 01240 00660 05873 13568 76082 79112 57913 93448
57 41894 57790 79970 33106 86904 48119 52503 24130 72824 21627
58 11303 87118 81471 52936 08555 28420 49416 44448 04269 27029
59 54374 57325 16947 45356 78371 10563 97191 53798 12693 27928
6Q 64852 34421 61046 90849 13966 39810 42699 21753 76192 10508
61 16309 20384 09491 91588 97720 89846 30376 76970 23063 35894
62 42587 37065 24526 72602 57589 98131 37292 05%7 26002 51945
63 40177 98590 97161 41682 845J3 67588 62036 49967 01990 12308
64 82309 76128 93965 26743 24141 04838 40254 26065 07938 76236

65 79788 68243 59732 04257 27084 14743 17520 95401 55811 76099
66 40538 79000 89559 25026 42274 23489 34502 75508 06059 86682
67 64016 73598 18609 73150 62463 33102 45205 87440 96767 67042
68 49767 12691 17903 93871 99721 79109 09425 26904 07419 76013
69 76974 55108 29795 08404 82684 00497 51126 79935 57450 55671

70 23854 08480 85983 96025 50117 64610 99425 62291 86943 21541
71 68973 70551 25098 78033 98573 79848 31778 29555 61446 23037
72 36444 93600 65350 !4971 25325 00427 52073 64280 18847 24768
73 03003 87800 07391 11594 21196 00781 32550 57158 58887 7304!
74 17540 26188 36647 78386 04558 61463 57842 90382 77019 24210

75 38916 55809 47982 41968 69760 79422 80154 91486 19180 15100
76 64288 19843 69!l2 42502 48508 28820 59933 72998 99942 10515
77 86809 51564 38040 39418 49915 19000 58050 16899 79952 57849
78 99800 99566 14742 05028 3003) 94889 53381 23656 75787 59223
79 92345 31890 95712 08279 91794 94068 49337 88674 35355 12W~

80 90363 65152 32245 82279 79256 80834 06088 99462 56705 06118
81 64437 32242 48431 04835 39()70 59702 31508 60935 22390 52246
82 91714 53662 28373 J4333 55791· 74758 51144 18827 10704 76803
83 20902 17646 31391 31459 3JJl5 03444 55743 74701 58851 27427
84 12217 86007 70371 52281 14510 76094 %579 54853 78339 20839
85 45177 02863 42307 53571 22532 74921 17735 42201 80540 54721
86 28325 90814 08804 52746 47913 54577 47525 77705 95330 21866
87 29019 28776 56116 54791 64604 08815 46049 71186 34650 14994
88 84979 81353 56219 67062 26146 82567 33122 14124 46240 92973
89 50371 26347 48513 63915 11158 25563 91915 18431 92978 11591

90 53422 06825 69711 67950 64716 18003 49581 45378 99878 61130
91 67453 35651 8?316 41620 32048 70225 47597 33137 31443 51445
92 07294 85353 74819 23445 68237 07202 99515 62282 53809 26685
93 79544 00302 45338 16015 66613 88968 14595 63836 77716 79596
94 64144 85442 82060 46471 24162 39500 87351 ]6637 42833 71875

95 90919 11883 58318 00042 52402 28210 34075 33272 00840 732~8
96 06670 57353 86275 92276 77591 46924 60839 55437 03183 13191
97 36634 93976 52062 83678 41256 60948 18685 48992 19462 96062
98 75101 72891 85745 67106 26010 62107 60885 37503 55461 71213
99 05112 71222 72654 51583 05228 62056 57390 42746 39272 96659
TABLE A 1 (Continued)

50-54 55-59 60-64 65-69 70-74 75-79 80-84 85-89 90-94 95-99
50 32847 31282 03345 89593 69214 70381 78285 20054 91018 16742
51 16916 00041 30236 55023 14253 76582 12092 86533 92426 37M5
52 66176 34037 21005 27137 03193 48970 64625 22394 39622 79085
53 46299 13335 1218O 16861 38043 59292 62675 63631 37020 78195
54 22847 47839 45385 2328'1 47526 54098 45683 55849 51575 64689

55 41851 54160 92320 69936 34803 92479 33399 71160 64777 83378
56 28444 59497 91586 95917 68553 28639 06455 34174 11130 91994
57 47520 62378 98855 83174 13088 16561 68559 26679 0623~ 51]54
58 34978 63271 13142 82681 05271 08822 06490 44984 49307 61717
59 37404 80416 69035 92980 49486 74378 75610 74976 70056 15478

60 32400 65482 52099 53676 74648 94148 65095 69597 52771 71551
61 89262 86332 51718 70663 11623 29834 79820 73002 84886 03591
62 86866 09127 98021 03871 27789 58444 44832 36505 40672 30180
63 90814 14833 08759 74645 05046 94056 99094 65091 32663 73040
64 19192 82756 20553 58446 55376 88914 75096 26119 83898 43816

65 77585 52593 S6612 95766 10019 29531 73064 20953 53523 58136
66 23757 16364 05096 03192 62386 45389 85HZ 18877 55710 96459
67 45989 96257 23850 26216 23309 21526 07425 50254 19455 29315
68 92970 94243 07316 41467 64837 52406 25225 51553 31220 14032
69 74346 59596 40088 98176 17896 86900 Z0249 77753 19099 48885

70 87646 41309 27636 45153 29988 94770 07255 70908 05340 99751
71 50099 71038 45146 06146 55211 99429 43169 66259 97786 59180
72 10127 46900 64984 75348 04115 33624 68774 60013 35515 62556
73 67995 81977 18984 64091 02785 27762 42529 97144 80407 64524
74 26304 80217 84934 82657 69291 35397 98714 )5104 08187 48109
75 81994 41070 56642 64091 31229 02595 1351) 45148 78722 30144
76 59537 34662 79631 89403 65212 09975 06118 86197 58208 16162
77 51228 10937 62396 81460 47331 91403 95007 06047 16846 64809
78 31089 37995 29577 07828 42272 54016 21950 86192 99046 84864
79 38207 97938 93459 75174 79460 55436 57206 87644 21296 43393

80 88666 31142 09474 89712 63153 62333 42212 06140 42594 43671
81 53365 56134 67582 92557 89520 33452 05134 70628 27612 )3738
82 89807 74530 38004 90102 11693 90257 05500 79920 62700 43325
83 18682 81038 85662 90915 91631 22223 91588 80774 07716 12548
84 63571 32579 63942 25371 09234 94592 98475 76884 37635 33608

85 68927 56492 67799 95398 77642 54913 91583 08421 81450 76229
86 56401 63186 39389 88798 31356 89235 97036 32341 33292 73757
87 24333 95603 02359 72942 46287 95382 08452 62862 97869 71775
88 17025 84202 95199 62272 06366 16175 97577 99304 41587 03686
89 02804 08253 52133 20224 68034 50865 57868 22343 55111 03607

90 08298 03879 20995 19850 73090 13191 18963 82244 78479 99121
91 59883 01785 82403 96062 03785 03488 12970 64896 38336 30030
92 46982 06682 62864 91837 74021 89094 39952 64158 79614 78235
93 31121 47266 07661 02051 67599 24471 69843 83696 71402 76287
94 97867 56641 63416 17577 30161 87320 37152 73276 48969 41915

95 57364 86746 08415 146~1 49430 22311 15836 72492 49372 44103
96 09559 26263 69511 28064 75m 44540 13337 10918 79846 54809
97 53873 55571 00608 42661 91332 63956 74087 59008 47493 99581
98 35531 19162 86406 05299 77511 24311 57257 22826 77555 05941
99 28229 88629 25695 9493Z 36721 16197 78742 34974 97528 45447

TABLE A 2
ORDINATES OF THE NORMAL CURVE

Second decimal place in Z

Z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09

0.0  0.3989 0.3989 0.3989 0.3988 0.3986 0.3984 0.3982 0.3980 0.3977 0.3973
0.1   .3970  .3965  .3961  .3956  .3951  .3945  .3939  .3932  .3925  .3918
0.2   .3910  .3902  .3894  .3885  .3876  .3867  .3857  .3847  .3836  .3825
0.3   .3814  .3802  .3790  .3778  .3765  .3752  .3739  .3725  .3712  .3697
0.4   .3683  .3668  .3653  .3637  .3621  .3605  .3589  .3572  .3555  .3538

0.5   .3521  .3503  .3485  .3467  .3448  .3429  .3410  .3391  .3372  .3352
0.6   .3332  .3312  .3292  .3271  .3251  .3230  .3209  .3187  .3166  .3144
0.7   .3123  .3101  .3079  .3056  .3034  .3011  .2989  .2966  .2943  .2920
0.8   .2897  .2874  .2850  .2827  .2803  .2780  .2756  .2732  .2709  .2685
0.9   .2661  .2637  .2613  .2589  .2565  .2541  .2516  .2492  .2468  .2444

1.0   .2420  .2396  .2371  .2347  .2323  .2299  .2275  .2251  .2227  .2203
1.1   .2179  .2155  .2131  .2107  .2083  .2059  .2036  .2012  .1989  .1965
1.2   .1942  .1919  .1895  .1872  .1849  .1826  .1804  .1781  .1758  .1736
1.3   .1714  .1691  .1669  .1647  .1626  .1604  .1582  .1561  .1539  .1518
1.4   .1497  .1476  .1456  .1435  .1415  .1394  .1374  .1354  .1334  .1315

1.5   .1295  .1276  .1257  .1238  .1219  .1200  .1182  .1163  .1145  .1127
1.6   .1109  .1092  .1074  .1057  .1040  .1023  .1006  .0989  .0973  .0957
1.7   .0940  .0925  .0909  .0893  .0878  .0863  .0848  .0833  .0818  .0804
1.8   .0790  .0775  .0761  .0748  .0734  .0721  .0707  .0694  .0681  .0669
1.9   .0656  .0644  .0632  .0620  .0608  .0596  .0584  .0573  .0562  .0551

2.0   .0540  .0529  .0519  .0508  .0498  .0488  .0478  .0468  .0459  .0449
2.1   .0440  .0431  .0422  .0413  .0404  .0396  .0387  .0379  .0371  .0363
2.2   .0355  .0347  .0339  .0332  .0325  .0317  .0310  .0303  .0297  .0290
2.3   .0283  .0277  .0270  .0264  .0258  .0252  .0246  .0241  .0235  .0229
2.4   .0224  .0219  .0213  .0208  .0203  .0198  .0194  .0189  .0184  .0180

2.5   .0175  .0171  .0167  .0163  .0158  .0154  .0151  .0147  .0143  .0139
2.6   .0136  .0132  .0129  .0126  .0122  .0119  .0116  .0113  .0110  .0107
2.7   .0104  .0101  .0099  .0096  .0093  .0091  .0088  .0086  .0084  .0081
2.8   .0079  .0077  .0075  .0073  .0071  .0069  .0067  .0065  .0063  .0061
2.9   .0060  .0058  .0056  .0055  .0053  .0051  .0050  .0048  .0047  .0046

First decimal place in Z

Z     0.0    0.1    0.2    0.3    0.4    0.5    0.6    0.7    0.8    0.9

3    0.0044 0.0033 0.0024 0.0017 0.0012 0.0009 0.0006 0.0004 0.0003 0.0002
4     .0001  .0001  .0001  .0000  .0000  .0000  .0000  .0000  .0000  .0000

TABLE A 3
CUMULATIVE NORMAL FREQUENCY DISTRIBUTION
(Area under the standard normal curve from 0 to Z)

Z     0.00   0.01   0.02   0.03   0.04   0.05   0.06   0.07   0.08   0.09

0.0  0.0000 0.0040 0.0080 0.0120 0.0160 0.0199 0.0239 0.0279 0.0319 0.0359
0.1   .0398  .0438  .0478  .0517  .0557  .0596  .0636  .0675  .0714  .0753
0.2   .0793  .0832  .0871  .0910  .0948  .0987  .1026  .1064  .1103  .1141
0.3   .1179  .1217  .1255  .1293  .1331  .1368  .1406  .1443  .1480  .1517
0.4   .1554  .1591  .1628  .1664  .1700  .1736  .1772  .1808  .1844  .1879

0.5   .1915  .1950  .1985  .2019  .2054  .2088  .2123  .2157  .2190  .2224
0.6   .2257  .2291  .2324  .2357  .2389  .2422  .2454  .2486  .2517  .2549
0.7   .2580  .2611  .2642  .2673  .2704  .2734  .2764  .2794  .2823  .2852
0.8   .2881  .2910  .2939  .2967  .2995  .3023  .3051  .3078  .3106  .3133
0.9   .3159  .3186  .3212  .3238  .3264  .3289  .3315  .3340  .3365  .3389

1.0   .3413  .3438  .3461  .3485  .3508  .3531  .3554  .3577  .3599  .3621
1.1   .3643  .3665  .3686  .3708  .3729  .3749  .3770  .3790  .3810  .3830
1.2   .3849  .3869  .3888  .3907  .3925  .3944  .3962  .3980  .3997  .4015
1.3   .4032  .4049  .4066  .4082  .4099  .4115  .4131  .4147  .4162  .4177
1.4   .4192  .4207  .4222  .4236  .4251  .4265  .4279  .4292  .4306  .4319

1.5   .4332  .4345  .4357  .4370  .4382  .4394  .4406  .4418  .4429  .4441
1.6   .4452  .4463  .4474  .4484  .4495  .4505  .4515  .4525  .4535  .4545
1.7   .4554  .4564  .4573  .4582  .4591  .4599  .4608  .4616  .4625  .4633
1.8   .4641  .4649  .4656  .4664  .4671  .4678  .4686  .4693  .4699  .4706
1.9   .4713  .4719  .4726  .4732  .4738  .4744  .4750  .4756  .4761  .4767

2.0   .4772  .4778  .4783  .4788  .4793  .4798  .4803  .4808  .4812  .4817
2.1   .4821  .4826  .4830  .4834  .4838  .4842  .4846  .4850  .4854  .4857
2.2   .4861  .4864  .4868  .4871  .4875  .4878  .4881  .4884  .4887  .4890
2.3   .4893  .4896  .4898  .4901  .4904  .4906  .4909  .4911  .4913  .4916
2.4   .4918  .4920  .4922  .4925  .4927  .4929  .4931  .4932  .4934  .4936

2.5   .4938  .4940  .4941  .4943  .4945  .4946  .4948  .4949  .4951  .4952
2.6   .4953  .4955  .4956  .4957  .4959  .4960  .4961  .4962  .4963  .4964
2.7   .4965  .4966  .4967  .4968  .4969  .4970  .4971  .4972  .4973  .4974
2.8   .4974  .4975  .4976  .4977  .4977  .4978  .4979  .4979  .4980  .4981
2.9   .4981  .4982  .4982  .4983  .4984  .4984  .4985  .4985  .4986  .4986

3.0   .4987  .4987  .4987  .4988  .4988  .4989  .4989  .4989  .4990  .4990
3.1   .4990  .4991  .4991  .4991  .4992  .4992  .4992  .4992  .4993  .4993
3.2   .4993  .4993  .4994  .4994  .4994  .4994  .4994  .4995  .4995  .4995
3.3   .4995  .4995  .4995  .4996  .4996  .4996  .4996  .4996  .4996  .4997
3.4   .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4997  .4998

3.6   .4998  .4998  .4999  .4999  .4999  .4999  .4999  .4999  .4999  .4999
3.9   .5000
TABLE A 4
THE DISTRIBUTION OF t (TWO-TAILED TESTS)*

Degrees                   Probability of a Larger Value, Sign Ignored
of
Freedom   0.500   0.400   0.200   0.100   0.050   0.025   0.010   0.005   0.001

  1       1.000   1.376   3.078   6.314  12.706  25.452  63.657 127.321 636.619
  2       0.816   1.061   1.886   2.920   4.303   6.205   9.925  14.089  31.598
  3        .765    .978   1.638   2.353   3.182   4.176   5.841   7.453  12.941
  4        .741    .941   1.533   2.132   2.776   3.495   4.604   5.598   8.610
  5        .727    .920   1.476   2.015   2.571   3.163   4.032   4.773   6.859

  6        .718    .906   1.440   1.943   2.447   2.969   3.707   4.317   5.959
  7        .711    .896   1.415   1.895   2.365   2.841   3.499   4.029   5.405
  8        .706    .889   1.397   1.860   2.306   2.752   3.355   3.832   5.041
  9        .703    .883   1.383   1.833   2.262   2.685   3.250   3.690   4.781
 10        .700    .879   1.372   1.812   2.228   2.634   3.169   3.581   4.587

 11        .697    .876   1.363   1.796   2.201   2.593   3.106   3.497   4.437
 12        .695    .873   1.356   1.782   2.179   2.560   3.055   3.428   4.318
 13        .694    .870   1.350   1.771   2.160   2.533   3.012   3.372   4.221
 14        .692    .868   1.345   1.761   2.145   2.510   2.977   3.326   4.140
 15        .691    .866   1.341   1.753   2.131   2.490   2.947   3.286   4.073

 16        .690    .865   1.337   1.746   2.120   2.473   2.921   3.252   4.015
 17        .689    .863   1.333   1.740   2.110   2.458   2.898   3.222   3.965
 18        .688    .862   1.330   1.734   2.101   2.445   2.878   3.197   3.922
 19        .688    .861   1.328   1.729   2.093   2.433   2.861   3.174   3.883
 20        .687    .860   1.325   1.725   2.086   2.423   2.845   3.153   3.850

 21        .686    .859   1.323   1.721   2.080   2.414   2.831   3.135   3.819
 22        .686    .858   1.321   1.717   2.074   2.406   2.819   3.119   3.792
 23        .685    .858   1.319   1.714   2.069   2.398   2.807   3.104   3.767
 24        .685    .857   1.318   1.711   2.064   2.391   2.797   3.090   3.745
 25        .684    .856   1.316   1.708   2.060   2.385   2.787   3.078   3.725

 26        .684    .856   1.315   1.706   2.056   2.379   2.779   3.067   3.707
 27        .684    .855   1.314   1.703   2.052   2.373   2.771   3.056   3.690
 28        .683    .855   1.313   1.701   2.048   2.368   2.763   3.047   3.674
 29        .683    .854   1.311   1.699   2.045   2.364   2.756   3.038   3.659
 30        .683    .854   1.310   1.697   2.042   2.360   2.750   3.030   3.646

 35        .682    .852   1.306   1.690   2.030   2.342   2.724   2.996   3.591
 40        .681    .851   1.303   1.684   2.021   2.329   2.704   2.971   3.551
 45        .680    .850   1.301   1.680   2.014   2.319   2.690   2.952   3.520
 50        .680    .849   1.299   1.676   2.008   2.310   2.678   2.937   3.496
 55        .679    .849   1.297   1.673   2.004   2.304   2.669   2.925   3.476

 60        .679    .848   1.296   1.671   2.000   2.299   2.660   2.915   3.460
 70        .678    .847   1.294   1.667   1.994   2.290   2.648   2.899   3.435
 80        .678    .847   1.293   1.665   1.989   2.284   2.638   2.887   3.416
 90        .678    .846   1.291   1.662   1.986   2.279   2.631   2.878   3.402
100        .677    .846   1.290   1.661   1.982   2.276   2.625   2.871   3.390

120        .677    .845   1.289   1.658   1.980   2.270   2.617   2.860   3.373
 ∞         .6745   .8416  1.2816  1.6448  1.9600  2.2414  2.5758  2.8070  3.2905

* Parts of this table are reprinted by permission from R. A. Fisher's Statistical Methods
for Research Workers, published by Oliver and Boyd, Edinburgh (1925-1950); from Maxine
Merrington's "Table of Percentage Points of the t-Distribution," Biometrika, 32:300 (1942);
and from Bernard Ostle's Statistics in Research, Iowa State University Press (1954).
TABLE A 5
THE CHI-SQUARE (χ²) DISTRIBUTION: PERCENTAGE POINTS
[The body of this table, pp. 550-51, is illegible in this copy. The Notes quote two of its 5% entries: χ² = 43.77 for 30 df and χ² = 55.76 for 40 df.]
TABLE A 6
(i) TABLE FOR TESTING SKEWNESS
(One-tailed percentage points of the distribution of √b1 = g1 = m3/m2^(3/2))*

Size of    Percentage Points    Standard   |  Size of    Percentage Points    Standard
Sample       5%        1%       Deviation  |  Sample       5%        1%       Deviation

  25       0.711     1.061      0.4354     |   100       0.389     0.567      0.2377
  30       0.662     0.986       .4052     |   125       0.350     0.508       .2139
  35       0.621     0.923       .3804     |   150       0.321     0.464       .1961
  40       0.587     0.870       .3596     |   175       0.298     0.430       .1820
  45       0.558     0.825       .3418     |   200       0.280     0.403       .1700
  50       0.534     0.787       .3264     |   250       0.251     0.360       .1531
  60       0.492     0.723       .3009     |   300       0.230     0.329       .1400
  70       0.459     0.673       .2806     |   350       0.213     0.305       .1298
  80       0.432     0.631       .2638     |   400       0.200     0.285       .1216
  90       0.409     0.596       .2498     |   450       0.188     0.269       .1147
 100       0.389     0.567       .2377     |   500       0.179     0.255       .1089

* Since the distribution of √b1 is symmetrical about zero, the percentage points repre-
sent 10% and 2% two-tailed values. Reproduced from Table 34 B of Tables for Statisticians
and Biometricians, Vol. 1, by permission of Dr. E. S. Pearson and the Biometrika Trustees.

TABLE A 6 (Continued)
(ii) TABLE FOR TESTING KURTOSIS
(Percentage points of the distribution of b2 = m4/m2²)*

Size of      Percentage Points              |  Size of      Percentage Points
Sample   Upper 1%  Upper 5%  Lower 5%  Lower 1%  |  Sample   Upper 1%  Upper 5%  Lower 5%  Lower 1%

  50       4.88      3.99      2.15      1.95    |    600      3.54      3.34      2.70      2.60
  75       4.59      3.87      2.27      2.08    |    650      3.52      3.33      2.71      2.61
 100       4.39      3.77      2.35      2.18    |    700      3.50      3.31      2.72      2.62
 125       4.24      3.71      2.40      2.24    |    750      3.48      3.30      2.73      2.64
 150       4.13      3.65      2.45      2.29    |    800      3.46      3.29      2.74      2.65
                                                 |    850      3.45      3.28      2.74      2.66
 200       3.98      3.57      2.51      2.37    |    900      3.43      3.28      2.75      2.66
 250       3.87      3.52      2.55      2.42    |    950      3.42      3.27      2.76      2.67
 300       3.79      3.47      2.59      2.46    |   1000      3.41      3.26      2.76      2.68
 350       3.72      3.44      2.62      2.50    |   1200      3.37      3.24      2.78      2.71
 400       3.67      3.41      2.64      2.52    |   1400      3.34      3.22      2.80      2.72
 450       3.63      3.39      2.66      2.55    |   1600      3.32      3.21      2.81      2.74
 500       3.60      3.37      2.67      2.57    |   1800      3.30      3.20      2.82      2.76
 550       3.57      3.35      2.69      2.58    |   2000      3.28      3.18      2.83      2.77
 600       3.54      3.34      2.70      2.60    |

* Reproduced from Table 34 C of Tables for Statisticians and Biometricians, by permis-
sion of Dr. E. S. Pearson and the Biometrika Trustees.
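The statistics referred to these tables are computed from the sample moments
about the mean. A minimal sketch (Python, added for illustration and not part
of the original text) of √b1 = m3/m2^(3/2) and b2 = m4/m2²:

    def skewness_and_kurtosis(x):
        # Sample moment statistics for the tests of tables A 6(i) and A 6(ii).
        n = len(x)
        mean = sum(x) / n
        m2 = sum((v - mean) ** 2 for v in x) / n
        m3 = sum((v - mean) ** 3 for v in x) / n
        m4 = sum((v - mean) ** 4 for v in x) / n
        return m3 / m2 ** 1.5, m4 / m2 ** 2

The first value is referred to table A 6(i) (sign ignored, with the two-tailed
interpretation noted in the footnote), the second to table A 6(ii).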

TABLE A 7
(i) SIGNIFICANCE LEVELS OF tw = (X̄ - μ)/w IN NORMAL SAMPLES. TWO-TAILED TEST.
DIVIDE P BY 2 FOR A ONE-TAILED TEST*

              Probability P
Size of
Sample     0.10     0.05     0.02     0.01

  2       3.157    6.353   15.910   31.828
  3       0.885    1.304    2.111    3.008
  4        .529    0.717    1.023    1.316
  5        .388     .507    0.685    0.843

  6        .312     .399     .523     .628
  7        .263     .333     .429     .507
  8        .230     .288     .366     .429
  9        .205     .255     .322     .374
 10        .186     .230     .288     .333

 11        .170     .210     .262     .302
 12        .158     .194     .241     .277
 13        .147     .181     .224     .256
 14        .138     .170     .209     .239
 15        .131     .160     .197     .224

 16        .124     .151     .186     .212
 17        .118     .144     .177     .201
 18        .113     .137     .168     .191
 19        .108     .131     .161     .182
 20        .104     .126     .154     .175

* Taken from more extensive tables by permission of E. Lord and the Editor of Bio-
metrika.

TABLE A 7 (Continued)
(ii) SIGNIFICANCE LEVELS OF (X̄1 - X̄2)/½(w1 + w2) FOR TWO NORMAL
SAMPLES OF EQUAL SIZES. TWO-TAILED TEST*

              Probability P
Size of
Sample     0.10     0.05     0.02     0.01

  2       2.322    3.427    5.553    7.916
  3       0.974    1.272    1.715    2.093
  4        .644    0.813    1.047    1.237
  5        .493     .613    0.772    0.888

  6        .405     .499     .621     .714
  7        .347     .426     .525     .600
  8        .306     .373     .459     .521
  9        .275     .334     .409     .464
 10        .250     .304     .371     .419

 11        .233     .280     .340     .384
 12        .214     .260     .315     .355
 13        .201     .243     .294     .331
 14        .189     .228     .276     .311
 15        .179     .216     .261     .293

 16        .170     .205     .247     .278
 17        .162     .195     .236     .264
 18        .156     .187     .225     .252
 19        .149     .179     .216     .242
 20        .143     .172     .207     .232

* From more extensive tables by permission of E. Lord and the Editor of Biometrika.

TABLE A 8
NUMBERS OF LIKE SIGNS REQUIRED FOR SIGNIFICANCE IN THE SIGN TEST,
WITH ACTUAL SIGNIFICANCE PROBABILITIES. TWO-TAILED TEST

No. of      Significance Level         |  No. of      Significance Level
Pairs      1%       5%       10%       |  Pairs      1%       5%       10%

  5       ......   ......   0(.062)    |   13       1(.003)  2(.022)  3(.092)
  6       ......   0(.031)  0(.031)    |   14       1(.002)  2(.013)  3(.057)
  7       ......   0(.016)  0(.016)    |   15       2(.007)  3(.035)  3(.035)
  8       0(.008)  0(.008)  1(.070)    |   16       2(.004)  3(.021)  4(.077)
  9       0(.004)  1(.039)  1(.039)    |   17       2(.002)  4(.049)  4(.049)
 10       0(.002)  1(.021)  1(.021)    |   18       3(.008)  4(.031)  5(.096)
 11       0(.001)  1(.012)  2(.065)    |   19       3(.004)  4(.019)  5(.063)
 12       1(.006)  2(.039)  2(.039)    |   20       3(.003)  5(.041)  5(.041)
TABLE A 9
SUM OF RANKS AT APPROXIMATE 5% AND 1% LEVELS OF P.* THESE NUMBERS
OR SMALLER INDICATE REJECTION. TWO-TAILED TEST

Number of Pairs    5% Level     1% Level

       7           2(0.047)     0(0.016)
       8           2(0.024)     0(0.008)
       9           6(0.054)     2(0.009)
      10           8(0.049)     3(0.010)
      11          11(0.053)     5(0.009)
      12          14(0.054)     7(0.009)
      13          17(0.050)    10(0.010)
      14          21(0.054)    13(0.011)
      15          25(0.054)    16(0.010)
      16          29(0.053)    19(0.009)

* The figures in parentheses are the actual significance probabilities. Adapted from
the article by Wilcoxon (12, Chapter 5).
TABLE A 10
WILCOXON'S TWO-SAMPLE RANK TEST (THE MANN-WHITNEY TEST).
VALUES OF T AT TWO LEVELS
(These values or smaller cause rejection. Two-tailed test. Take n1 ≤ n2.)

0.05 Level of T

n2    n1: 2   3   4   5   6   7   8   9  10  11  12  13  14  15
4 10
5 6 II 17
6 7 12 18 26
7 7 Il 20 27 36
8 3 8 14 21 29 J8 49
9 ) 8 IS II 31 40 51 63
10 3 9 U 23 32 42 5316578
II
12
4
4
9
10
16
17
24
26
14
44 55
46 58
35
681
71
81
85
96
99 115
Il 4 10 18 27 37
48 60 73 l1li 103 119 I 137
14 4 II 19 28 38
SO 63 76 91 106 , 123 '141 160
IS 4 11 20 29 40 52 65 79 94 110 127 I 145 164' 185
16 4 12 21 11 \42 54 67 82 97 114 . III ISO 169
17 5 12 21 12 43 56 70 84 100 117 il35 : 154 '
18 5 13 22 33 '\45 58 72 81 103. 121 '139
19 5 13 23 14 46 60 74 90 107 124
20 5 )4 24 35 48 62 n 93 110
21 6 14 25 37 50,64 79 95 ,
22 6 15 26 38 51 66 82
23 6 15 27 )9 1 53 68 I,
24 6 16 28 40 \55
25 6 16 2~ 42
26
27
7 17 29 I I
; \ 17
28
I
TABLE A 10 (Continued)

0.01 Level of T

n2    n1: 2   3   4   5   6   7   8   9  10  11  12  13  14  15
l

5                 15
6 10 16 23
7 10 17 24 32
8 II 17 2S 34 43
9 6 II 18 26 35 45 56
10 6 12 19 27 37 47 58 71
II 6 12 20 28 38 oW 61 7. 87
12 7 13 21 30 40 SI 63 76 90 (06
13 7 14 22 31 41 53 65 ~ 93 109 125
14 22 32 43 54 67 81 96 112 129 147
14
IS
7
8 IS 23 33 44 56 7() 84 99 liS Jl3 lSI 171 ..
16 8 15 24 34
']6
~ 58 n 8(1 102 H9 m ISS
17 8 16 2S 47 60 74 89 lOS 122 140
18 8 16 26 37 0111 62 76 92 lOS 125
19 3 9 17 27 .lII SO 64 78 94 III
20 3 9 18 21 19 52 ft6 81 97
21 3 9 18 29 40 53 6IS 83
22 3 10_ 19 29 41 5S 70
23 3 10 19 JO 43 51
2. 3 10 ~ 31 44
25 3 II 20 32
26 3 11 21
27 4 11
21 4
• "I and " 1 are the Dumflers of ases in the two groups. If the JI'OUPS are unequal in
size. "I refers CD the amaller.
Table is reprioted from While (12, Chapter S). who eJltended the IDetbod ofWiJcoltOG.

TABLE A 11
CORRELATION COEFFICIENTS AT THE 5% AND 1% LEVELS OF SIGNIFICANCE

Degrees of                    |  Degrees of
Freedom      5%       1%      |  Freedom      5%       1%

    1       .997    1.000     |     24       .388     .496
    2       .950     .990     |     25       .381     .487
    3       .878     .959     |     26       .374     .478
    4       .811     .917     |     27       .367     .470
    5       .754     .874     |     28       .361     .463
    6       .707     .834     |     29       .355     .456
    7       .666     .798     |     30       .349     .449
    8       .632     .765     |     35       .325     .418
    9       .602     .735     |     40       .304     .393
   10       .576     .708     |     45       .288     .372
   11       .553     .684     |     50       .273     .354
   12       .532     .661     |     60       .250     .325
   13       .514     .641     |     70       .232     .302
   14       .497     .623     |     80       .217     .283
   15       .482     .606     |     90       .205     .267
   16       .468     .590     |    100       .195     .254
   17       .456     .575     |    125       .174     .228
   18       .444     .561     |    150       .159     .208
   19       .433     .549     |    200       .138     .181
   20       .423     .537     |    300       .113     .148
   21       .413     .526     |    400       .098     .128
   22       .404     .515     |    500       .088     .115
   23       .396     .505     |  1,000       .062     .081

Portions of this table were taken from Table VA in Statistical Methods for Research
Workers by permission of Professor R. A. Fisher and his publishers, Oliver and Boyd.

TABLE A 12
TABLE OF z = ½ logₑ [(1 + r)/(1 - r)] TO TRANSFORM THE CORRELATION COEFFICIENT

r    0.00  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09

.0  0.000 0.010 0.020 0.030 0.040 0.050 0.060 0.070 0.080 0.090
.1   .100  .110  .121  .131  .141  .151  .161  .172  .182  .192
.2   .203  .213  .224  .234  .245  .255  .266  .277  .288  .299
.3   .310  .321  .332  .343  .354  .365  .377  .388  .400  .412
.4   .424  .436  .448  .460  .472  .485  .497  .510  .523  .536
.5   .549  .563  .576  .590  .604  .618  .633  .648  .662  .678
.6   .693  .709  .725  .741  .758  .775  .793  .811  .829  .848
.7   .867  .887  .908  .929  .950  .973  .996 1.020 1.045 1.071
.8  1.099 1.127 1.157 1.188 1.221 1.256 1.293 1.333 1.376 1.422

r   0.000 0.001 0.002 0.003 0.004 0.005 0.006 0.007 0.008 0.009

.90 1.472 1.478 1.483 1.488 1.494 1.499 1.505 1.510 1.516 1.522
.91 1.528 1.533 1.539 1.545 1.551 1.557 1.564 1.570 1.576 1.583
.92 1.589 1.596 1.602 1.609 1.616 1.623 1.630 1.637 1.644 1.651
.93 1.658 1.666 1.673 1.681 1.689 1.697 1.705 1.713 1.721 1.730
.94 1.738 1.747 1.756 1.764 1.774 1.783 1.792 1.802 1.812 1.822
.95 1.832 1.842 1.853 1.863 1.874 1.886 1.897 1.909 1.921 1.933
.96 1.946 1.959 1.972 1.986 2.000 2.014 2.029 2.044 2.060 2.076
.97 2.092 2.109 2.127 2.146 2.165 2.185 2.205 2.227 2.249 2.273
.98 2.298 2.323 2.351 2.380 2.410 2.443 2.477 2.515 2.555 2.599
.99 2.646 2.700 2.759 2.826 2.903 2.994 3.106 3.250 3.453 3.800

TABLE A 13
TABLE OF r IN TERMS OF z*

z    0.00  0.01  0.02  0.03  0.04  0.05  0.06  0.07  0.08  0.09

0.0 0.000 0.010 0.020 0.030 0.040 0.050 0.060 0.070 0.080 0.090
.1   .100  .110  .119  .129  .139  .149  .159  .168  .178  .187
.2   .197  .207  .216  .226  .236  .245  .254  .264  .273  .282
.3   .291  .300  .310  .319  .327  .336  .345  .354  .363  .371
.4   .380  .389  .397  .405  .414  .422  .430  .438  .446  .454

.5   .462  .470  .478  .485  .493  .500  .508  .515  .523  .530
.6   .537  .544  .551  .558  .565  .572  .578  .585  .592  .598
.7   .604  .611  .617  .623  .629  .635  .641  .647  .653  .658
.8   .664  .670  .675  .680  .686  .691  .696  .701  .706  .711
.9   .716  .721  .726  .731  .735  .740  .744  .749  .753  .757

1.0  .762  .766  .770  .774  .778  .782  .786  .790  .793  .797
1.1  .800  .804  .808  .811  .814  .818  .821  .824  .828  .831
1.2  .834  .837  .840  .843  .846  .848  .851  .854  .856  .859
1.3  .862  .864  .867  .869  .872  .874  .876  .879  .881  .883
1.4  .885  .888  .890  .892  .894  .896  .898  .900  .902  .903

1.5  .905  .907  .909  .910  .912  .914  .915  .917  .919  .920
1.6  .922  .923  .925  .926  .928  .929  .930  .932  .933  .934
1.7  .935  .937  .938  .939  .940  .941  .942  .944  .945  .946
1.8  .947  .948  .949  .950  .951  .952  .953  .954  .954  .955
1.9  .956  .957  .958  .959  .960  .960  .961  .962  .963  .963

2.0  .964  .965  .965  .966  .967  .967  .968  .969  .969  .970
2.1  .970  .971  .972  .972  .973  .973  .974  .974  .975  .975
2.2  .976  .976  .977  .977  .978  .978  .978  .979  .979  .980
2.3  .980  .980  .981  .981  .982  .982  .982  .983  .983  .983
2.4  .984  .984  .984  .985  .985  .985  .986  .986  .986  .986

2.5  .987  .987  .987  .987  .988  .988  .988  .988  .989  .989
2.6  .989  .989  .989  .990  .990  .990  .990  .990  .991  .991
2.7  .991  .991  .991  .992  .992  .992  .992  .992  .992  .992
2.8  .993  .993  .993  .993  .993  .993  .993  .994  .994  .994
2.9  .994  .994  .994  .994  .994  .995  .995  .995  .995  .995

* r = (e^(2z) - 1)/(e^(2z) + 1).
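Both tables are generated by a single pair of inverse functions, so intermediate
values need not be interpolated when a computer is at hand. A minimal sketch
(Python, added for illustration and not part of the original text):

    import math

    def z_from_r(r):
        # Table A 12: z = (1/2) log_e[(1 + r)/(1 - r)], i.e., math.atanh(r).
        return 0.5 * math.log((1 + r) / (1 - r))

    def r_from_z(z):
        # Table A 13: r = (e^(2z) - 1)/(e^(2z) + 1), i.e., math.tanh(z).
        return (math.exp(2 * z) - 1) / (math.exp(2 * z) + 1)

    print(round(z_from_r(0.90), 3))  # 1.472, as in table A 12
    print(round(r_from_z(1.00), 3))  # 0.762, as in table A 13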
TABLE A 14
(i) THE VARIANCE RATIO, F: 5% AND 1% POINTS (pp. 560-63)
(ii) THE VARIANCE RATIO, F: 25%, 10%, 2.5%, AND 0.5% POINTS (pp. 564-67)
[The body of this table is illegible in this copy.]

TABLE A 15
STUDENTIZED RANGE, Q: 5% POINTS (p. 568)
[The body of this table is illegible in this copy.]
TABLE A 16
ANGLES CORRESPONDING TO PERCENTAGES, ANGLE = ARCSIN √PERCENTAGE,
AS GIVEN BY C. I. BLISS*

%      0      1      2      3      4      5      6      7      8      9

0.0    0    0.57   0.81   0.99   1.15   1.28   1.40   1.52   1.62   1.72
0.1 1.81 1.90 1.99 2.07 2.14 2.22 2.29 2.36 2.43 2.50
0.2 2.56 2.63 2.69 2.75 2.81 2.87 2.92 2.98 3.03 3.09
0.3 3.14 3.19 3.24 3.29 3.34 3.39 3.44 3.49 3.53 3.58
0.4 3.63 3.67 3.72 3.76 3.80 3.85 3.89 3.93 3.97 4.01

0.5 4.05 4.09 4.13 4.17 4.21 4.25 4.29 4.33 4.37 4.40
0.6 4.44 4.48 4.52 4.55 4.59 4.62 4.66 4.69 4.73 4.76
0.7 4.80 4.83 4.87 4.90 4.93 4.97 5.00 5.03 5.07 5.10
0.8   5.13   5.16   5.20   5.23   5.26   5.29   5.32   5.35   5.38   5.41
0.9 5.44 5.47 5.50 5.53 5.56 5.59 5.62 5.65 5.68 5.71

I 5.74 6.02 6.29 6.55 6.80 7.04 7.27 7.49 7.71 7.92
2 8.13 8.33 8.53 8.72 8.91 9.10 9.28 9.46 9.63 9.81
3     9.98  10.14  10.31  10.47  10.63  10.78  10.94  11.09  11.24  11.39
4 11.54 11.68 11.83 11.97 12.11 12.25 12.39 12.52 12.66 12.79

5    12.92  13.05  13.18  13.31  13.44  13.56  13.69  13.81  13.94  14.06
6    14.18  14.30  14.42  14.54  14.65  14.77  14.89  15.00  15.12  15.23
7    15.34  15.45  15.56  15.68  15.79  15.89  16.00  16.11  16.22  16.32
8 16.43 16.54 16.64 16.74 16.85 16.95 17.05 17.16 17.26 17.36
9 17.46 17.56 17.66 17.76 17.85 17.95 18.05 18.15 18.24 18.34

10 18.44 18.53 18.63 18.72 18.81 18.91 19.00 19.09 19.19 19.28
II 19.37 19.46 19.55 19.64 19.73 19.82 19.91 20.00 20.09 20.18
12   20.27  20.36  20.44  20.53  20.62  20.70  20.79  20.88  20.96  21.05
13   21.13  21.22  21.30  21.39  21.47  21.56  21.64  21.72  21.81  21.89
14   21.97  22.06  22.14  22.22  22.30  22.38  22.46  22.55  22.63  22.71

15   22.79  22.87  22.95  23.03  23.11  23.19  23.26  23.34  23.42  23.50
16   23.58  23.66  23.73  23.81  23.89  23.97  24.04  24.12  24.20  24.27
17   24.35  24.43  24.50  24.58  24.65  24.73  24.80  24.88  24.95  25.03
18   25.10  25.18  25.25  25.33  25.40  25.48  25.55  25.62  25.70  25.77
19   25.84  25.92  25.99  26.06  26.13  26.21  26.28  26.35  26.42  26.49

20   26.56  26.64  26.71  26.78  26.85  26.92  26.99  27.06  27.13  27.20
21   27.28  27.35  27.42  27.49  27.56  27.63  27.69  27.76  27.83  27.90
22   27.97  28.04  28.11  28.18  28.25  28.32  28.38  28.45  28.52  28.59
23   28.66  28.73  28.79  28.86  28.93  29.00  29.06  29.13  29.20  29.27
24 29.33 29.40 29.47 29.53 29.60 29.67 29.73 29.80 29.87 29.93

25   30.00  30.07  30.13  30.20  30.26  30.33  30.40  30.46  30.53  30.59
26   30.66  30.72  30.79  30.85  30.92  30.98  31.05  31.11  31.18  31.24
27   31.31  31.37  31.44  31.50  31.56  31.63  31.69  31.76  31.82  31.88
28   31.95  32.01  32.08  32.14  32.20  32.27  32.33  32.39  32.46  32.52
29   32.58  32.65  32.71  32.77  32.83  32.90  32.96  33.02  33.09  33.15

* We are indebted to Dr. C. I. Bliss for permission to reproduce this table, which
appeared in Plant Protection, No. 12, Leningrad (1937).
(Table A 16 continued on pp. 570-71)
TABLE A 16 (Continued)

%      0      1      2      3      4      5      6      7      8      9

30 33.21 33.27 33.34 33.40 33.46 33.52 33.58 33.65 33.71 33.77
31 33.83 33.89 33.96 34.02 34.08 34.14 34.20 34.27 34.33 34.39
32 34.45 34.51 34.57 34.63 34.70 34.76 34.82 34.88 34.94 35.00
33 35.06 35.12 35.18 35.24 35.30 35.37 35.43 35.49 35.55 35.61
34   35.67  35.73  35.79  35.85  35.91  35.97  36.03  36.09  36.15  36.21

35 36.27 36.33 36.39 36.45 36.51 36.57 36.63 36.69 36.75 36.81
36 36.87 36.93 36.99 37.05 37.11 37.17 37.23 37.29 37.35 37.41
37 37.47 37.52 37.58 37.64 37.70 37.76 37.82 37.88 37.94 38.00
38 38.06 38.12 38.17 38.23 38.29 38.35 38.41 38.47 38.53 38.59
39 38.65 38.70 38.76 38.82 38.88 38.94 39.00 39.06 39.11 39.17

40 39.23 39.29 39.35 39.41 39.47 39.52 39.58 39.64 39.70 39.76
41 39.82 39.87 39.93 39.99 40.05 40.11 40.16 40.22 40.28 40.34
42 40.40 40.46 40.51 40.57 40.63 40.69 40.74 40.80 40.86 40.92
43 40.98 41.03 41.09 41.15 41.21 41.27 41.32 41.38 .41.44 41.50
44 41.55 41.61 41.67 41.73 41.78 41.84 41.90 41.96 42.02 42.07

45 42.13 42.19 42.25 42.30 42.36 42.42 42.48 42.53 42.59 42.65
46 42.71 42.76 42.82 42.88 42.94 42.99 43.05 43.11 43.17 43.22
47 43.28 43.34 43.39 43.45 43.51 43.5.7 43.62 43.68 43.74 43.80
48   43.85  43.91  43.97  44.03  44.08  44.14  44.20  44.25  44.31  44.37
49 44.43 44.48 44.54 44.60 44.66 44.71 44.77 44.83 44.89 44.94

50   45.00  45.06  45.11  45.17  45.23  45.29  45.34  45.40  45.46  45.51
51   45.57  45.63  45.69  45.75  45.80  45.86  45.92  45.97  46.03  46.09
52   46.15  46.20  46.26  46.32  46.38  46.43  46.49  46.55  46.61  46.66
53 46.72 46.78 46.83 46.89 46.95 47.01 47.06 47.12 47.18 47.24
54 47.29 47.35 47.41 47.47 47.52 47.58 47.64 47.70 47.75 47.81

55   47.87  47.93  47.98  48.04  48.10  48.16  48.22  48.27  48.33  48.39
56   48.45  48.50  48.56  48.62  48.68  48.73  48.79  48.85  48.91  48.97
57   49.02  49.08  49.14  49.20  49.26  49.31  49.37  49.43  49.49  49.54
58   49.60  49.66  49.72  49.78  49.84  49.89  49.95  50.01  50.07  50.13
59   50.18  50.24  50.30  50.36  50.42  50.48  50.53  50.59  50.65  50.71

60 50.77 50.83 50.89 50.94 51.00 51.06 51.12 51.18 51.24 51.30
61 51.35 51.41 51.47 51.53 51.59 51.65 51.71 51.77 51.83 51.88
62   51.94  52.00  52.06  52.12  52.18  52.24  52.30  52.36  52.42  52.48
63   52.53  52.59  52.65  52.71  52.77  52.83  52.89  52.95  53.01  53.07
64   53.13  53.19  53.25  53.31  53.37  53.43  53.49  53.55  53.61  53.67

65 53.73 53.79 53.85 53.91 53.97 54.03 54.09 54.15 54.21 54.27
66 54.33 54.39 54.45 54.51 54.57 54.63 54.70 54.76 54.82 54.88
67 54.94 55.00 55.06 55.12 55.18 55.24 55.30 55.37 55.43 55.49
68   55.55  55.61  55.67  55.73  55.80  55.86  55.92  55.98  56.04  56.11
69   56.17  56.23  56.29  56.35  56.42  56.48  56.54  56.60  56.66  56.73
TABLE A 16 (Continued)

%      0      1      2      3      4      5      6      7      8      9

70 56.79 56.85 56.91 56.98 57.04 57.10 57.17 57.23 57.29 57.35
71 57.42 57.48 57.54 57.61 57.67 57.73 57.80 57.86 57.92 57.99
72   58.05  58.12  58.18  58.24  58.31  58.37  58.44  58.50  58.56  58.63
73   58.69  58.76  58.82  58.89  58.95  59.02  59.08  59.15  59.21  59.28
74 59.34 59.41 59.47 59.54 59.60 59.67 59.74 59.80 59.87 59.93

75 60.00 60.07 60.13 60.20 60.27 60.33 60.40 60.47 60.53 60.60
76 60.67 60.73 60.80 60.87 60.94 61.00 61.07 61.14 61.21 61.27
77   61.34  61.41  61.48  61.55  61.62  61.68  61.75  61.82  61.89  61.96
78 62.03 62.10 62.17 62.24 62.31 62.37 62.44 62.51 62.58 62.65
79 62.72 62.80 62.87 62.94 63.01 63.08 63.15 63.22 63.29 63.36

80 63.44 63.51 63.58 63.65 63.72 63.79 63.87 63.94 64.01 64.08
81 64.16 64.23 64.30 64.38 64.45 64.52 64.60 64.67 64.75 64.82
82 64.90 64.97 65.05 65.12 65.20 65.27 65.35 65.42 65.50 65.57
83 65.65 65.73 65.80 65.88 65.96 66.03 66.11 66.19 66.27 66.34
84   66.42  66.50  66.58  66.66  66.74  66.81  66.89  66.97  67.05  67.13

85   67.21  67.29  67.37  67.45  67.54  67.62  67.70  67.78  67.86  67.94
86 68.03 68.11 68.19 68.28 68.36 68.44 68.53 68.61 68.70 68.78
87 68.87 68.95 69.04 69.12 69.21 69.30 69.38 69.47 69.56 69.64
88 69.73 69.82 69.91 70.00 70.09 70.18 70.27 70.36 70.45 70.54
89   70.63  70.72  70.81  70.91  71.00  71.09  71.19  71.28  71.37  71.47

90 71.56 71.66 71.76 71.85 71.95 72.05 72.15 72.24 72.34 72.44
91 72.54 72.64 72.74 72.84 72.95 73.05 73.15 73.26 73.36 73.46
92 73.57 73.68 73.78 73.89 74.00 74.11 74.21 74.32 74.44 74.55
93 74.66 74.77 74.88 75.00 75.11 75.23 75.35 75.46 75.58 75.70
94 75.82 75.94 76.06 76.19 76.31 76.44 76.56 76.69 76.82 76.95

95 77.08 77.21 77.34 77.48 77.61 77.75 77.89 78.03 78.17 78.32
96 78.46 78.61 78.76 78.91 79.06 79.22 79.37 79.53 79.69 79.86
97   80.02  80.19  80.37  80.54  80.72  80.90  81.09  81.28  81.47  81.67
98 81.87 82.08 82.29 82.51 82.73 82.96 83.20 83.45 83.71 83.98

99.0 84.26 84.29 84.32 84.35 84.38 84.41 84.44 84.47 84.50 84.53
99.1 84.56  84.59  84.62  84.65  84.68  84.71  84.74  84.77  84.80  84.84
99.2 84.87  84.90  84.93  84.97  85.00  85.03  85.07  85.10  85.13  85.17
99.3 85.20 85.24 85.27 85.31 85.34 85.38 85.41 85.45 85.48 85.52
99.4 85.56 85.60 85.63 85.67 85.71 85.75 85.79 85.83 85.87 85.91

99.5 85.95 85.99 86.03 86.07 86.11 86.15 86.20 86.24 86.28 86.33
99.6 86.37  86.42  86.47  86.51  86.56  86.61  86.66  86.71  86.76  86.81
99.7 86.86  86.91  86.97  87.02  87.08  87.13  87.19  87.25  87.31  87.37
99.8 87.44  87.50  87.57  87.64  87.71  87.78  87.86  87.93  88.01  88.10
99.9 88.19 88.28 88.38 88.48 88.60 88.72 88.85 89.01 89.19 89.43

100.0  90.00
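Since the entries are simply arcsin √p expressed in degrees, values not shown
can be computed directly rather than interpolated. A minimal sketch (Python,
added for illustration and not part of the original text):

    import math

    def angle(percentage):
        # Angular (arcsine) transformation of table A 16, in degrees.
        return math.degrees(math.asin(math.sqrt(percentage / 100.0)))

    print(round(angle(50.0), 2))  # 45.00, as in the table
    print(round(angle(2.5), 2))   # 9.10, as in the table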
TABLE A 17
ORTHOGONAL POLYNOMIAL VALUES (p. 572)
[The body of this table is illegible in this copy.]

TABLE A 18
TABLE OF SQUARE ROOTS

n ..In ..lIOn n ..In .j11Pr n ..jn ..110.


1.00 1.00 3.16 2.00 1.41 4.47 3.00 1.73 5.48
1.02 1.01 3.19 2.02 1.42 4.49 3.02 1.74 5.50
1.04 1.02 3.22 2.04 1.43 4.52 3.04 1.74 5.51
1.06 1.03 3.26 2.06 1.44 4.54 3.06 1.75 5.53
1.08 1.04 3.29 2.08 1.44 4.56 3.08 1.76 5.55
1.10 1.05 3.32 2;10 1.45 4.58 3.10 1.76 5.57
1.12 1.06 3.35 2.12 1.46 4.60 3.12 1.77 5.59
J,14 1.07 3.38 2.14 1.46 4.63 3.14 1.77 5.60
1.16 1.08 3.41 2.16 1.47 4.65 3.16 1.78 5.62
1.18 1.09 3.44 2.18 1.48 4.67 3.18 1.78 5.64

1.20 1.10 3.46 2.20 1.48 4.69 3.20 1.79 5.66


1.22 1.10 3.49 2.22 1.49 4.71 3.22 1.79 5.67
1.24 1.11 3.52 2.24 1.50 4.73 3.24 1.80 5.69
1.26 I 1.12 3.55 2.26 1.50 4.75 3.26 1.81 5.71
1.28 1.13 3.58 2.28 1.51 4.77 3.28 1.81 5.73

1.30 1.14 3.61 2.30 1.52 4.80 3.30 1.82 5.74


1.32 1.15 3.63 2.32 1.52 4.82 3.32 1.82 5.76
1.34 1.16 3.66 2.34 1.53 4.84 3.34 1.83 5.78
1.36 1.17 3.69 2.36 \.54 4.86 3.36 1.83 5.80
1.38 \,17 3.71 2.38 1.54 4.88 3.38 1.84 5.81
1.40 1.18 3:74 2.40 1.55 4.90 3.40 1.84 5.83
\.42 1.19 l.77 2.42 1.56 4.92 3.42 1.85 5.85
1.44 1.20 3.79 2.44 1.56 4.94 3.44 1.85 5.87
1.46 1.21 ·3.82 2.46 1.57 4.96 3.46 1.86 5.88
1.48 1.22 3.85 2.48 1.57 4.98 3.48 1.87 5.90
1.50 1.22 3.87 2.50 1.58 5.00 3.50 1.87 5.92
1.52 1.23 3.90 2.52 1.59 5.02 3.52 1.88 5.9,
1.54 1.24 3.n 2.54 1.59 5.04 3.54 1.88 5.95
1.56 1.25 3.95 2.56 1.60 5.06 3.56 1.89 5.97
1.58 1.26 3.97 2.58 1.61 5.08 3.58 1.89 5.98

1.60 1.26 4.00 2.60 1.61 5.10 3.60 1.90 6.00


1.6;! 1.27 4.02 2.62 1.62 5.12 3.62 1.90 6.02
1.64 1.28 4.05 2.64 1.62 5.14 3.64 1.91 6.03
1.66 1.29 4.07 2.66 1.63' 5.16 3.66 1.91 6.05
1.68. 1.30 4.10 2.68 1.64 5.18 3.68 1.92 6.07
1.70 1.30 4.12 2.70 1.64 5.20 3.70 1.92 6.08
1.72 1.31 4.15 2.72 1.65 5.22 3.72 1.93 6.10
1.74 1.32 4.17 2.74 1.66 5.23 3.74 1.93 6.12
1.76 1.33 4.20 2.76 1.66 5.25 3.76 1.94 6.13
1.78 1.33 4.22 2.78 1.67 5.27 3.78 1.94 6.15

1.80 1.34 4.24 2.80 1.67 5.29 3.80 1.95 6.16


1.82 1.35 4.~7 2.82 1.68 5.31 3.82 1.95 6./~
1.84 1.36 4.29 2.84 1.69 5.33 3.84 1.96 6.20
1.86 /.36 4.3/ 2.86 1.69 5.35 3.86 1.96 6.1/
1.88 1.37 4.34 2.88 1.70 5.37 3.88 1.97 6.23
1.90 /.38 4.36 2.90 1.70 5.39 3.90 1.97 6.25
1.92 1.J9 4.38 2.92 1.71 5.40 3.92 1.98 6:26
1.94 1.39 4.40 2.94 i.71 5.42 3.94 1.98 6.21
1.96 1.40 4.43 2.96 1.72 5.44 3.96 1.99 6.2'
1.98 1..41 4.45 i.98 1.73 5.46 3.98 1.99 6.31
574 Appendix Tables
TABLE OF SQVARE RO()TS-(Continued)

n in i lO n in ,,/1011 n in .,jIOn
-. "
4.00 2.00 6.32 5.00 2.24 7.07 b.tlO 2.45 7.75
4.02 2.00 6.34 5.02 2.24 7.09 6.02 2.45 7.76
4.04 2.01 6.36 5.04 2.24 7.10 6.04 2.46 7.77
4.06 I 2.01 6.37 5.06 2.25 7.11 6.06 2.46 7.78
4.08 2.02 6.39 5.08 2.25 7.13 6.08 2.47 7.80
4.10 2.02 6.40 5.10 2.26 7.14 ('1.]0 2.47 7.81
4.12 2.03 6.42 5.12 2.26 7.16 6.12 2.47 7.82
4.14 2.03 6.43 5.14 2.27 7.17 6.14 2.48 7.84
4.16 2.04 6.45 5.16 2.27 7.18 6.16 2.48 7.85
4.18 2.04 6.47 5.18 2.28 7.20 6.18 2.49 7.86
4.20 2.05 6.48 5.20 2.28 7.21 6.20 2.49 7.87
4.22 2.05 6.50 5.22 2.28 7.22 6.22 2.49 7.89
4.24 2.06 6.51 5.24 2.29 7.24 6.24 2.50 7.90
4.26 2.06 6.53 5.26 2.29 7.25 6.26 2.50 7.91
4.28 2.07 - 6.54 5.28 2.30 7.27 6.2S 2.51 7.92
4.30 2.07 6.56 5.30 2.30 7.28 6.30 2.51 7.94
4.32 2.08 6.57 5.32 2.31 7.29 6.32 2.51 7.95
4.34 2.08 6.59 5.34 2.:n 7.31 6.34 2.52 7.96
4.36 2.09 6.60 5.36 2.32 7.32 6.l6 2.52 7.97
4.38 2.09 6.62 5.38 2.32 7.33 6.J~ 2.53 7.99
4.40 2.10 6.63 • 5.40 2.32 7.l5 6.40 2.53 8.00
I
4.42 2.10 6.65 5.42 I 2.33 D6 6.42 2,53 8.01
4.44
4.46
2.11
2.11
6.66
6.68
5.44
5.46
I 2.ll
2.34
7.38
7.39
6.44
6.46
2.54
2.54
8.02
·8.04
4.48 2.12 6.69 5.48 I 2.34 7.40 6.48. 2.55 8.05

4.50
I 2.12 6.71 5.50 i "-35 7.42 6.50 2.55 8.06
4.52 2.13 6.72 5.52 2.35 7.43 6.52 2.55 8.07
4.54
4.56
! 2.13
2.14
6.74
6.75
5.54
5.56 I 2.35
2.36
7.44
7.46
6.54
6.56
2.56
2.56
8.09
8.10
4.58 2.14 6.77 5.58 2.36 7.47 6.58 2.57 8.11
4.60 2.14 6.78 5.60 2.37 7.48 6.60 2.57 8.12
4.62 2.15 · ......6Jm 5.62 2.37 7.50 6.62 2.57 8.14
4.64 2.15 6.81 5.64 2.37 7.51 6.64 2.58 8.15
4.66 2.16 6.8l 5.66 2.38 7.52 6.66 2.58 8.16
4.68 2.16 6.84 5.68 2.38 7.54 6.68 2.58 8.17
4.70 2.17 6.86 5.70 2.39 7.55 ·6.70 2.59 8.19
4.72 2.17 6.87 5.72 2.39 7.56 6.n 2.59 8.20
4.74 2.18 6.88 5.74 2.40 7.58 6.74 2.60 8.21
4.76 2.18 6.90 5.76 2.40 7.59 6.76 2.60 8.22
4.78 2.19 6.91 5.78 2.40 7.60 6.78 2.60 8.23
4.80
4.82
2.19
2.20
6.93
6.94
5.80
5.82
I 2.41
2.41
7.62
7.63
6.80
6.82
2.6I
2.61
8.25
8.26
4.84 2.20 6.96 5.84 2.42 7.64 6.84 2.62 8.27
4.86 2.20 6.97 5.86 2.42 7.66 6.86 2.62 8.28
4.88 2.21 6.99 5.88 2.42 7.67 6.88 2.62 8.29
4.90 2.21 7.00 5.90 2.43 7.68 6.90 2.63 8.31
4.92 2.22 7.01 5.92 2.43 7.69 6.92 2.63 8.32
4.94 2.22 7.03 5.94 2.44 7.71 6.94 2.63 8.33
4.96 2.23 7.04 5.96 2.44 7.72 6.96 2.64 8.34
4.98 2.23 7.06 5.98 2.45 7.73 6.98 2.64 8.35
575
TABI.F. OF SQUARE ROOTS--(Continued)

n .jn .jIOn n .jn .jIOn n .jn .jlOn


-7.00- 2.65 8.37 8.00 2.83
-
8.94 9.00
1---------
3.00 9.49
7.02 2,65 8.38 8.02 2.83 8.96 9.02 3.00 9.50
7.04 2.65 8.39 8.04 2.84 8.97 9.04 3.01 9.51
7.06 2.66 8.40 8.06 2.8' 8.98 9.06 3.01 9.52
7.08 2.66 8.41 B.08 2.84 8.99 9.08 3.01 9.53

7.10 2.66 8.43 8.10 2.85 9.00 9.10 3.02 9.54


7.12 2.67 8.44 8.12 2.85 9.01 9.12 3.02 9.55
7.14 2.67 8.45 8.14 2.85 9.02 9.14 3.02 9.56
7.16 2.68 8.46 8.16 2.86 9.03 9.16 3.03 9.57
7.18 2.68 8.47 8.18 2.86 9.04 9.18 J.OJ 9.58

7.20 2.68 8.49 8.20 2.86 9.06 9.20 J.03 9.59


7.22 2.69 850 8.22 2.87 9.07 9.22 3.04 9.60
7.24 2.69 8.51 8.24 2.87 9.08 9.24 3.04 9.61
7.26 2.69 8.52 8.26 2.87 9.09 9:26 J.a,! 9.62
7.28 2.70 8.53 8.28 2.88 9.lD 9.28 J.05 9.63

7.JlJ 2.70 8.54 8.30 2.88 9.11 9.30 3.05 9.64


7.32 2.71 8.56 8.32 2.88 9.12 9.32 3.05 9.65
7.34 2.71 8.57 8.34 2.89 9.13 9.34 3.06 9.66
7.36 2.71 8.58 8.36 2.89 9.14 9.36 3.06 9.67
7.38 I, 2.72 8.59 8.J8 2.89 9.15 9.38 3.06 9.68
7.40 I 2.72 8.60 8.40 2.90 9.17 9.40 3.07 9.70
7.42 I 2.72 8.61 8.42 2.90 9.18 9.42 3.07 9.71
7.44 2.73 8.63 8.44 2.91 9.19 9.44 3.07 9.72
7.46 I 2.73 8.64 8.46 2.91 9.20 9.46 3.08 9.73
7:48 I 2.73 8.65 8.48 2.91 9.21 9.48 3.08 9.74

7.50 2.74 8.66 8.50 2.92 9.22 9.50 3.08 9.75


7.52 2.74 8.67 8.52 2.92 9.23 9.52 3.09 9.16
7.54 2.75 8.68 8.54 2.92 9.24 9.54 3.09 9.77
7.56 2.75 8.69 8.56 2.93 9.25 9.56 3.09 9.78
7.58 2.75 8.71 8.58 2.93 9.26 9.58 3.10 9;19

7.60 2.76 8.72 8.60 2.93 9.27 9.60 3.10 9.80


7.62 2.76 8.73 8.62 2.94 9.28 9.62 3.10 9.81
7.64 2.76 8.74 8.64 2.94 9.30 9.64 3.10 9.82
7.66 2.77 8.75 8.66 2.94 9.31 9.66 3.11 9.83
7.68 2.77 8.76 8.68 2.95 9.32 9.68 3.11 9.84
7.70 2.77 8.77 8.70 .:95 9.33 9.70 3.11 9.85
7.72 2.78 8.79 8.72 2.95 9.34 9.n 3.12 9.86
7.74 2.78 8.80 8.74 2.96 9.35 9.74 3.12 9.87
7.76 2.79 8.81 8.76 2.96 9.36 9.7~ ,3.12 9.88
7.78 2.79 8.82 8.78 2.96 9.37 9.78 3.13 9.89
7.80 2.79 8.83 8.80 2.97 9.38 9.80 3.13 9.90
7.82 2.80 8.84 8.82 2.97 9.39 9.82 3.13 9.91
7.84 2.80 8.85 8.84 2.97 9.40 9.84 3.14 9.92
7.86 2.80 8.86 8.86 2.98 9.41 9.86. 3.14 9.93
7.88 2.81 8.87 8.8S" 2.98 9.42 9.88 3.14 9.94

7.90 2.81 8.89 8.90 2.98 9.43 9.90 3.ll 9.95


7.92 2.81 8.90 8.92 2.99 9.44 9.92 3.1S 9.96
7.94 2.82 8.91 8.94 2.99 9.46 9.94 3.15 9.97
7.96 2.82 8.92 8.96 2.99 9.47 9.96 3.16 9.98
7.98 2.82 8.93 8.98 3.00 9.48 9.98 3.16 9.99
Author index

Abbey, H.-2IS, 226 Caffrey, D. 1.-256


Abelson, R. P.-246, 257 Cannon, C. Y.-338
Acton, F. 5.-157,171 Casida. L. E.-134
Anderson. R. L.-380 Catchpole, H. R.-118
Andre, F.-242, 256 Chakravarti, I. M.-235, 256
Anscombe, F. 1.-322, 332, 338 Chapin, F. S.-IIS
Arbaus, A. G.-226, 227 Cheeseman, E, A.-2S6
Armitage, P.-247, 248, 257 Clapham, A. R.-522, 539
Aspin, A. A.-liS, 119 Clarke, G. L.-298, 330, 338
Autrey, K. M.-33& Cochran, W. G.-90, 115, 118, 119,226,255,
256, 298, 337, 338, 380, 418, 446, 503,
Baker, P. M.--226 539
Balaam, L. N.-274, 298 Collins, E. V.-17l
Barnard, M. M.-446 Collins, G. N.·-128. 134
Bartholomew, D. 1.-244, 246, 257 Corner, G. W,'-110, 118
Bartlett, M. 5.-296, 297, 29S, 32S, :l3S, Cox, D. R.-108, 118
376,432,495,496,503 Crall, J. M.---446
Beadles, 1. R.-96, liS Crampton, E. W.-llS
Beale, H. P.-94, 95, liS Crathorne. A. T.-197
Becker, E. R.--4S7, 503 Crow, E. L.-31
Beecher, H. T.-446 Culbert;on, G. C.-198
Behrens, W, V.-lIS, 119 Cushny, A. R.--65
Bennett, B. M.-227
Berkson, J.-165, 166, 171 "
DasGupta,K. P.-380
Bernoulli, J.-32 David, F. N.-198
Best, E. W. R.-226 David, S. T.-198
Black, C. A.-4I& Davies, O. L.-380
Bliss, C.I.-327, 569 Dawes, B.-163, 171
Bortkewitch, L. von-225 Dawson. W. T.-457
Box, G. E. P.-29S, 396 Dean, H. L.-118
Blandt, A. E.-175, 197, S03 Decker, G. C.-242, 256
Breneman, W. R.-102. lIS, 152, 17t. 503 Deming, W. E.-517. 539
Brindley. T. A.-S DeMoivre, A.-32
Brooks, S.-532, 539 Dixon, W. ).-134
Bross, I. D. 1.-246, 257, 285, 298 Doolittle, M. H.-403, 406, 418
Brown, B.-503 Draper, N. R.---418
Brunson, A. M.-402. 418 Duncan, D. B.-274, 298, 446
Burnett. L. C.-242, 256 Duncan, O. 0.-418
Burroughs, W.-96, 118 Dwyer, P. S.-418
Butler, R. A.-338 Dyke, O. V.-SOI, 503
577
578 Author Index
Eden. T.- -198 Hopkins, C £.-418
Edwards. -A. W. F.---256 Hotelling, H.-399, 414, 417, 418
Ehrenkrantz. F.-J34 Hsu, P.-227
Eid, M. 1.-418 Hurwitz. W. N.-S34, 539
Evvard,J, M.---175.19~
emmer, F. R.-529, 539
Federer, W. T.--338. 492. 503 Ipsen. J.-246, 257
Felsenstein.1.-256 Irwin, J. 0.-256
Finney, O. J.-~227, 446 Irwin, M. R.·--233, 256
Fisher, C. H.-198 I waskiewicz, K. -119
Fisher. R. A.-60. 65. 9U. 108-109, \ 15.117,
118. 119, 113, 114. 163. i 71. 184 185, 187. James. G. S.-119
198.217,221.'2.27.232. :>46.250.257, )59, Jessen, R J.,-250, 257, 503
265,212,298.311-312,337,339,380,)99,
414,418,419,446,463,471. 549, 557, 561 Kahn, H. A.-·-503
Fitzpatrick. T. B.-539 Keeping. E. S.-·3l
Forster, H. C.--337 Kempthorne, 0.--316, 337, 380, 418, 479,
Francis, T. J.-226 503
Freeman, F. N.-295, 298 Kendall, M. G.-134, 194, 195, 198
Freeman. H.:-31 Kerrich. J. E.--226. 221
Frobisher. M.---226 Keuls, M.-273, 274, 298, 421, 442
Kimball, A. W.-257
Galton. F.---1,64. 171, 177-178,198 King, A. J.-539
Ganguli. M.-291, 298 Kish, J. F.-471
Gart, 1.-497, 503 Kish, L-539
Gates, C. E.-291, 298 Klotz, J.- 1'34
Gauss, C. F.-147, 390:467. 469 Kolodziejczyk, St.-I 19
Geary. R. C.---':S8, 90 Kurtz, T. E,-215.29&
~~-
Goodman, L. A.-S02, 503
Gosset. W. S.-60 Latscha, R.-227
Gowen, J. W.-453, 503 Lee, A.-17J, 172, 115, 196, 197
Gower, J. C.-291, 298 Leggatt, C W.-·227, 233, 256
Gram, M. R.-'--41S Lehmann, E. L-134
Graybill, F. A.---65, 134,418 Leslie, P. H.--251. 257
Grout, R, A.-ISS, 198. 41 If'-. Leverton, R.--418
Grove, L. C.-96, 118 Lewontin, R. C.-256
Li, H. W.-337
Haber, E. S.-198, 311, 379 380 Lindstrom, E. W.~90. 198,228.231,256
Haenszel. W.-256,. 257 Link, B. F.-275, 298
Haldane, J. B. S.-241, 256 Lipton, S.-3.80
Hale, R. W.-138 Liu, T. N.-337
Hall, P. R.-481, 503 Lord, E.-120-12.1 128. JJ4, 553-554
Hamaker, H. C.--41S Lowe, 8.-258, 298
Hansberry,1. R.-152, 111,219,226,268, LUsh. J. L.-186, 198
298
Hansen, M. H.--534, 539 MacArthur, J. W.-27. 31
Harris, J. A.-296, 298 McCarty. 'D. E.-5J9
Harrison, C. M.-J34 McPeak, M.-539
Hartley. H. 0.-90, 227, 280, 298, 471 Madow. W. G.-539
Hasel, A. A. 539 Magistad. O. M. -65
Healy, M.-338 Mahaianobis, P. C.-·-90, 414, 418
Hess. 1.--539 Mann, H. 8.--130,134,555
Hiorns. R. W.---470, 471 Mantel. N.-256, 257
Hodges. J. L.-134 Martin, W. P.----416. 4lS
Hoe), P. G.-)) Maxwell, A.. E.---4VA
Holmes. M. C.-247 May, J. M. --568
Holzinger. K. 1.-295, 298 Meier, P.-438, 446
579
Meng. C J.-337 Scattergood. L. W.-539
Merrington. M.-S49. 567 ScheJfe, H.-271, 298, 338
Metzger. W. H.--~t11 Schlottfeldt. C. 5.-338
Mitchell, H. H.-96, 118 Serfling. R. E.-539
Mitscberhch. A. E.-447. 471 Sheppard, W. F.--83. 90
Molina, E. C.-227 Shenru>n, I. L.-539
Monselise. S. P.-337 Shine, C.-29I. m
Mood. A. M.45. 134.418 Silver. G. A.~S03
Moore. P. G.-122, 134 Sinha. P.-38O
Moriguti, S.-284. 298 Siorum. M. J.-539
Mosteller, F.-226. 328. 338 Smimov. N. Y.-90
Mumford, A. A.-198 Smith, A. H.-liS, 459, 471
Murphy. D. P.-218. 226 Smith, C. A. 8.---4IS
Smith, C. E.-256
Newman. 0.-273.274.298.427.442 Smith. G. M.·-412, 446
Newman, H. H.-29S, 298 Smith. H.---4IS
Neyman. J.-·27. 31. 113, 119 Smith. H. F.---42S, 446
Smith, S. N.-I04, liS
Ostle. 8.·-549 Snedecor, G. W.-31, 152. 171, 198, 240,
Outhwaite, A. D.-338 256,265.298,379,380,503
Snedecor. J. G.-28
Park. O. W. -118 Snell, M. G.-198
Pascal, 8.--204.206,207 Spearman, c.":" 194, 198
Patterson, H. D.-468; 471. ~I. SOl Sprague; G. F.-446
Payne, S. L.-539 Stephan, F. F.-539
Pearl. R.-----449. 471 Stevens, W. L.---468. 470, 471. 492,503
Pearson. E. 5.,-65. 90. 227. 280. 298.471. Stewart, R. T.-171
552 Strand, N. V.-25O. 257, 503
Pearson, K..--20. 21. 21. 31, 88, 124, 164. Stuart, A.-I34, 198. S39
171. 172, 175. 196. 197.246.257 Swanson. P. P.-1I8, 171,418,446,459,471
Peanan. P. B.-liS
Peebles. A. R. -65 Talley. P. J.45
Penquiac. R.--47l Tam,R.K.45
Pesek. 1.·--418 Tang, P. C.-280, 298
Pillai. K. C 5.-120.134 Theriault, E.l.---471
Pitman. E. J. G.c.:.196. 197. 198 T~om." G. B., lr.-226
Poisson, S. 0.-223, 226 Thompson. C. M,-551, 567
Porter. R. H.-337. 380 Tippett,·L. H. C.45
Price. W. C.---453 Trickett. W. H.-119
Tukey. J. W.-246, 257, 275. 298. 322. Hi-
Rao, C. R..-235, 256, 418 334,337,338
~,I.F.-539
Reed, L. J ..--449, 471 Va>ey, A. ].-331
Richardson. C. H.-152, 111,218-219,226. Vos, B. J. --451
268,298
Richardson, R.-298 Wald. A.-29O. 298
Riedel. D. C.-539 Walker. C. B.-226
Ri8ney, J. A.-539 Walker, R. H.-118
Roberts, H.-I34. 418 Wallace. D. L.-275. 298
Roien, M.-253 Walscr. M.·--446
Rourke, R. E. K.-226 Weich. 8. L.-1I5, 119
Rurtherford. A.--338 Wentz. J. 1.-171
W.,t. Q. M.-·5IS. 539
Soli'bury, G. W.-2IlO Westmacou. M.-338
Sampford, M. R.-539 White, C.-I 30, 131, 134.556
Satterthwaite. F .. E.-338, 380 Whitney, D. R.-13O, 134. 555
Saunden, A. R.-38O Wtebe, G. A.-#, 90
580 Aufitor ......x
Wik:o.on. F.-128-13O. 134.555.556 Yat... F.-119. 247. 257. 265. 298. 337. 338.
Wilko M. 8.-479. S03 342.380.446.471.488. SOl. S03. 539
Williams. C. 8.-330. 338 Youden, W.J.-~4.9S.118
Williams. E. J.-399. 418 Young, M.-198
Williams. R. E. 0.-247 Youtz. C.-3Z8. 338
Willier. J. G.-402. 418 Yule. O. U.-I89. 198
Wilsie. C. P.-380
Winsor. C. P.-298. 330. 338 Zarcovich, S. S.-S39
Woolsey. T. 0.-539 Zelen. M.-492. S03
Wri&ht. E. B.-131. 134 Zoellner. J. A.-418
Wri&ht. S.-418 Zweifel. J. R.-497. S03
Index to numencal examples
analyzed In text

(The index. is arranged by the statistical technique involved. The type of data being
analyzed is described in parentheses.)

Additivity. Tukey's test


Latin squares (responses of monk.eys to stimuli), 335
two-way classification (numbers of insects caught in ligbt trap), 333
Analysis of covarianc~
in one-way classification, computations
one X-variable (leprosy patients, SCores for numbers of bacilli), 422
two X-variables (rate of gain. age, 8lld/weight of pigs). 440
in two-way classification. computations
one X -,'ariable {menta\ activit)' 'SCOtts of students), 426
(yields and plant numbers of corn), 428
two X-variables (yields. heights, and plant numbers of wheat), 444 ____ ---
interpretation of adjustments (per capita incomes and expenditures pCir-pupil in schools),
431
Asymptotic regression, fitting (temperatures in refrigerated hold), 469

Binomia.l distribuiion
fitting to data (random digits). 20S
see a/so Proportions, analysis of
Bivariate normal distribution, illustration (heights and lengths of forearm of men), ',77

Cluster sampling, estimation of proportions (numbers of diseased plants), 514


Components of variance
nested ciassification, estimation of components
equal sizes (calcium contents of turnip greens), 286
unequal sizes (wheat yields of farms), 292
one~way classification, estimation of components
equal siles (calcium contents oftumip greens), 281
unequal sizes (percents of conceptions to inseminations in cows), 290
Correlation
comparison and combination of ,'s and r~z transformation (initial weights and gai,!s in
weight of steers), 187
computation and test of r (heights of btotbers and sisters), 172
intracJass. computations (numbers of ridges on fing.!fs of twins), 295
partial. computations (age, blood pressure, and cholesterol level of women), 401
rank correlation coefficient, computations (rated condition of rats), J94

Discriminant functidn, computations (prtsence or absence: of Alotobacter in soils), 416


$11
582 ""'..x 10 Numerical hample. Analyzed itt T"xl
Exponential growth curve, filting (weights and ages of ctw:kcns), 450

F~tori8.1 experiments. analysis


2x 2, i.nteraction absent (riboflavin conctntration of collard leaves), 343
2 )( 2. interaction present (gains in weight of pig~), 346
J x 2. (gains in weight of rats), 347
2 x .2 x 2 and 2 )( 3 x 4 (gains in weight of pigs). 359, 362

Kurtosis, lest of (numbers of inhabitants of U.S. cities). 87

Latin square
analysis (yields of miUet for different spacings), ~ 13
missing value, e:'limation (milk yields of ~ows). 272
Least s'lgniftcant ~)f'feTen,e lLSD) \~oughnu\'S.). 2n

Moan
computation from frequency distribution (weights of swine), 82
estimation and confidence interval (vitamin C content of tomato juice), 39
Median, er.limation and. confuknce interval (days from calving to oestrus in cows), t23
Missing values, estimation and analysis
latin square (milk yields of cows), 319
two-way classi~ation (yields of wheat), 318

Nested (split-plol) design. a'nalysis (yields of alfalfa), 371


Nested da£sifications. analysis for mixed etfo;ts model (gains in weight of pigs). 289.- St'e
also Components 0( variance.
Newman-Kculs test (grams of fat absorbed by doughnuts), 213_
Normal distribution
confidence interval for mean In- unknown) (vitamin C content o(tomato juice), 39
tests of skewness and kurtosis (numbers of inhabitants of U.S. cities). 85--87

One-way c1assi~tion. frequencies


examination of \fariation he:~ween and within classes {T\umbers ("I( inse(:t lar1Jae OR ~al1-
bages\.234 --....
1.tSt of equality of frequendes (random digits), 232
lest of estimated frequencies (numbers of weed seeds in meadow grass), 237
test of specified frequencies (Mendelian) (color of crosses of maile), 228
One-way classification, measurements. Set' also Components of variance.
analysis of variance
more than two cla.sses (grams of fat absorbl.:d by doughnuts), 259
samples of unequal sizes. (sut'1Jival times of mic~ with typhoid), 278
two classes (comb weights. of chickens), :!67
standard error of comparison among class lT1ean.~ lyieh.h of sugar), 269
Ordered classifications, analysis by assigned scor~s ihealth stalus and degree of infiltration of
leprosy palienls), 245
Orthogonal polynomials, tilting (weights of chl.:k embryos), 461

Paired samples . ..:omparis.on of means


meas_urements (iesions 'On tobacco leaves.l, 9S
proportions (diphtheria bacilli on throatS of p<.Ilicnts), ~13
Partitioning of T realmenls sums of squares
(area/weight ratio of leaves of citrus trelS), 309
by orthogonal polynomials (yields df sugar). 350
in factorial experiment (gain!. in weight ofraI5), _~49
hoybtan ~s, failures to g,erminate). )0%
Perennial experiment.. analysis (weights of ilsparagus~, 3-78
583
Poisson distribution
fitting (weed seeds in meadow grass). 224
homogeneity tests (deaths of chinch bugs under exposure to cold), 242
test of goodness of fit (weed seeds), 237
variance test (random digits), 232
Proportions. analysis of
confidence interval (fields sprayed for corn borer). 5
in one-way classification. see Two-way classification. frequencies
in two-way classification
2 x 2 table (percent survival of plum root-stocks). 495
2 x 3 table (percent of children and parents with emotional problems), 497
R x C table (in logs) (death rates afmeo by age and numbers of cigarettes smoked), 498

Range
analog of I-test (numbers of worms in rats). 121
estimation of (J' from (vitamin C content of tomato juice), 39
Ranks
signed ranI. le,,1 (Wilcoxon) (lengths of corn seedlings). 129
two-sampk sum of rallks test (Mann-Whitney)
~qual sizes (numhcrs of borer eggs on corn planls). 13()
unequal sizes (survival iimes of cats and rabbits). 131
Rank correlation coelllcient (rated condition of rats), 194
Ka!ios. estimation (Silt!S and corn acres in farms), 168
Regression
comparison of "between classes" and "within classes" regressions (S\:ores for bacilli in
lepros) p .. lients). 437
comparison of regression in two samplc-s (age and cholesterol'concentration of women),
4JJ
filled to treatmenl ml',WS (yields of mille!). .114
lilting of Imear
(age and blood pn:ssurc of women). 136
(percent worm~ fruits and size of crop of apple tr«s), 150
fitling of quadratic (protdn content and yield of wheal), 454
multiple. tilting for ~ and 3 X-v,.malcs (phosphorus conrents o{soils). 384,405
lest for Imear tn:nd In Pfl)POrtl(ll1~ !leprosy patients), ::47
test of intercept (~pced ;Ind draft of pluu,!!.ps I. 167
tc-;t or linearil) (suT\,j\al time of cats with ()u.rbain), ~~/:(
Rejcdi(lIl of oh~en-alions. app!k:allon of ruk ()icldS'of wheat). 318
Rc~pon~e curves. two-factor c:>.perimenl~ /)icld" of cowpea hay). 352
Re"ponst wrface. fitting (a~corbic at:ld lOntcnt of snaphcans). 354

Sampk "it.: c~til11dtion fyidd" of wheal). 417


III IW{l'~la,g~ .'>amplln,g (percenl of su,gar in sugar-becls). 517
Scrie. of experimenb. analy"l~ (numhers of soybean planh). 37i
Seh of .2 )( ::. table". anal)sis (prohlcm children in "chQol and prC\iou~ I.nfant losses llf
mothers). 253
Sign test (ranking of beef patlics). 126
Skewne_~_~, test of (m•. nbers of inhabilanh in U.S. cillt!1}. 85
Split-plol ex.periment. analysis (yields or alfalfal. 311
Standard dc\ lation. computation (vitamin C ,,:onten! of l()mato juice~. 39
from frequenc) distribution {weights of ~wincl. K:!
Slratilied random sampling
opllmum alllXalion (numbers ot's(uden[s in colleges). 524
standard error of mean <wheat yields), 522
with attributc,- and proportions (numbers ofvegetablc gardens). 5Z7
584 .....x 10 Numerical Example, Analyzed in rext
Student's Hcst
in independent samples
equal sites (comb weights of chid:.ens). 103
unequal sizts (gains in weight offals}. 105
in paired samples <numbers of lesions on tobacco [eave!>./. 95
Studentized Range test (dou~nurs), 27J

Transformations
arcsin (angular) (percent unsalable ears of corn). 328
logarithm (numbers of plankton caught by nets). 329
square roots (numbers of poppies in oats). 326
Two·way classification. frequencies
heterogeneity Xl. test of Mendelian ratios (numbers of yellow seedlings of corn), 248
test (or a linear trend in proportions (leprosy patients), 247
2)( 2 table (mortality of pipe smokers and non~smokers). 216
2 )( C table (health status and degree- of infiltration of leprosy patients), 239
R x C table (tenure status and sOH type of farms), 250
Two~way classification, measurements
unequal n\.1mbers per sub-class
analysis by proportional numbers (dressing percents of pigs), 480
approximate analysis by equal numbers and by equal numbers within rows (survival
times of mice with typhoid), 476, 478
aproKimate analysis by proportional numbers(tenure status and soil class offamls), 482
(artificial data to illus1. 'ate complexities). 471
least sqiT,ues analysis. 'x 2 table (comb weights of chickens). 483
least sqlJares analysis;.K x 2 or 2 x C table (gains in weights of rats), 4&4
least sqllares analysis. R x C table (mice), 489
usual analysis. standard errors of comparisons, and partitioning of Treatments sum of
squares (failures to germinate of soybean seeds), 300. 301, 308

Variance
Bartlett'S tesl of equality (birth weigtus of pigs), 297
confidence interval (vitamin C). 75
test of equality of 2 varia noes
tndeQendent samples (concentration of syru? by bees). I! 7
paired S#lmples (heights and,_)eg lengths of boys), 197
SUbject index

Abbreviated Doolittle method, 403 model J. fixed eft'ects. 275


Absolute value, 44 model n. random effects, 279-285, 289-
Addition rule 291
in chi-square, 73 samples of unequal sizes, 277-:-278
in Poisson distributi· n, 225 in two-way cJassifications, 299-J(}7
in probability, 200 latin squares, 312-316
sums of squares and degrees of freedom. objectives, 259-260
307-310 .'.' _ .~-~ /..:1 \: , partitioning (subdividing) sums of
Additivity squares, 308-310, 348---349
in factorial experiments, 345 perennial experiments, 377-379
in Latin square. 313 series of experiments, 315--377
in twei-way classification. 302 split-plot (nested.) experiments, 369--375
test of, 331-337 A.ngular transformation, 327
Adjusted mean, 421. 429 Arcsin transformation. 327
Allowances, 5% risk. 275 Area sampling, SIt
Analysis of covariance. 419 Arra),.4O
computations, 421-425 AsymPtotic regress~on. 448
efficjency. 423-424 method ofbttibg, 467-471
in one-way classification, 421--425 Attribute. 9
in fwo-way classification. 425--428
interpretation of adjusted means. 429-432
model,419 ftalancing in experimental design. 4.18
multiple l\artlett's test of homogeneity of variances.
in one-way classification, 438--443 2%-298
in two-way classification, 443-446 Ikhrens-Fisher test, 115
test of adjusted means, 424-425 ftias. 506
test of linearity of regression, 460 precautions against. 109-11.
uses, 419---421 unbiased estimate. 45-46
Analysis of variance, 163 1\imodaJ distribution. 124
effects of non-conformity to modeJ, 32J 8inomial distribution. 17. 3(}.202
336 campa "ton of two proportions. 213-223
non-additivity, 330-331 £onfidence intervals. 5-7, 210-211
non-independence in errors, 323 fitung to data. 205--207
non-normality. 325 formula for frequency. 17. 202-20S
unequal error variances. 324 mean, 207-209
factorial experiments. 339--369 normal a:ppro~mation. 209-213
in linear regression, I60-J63. 314-316 standard deviation, 207-209
in one-way classifications. 258--268 table of confidence intervals, 6-7
effects of errors in u5umptions, 276- test ofa binomial propOrtion. 211-213
J77 1rs1l>f~ I>f ~~ ~:D'

585
5'6 Sal>jecl ",..
variance tcst of homogeneity. 240-242 Compound interest law. 447
Bivariate normal distribution. 177-179 Confidence intervals. 5-7. 14-15. 29
Blocks. 299 for an individual Y.givenX.155-157
efficiency of blocking. 31 J for binomial proportion. 210-211
for components of variance. 284--285
Cas<stwly, 152 for correlation coefficient. ISS-- 1SS
CentralliPlil theorem. 51. 209 for partial regression coefficients. 391
Chi-square (X'), 20-26, 30, 212 for population mean (a known). 56
correction for cot1.1inuily. 125. 209-210 for population mean (0" unknown). 61. In
distribution of, 22-26. 73 for population median. 124--125
in goodness of fil tests. 236-238 for population regression line. 153" 155
in R x· (contingency tables. 250-253 for population variance. 74--76
in tests of Mendelian ratios, 228--231, for ratio of two variances. 197
248-2$0 for slope in regression. 153
in 2 )( C contingency tables. 238-240 one-sided, or one-tailed. 57
in 2 x 2 contingency tables, 215-220 table for binomial distribution. 6 7
in variance test for binomial. 240-243 upper and lower. 58
in variance test for Poisson. 231-233 Confidence lim..its. 5-7. S(>t> a/so Confidence
normal approximation to. 233 intervals,
relation to distribution of sample vanance upper and lower. 58
s2,73-74 Contingency table
table of, 5~551 R x C, 2~252
test of binomial proportion. 20-22. 213- 2 x C. 23&-243
214 2 x 2,215-223
Class sets of 2 x 2 tables. 253-256
interval. 23 Continuity correction. 125. 209-210. 230-·
mark. 67. 73. 82 231
Cluster Sampling. 51) Continuous distribution. 23
formulaS in simple cluster sampling. SI3- Correction
515 for continuity. 125.209-210
Coding. 81 for finite size of population, S13
Coefficient of variation. 62 for ·mean. 261-262
Common elements, 18 t for working mean. 41-48
Comparison Sheppard's. 83
among more than two means. 268--275 Correlation
definition. 269 and common elements. 181-183
of all pairs of means. 211-275 ca1culation in large sample, )90-193
of mean scores. 244-245 coefficient. 172
of observed and expected frequeocies combination of separate estimates, 187
more than two classes. :!28-238 comparison of several coefficients. 186
two classes, 20-27 confidence interval for. 185
of two means in independent samples. tabies. 557-559
IOO-Wl, 114-116 tests of signi&;ance. 184-188
of two means in paired samples. 93-95, intracIass. 294
97-99 multiple. 402
of two proportions in independent sam- nonsense. 189
pIes. 215-223 partial,400-401
of two proportions in paited samples. rank. 19:3-195
213-215 relation to bivariate nonnal distribution.
orthogonal. 309 177-179
rule for standard error. 269.301-302 relation to regression. 175-177
Components of variance. 280 role in selection. 189
in factorial experiments. 364-369 role in stream extension. 189
in three-stage sampling. 285-288. 291-294 utility of. 188-190
in two-stage sa~pling. 280-285. 2R9-291. Covariance. 181. St>e also Analysis of co-
529-5l3 ,variance.
confidetlce lim!ts, 284-2&5 Curve fitting. 447-471
58T
Degrees of freedom. 4S analysis of covariance. 423-424. 427
for chi-square Latin squares. 316
in contingency tables. 217. 239. 251 randomized blocks. 311
in goodness of fit tests. 237 range, 46
in tests of homogeneity of variance. 297 rank tests. 132
in analysis of variance sign test, 127
Latin square. 314 Equally likely outcomes, 199
one-way classificalion. 261 Error
two-way classification, 301. 307 of first kind (Type I), 27. 31
in correlation, 184 o( measurement
in regression. 138, 145, 162-163.385 effect on estimates in regrel>sion. 164-
Deletion of a variable. 412 166
Dependent variable in regression. 135 of second kind. (Type II). 27. 31
Design of investigations regression. 421
comparison of paired and independenl standard (See Standard error.)
samples, 106-109 Estimate or estimator
efficiency of blocking, 311-312 interval. 5. 29
factorial experiments, 339-364 point. 5. 29
independent samples, 91. 100--106. 114-- unbiased. 45. 506
116.258-275 Expected numbers. :W. 216. 228··240
Lalin M,Juares. 312-317 minimum size for / h..'Sts. 215. 241
Missing data. 317-321 Experiment. St'!' Design of investigation~.
paired samples. 91-1)9 Experimental sampling. used to illustrate
perennial crops. 377- 379 binomial confidence limits. 14
randomized blocks or groups. 299 -31 0 binomial frequency distribution. 16
role of randomization. 109--111 central limit theorem. 51 ;5
sample size, 111-114.221-223 chi-square (I dj:) for binomial. 22-26
sample surveys. S04 confideDl,:C interval for population mean
series of experiments. 375 -377 p.78-79
two-stage (spiit-plot or. nested) designs, distribution of sample means from a nor·
369-375 mal distribution. 70- T2
use of covariance. 419- 432 distribution of iample stand.lrd deviation
use of regression. 135 s.72-73
Deviations distribution of sam~ variance .\.1. 7:?:-B
from sample mean. 42 F..distribution. 266
VigilS r-distributlon. 77-78
random. 12 Exponential
table of. 543-546 decay curve, 447
Discrete dislribution. 16 growth curve. 447, 449.:4.53
Discriminant function. 414 Extrapolation. 144. 4S6
cumputations. 416-4 J8
relation to m'ultiple regression. 416 F...distribution. 117
u!.eS.414 effect of correlated errors. 323
Distance between populations. 415 eRect of heter~eneous errors. 324
Distribution. See also the specifw distribu- effect of non~normality. 325
tion. one-tailed tables, 560-567
binomial. 17 two-tailed table. 117
bivariate normal. 177 Factor. 339
chi-square. 73 Factorial experiment. 339
F fV;1riance ratio). 117 an.liysis of:P factorial
mu1tinornial. 235 interaction absent. 342-344
normal. 32 interaction present. 344-346
Poisson. 223 analysis of 2J factorial. 359-361
Student's '-. 59 analysis of general three·factor experi·
Dummy 'Iariabie. 416 ment.361-364
analysis of generallwo~factorexpcriment,
Effidency of J46 349
588 Subject Index
compared with single·factor experiment, in analysis of variance, 323
339-342 in binomial distribution, 201
fitting of response curves to treatments, in plobabiJity, 201
349-354 with attributes, 219
fitting of response surface, 354--358 Independent samples
Finite population correction. 513 comparison of two means, 100-105, 114-
First-order reaction curve, 448. See also 116
Asymptotic regression. comparison of two proportions, 215-223
Fixed effects model Independent variable in regression, .135
in factorial experiments, 3M-369 Inferences about population, 3-9, 29, S04-
in one-way classification. 275 505. See also Confidence intervals.
Fourfold'{2 x 2) ta~le, 215 Interaction, 341
Freedom, degrees of. See Degrees of free- possible reasons for. 346
dom. three-factor, 359-364
Frequency in contingency ta bles. 496
class, 23 two-factor. 341-349. 473
cumulative. 26 Interpolation in tableS. 541
distribution. 16, 30 Interval estinate, 5, 29. See also Confidence
continuous, 23 interval.
discrete. 16 Inlraclass correlation, 294-296
number of classes needed. 80--81 Inverse matrix, 389, 403. 409-412
expected, 20 Kenda]J's t, 194
observed. 20 Kurtosis, 86
effect on variance of .f2. 89
91 and gl tests for non-normality. 86-87 test for, 86--88
Genetic ratios table. 552
tests of. 228-231, 248--249
Geometric mean. 330 Latin square. 312
Goodness of fit test, 1 2 , 84. See also Chi- efficiency, 316
square. model and analysis of va'riance, 312-315
Graphical representation. 16. 40 rejection of observations. 321-323
Grouping test of additivity, 334--337
loss of accuracy due to. 81 Least Significant difference, 272
Growth curve Least squares. method of, 147
exponential. 449 as applied to regression. 147
logistic, 448-449 Gauss theorem. 147
~n two-way tables with unequal numbers.
Harmonic mean. 475 483-493 '. 'A'
Heterogeneity Level of significance. 27
chi-square, 248 Limits, confidence. Se~ Confidence intervals.
of variances. 296. 324 likelihood. maximum. 495
Hi.erarchal classifications, 285--289 Linear calibration. 159-160
Histogram, 25 '" Linear regression. See Regression
Homogeneity, test of Listing, 509-511
in binomial proportions, 240 Logarithm
in Poiss(,m counts, 231 common and natural. 451-452
in regres'sion coefficients, 432 Logarithmic
of between- and within-class regressions, graph paper. 450. 45:!
436 transformation. 329 .no
Hotelling's ~-test, 414, 417 Logistic growth law. 448--449
H ypolheses about populations, 20. See logit transformation. 494. 497~503
Tests of significance. Lognormal distribution,- 276
null. 26, 30
tests of Main effect. 340-342
Main.plot, 369
Independence Mann-Whitney test. 130
assumption of significance levels. I J I. 555-556
589
Mantel-Haenszel test, 255-256 Model. See Mathematical model.
Mathematical model for Model I, fixed effects. See Fixed effects
analysis of covariance, 419 model.
exponential growth curve. 449 Model II, random effects. See Random
factorial experiment, 357, 364--369 effects model.
Latin square, '313 Moment about mean, 86
logistic growth curve. 448-449 Monte Carlo method, 13
multiple regression, 382. 394 Multinomial distribution, 235
nested (split-plot) designs, 370 Multiple comparisons, 271-275.
one-way classification Multiple covariance. See Analysis of co-
fixed effects. 275 variance.
mixed effects, 2S8 Multiple regression. See Regression.
random effects, 279. 289 Multiplication rule of probability, 201
Qrthogonal polynomials. 460--465 Multivariate t-test. 414, 417
regression. 141 Mutually exclusive outcomes, 200
asymptotic, 468
non-linear. 465 Nested.
two-way classific-ation. 302-308. 473 classifications, 285-289. 291-294
Matrix. 390 designs, 369
inverse. 390, 409, 439, 490 Newman·Keuls test, 273-275
Maximin method. 246 Non-additivity
Maximum likelihood. 495 effects of in analysis of variance, 330-331
Mean removal by transformation. 329, 331
amo\utt tat'f\a\\()\\, 4d. \t'!.\'!. fN
adjusted. 421. 429 in Latin square. 334-337
arithmetic, 39 in two-way classification. 331-334
correction for. 261-262 N on-parametric methods
distribution of, 51 Mann-Whitney test. 130
geometric. 330 median and percentiles, 123-125
harmonic, 475 rank correlation. 193-195
weighted. 186,438.521 sign test, 127
Mean square, 44 Wilcoxon signed rank test. 128
expected value Normal distribution, 32
in factorial experiments, 364-369 formula for ordinate. 34
with proportional suh-class numbers. mean, 32
481-482 method of fitting to observed data. 70--72
Mean square error reasons for use of. 35
in sampling finite populations, 506 relation to binomial, 32, 209-213
Measurement data, 29 standard deviation, 32
Median 123 table of cumulative distribution. 548
calculation from large sample, 123 table of ordinates. 547
confidence interval. 124-125 tests of nonnality, 86-88
distribution of sample median. 124 Normal equations. 383
Mendelian inheritance in multiple regression, 383. 389. 403
heterogeneity X2 text. 248-249 in two-way classifications. 488--491
test of specified frequencies, 228-231 Normality. test of, 84-88
Missing data Null hypothesis, 26, 30
in Latin square, 319-320
in one-way dassification. 317 One-tailed tests. 76-77, 98-99
in two-way classification, 317-321 One-way classification, frequencies
M itscher1ich's law. 447. See a/:"o Asymptotic expectations equal, 231-235, 242-243
regression. expectations estimated. 236-237
Mixed effects model, expectations known. 228-231
in factorial experiments. 364-369 expectations small. 235
in nested classifications. 288-289 One-way classification, measurements
Mode. 124 analysis of variance. 238-248
590 SubiecllnJex
comparisons among means, 268-275 256
effects of errors in assumptions. 216-277 estimates of variance. 101-IOJ
model L fixed effects. 275 of classes for X2 tests. 235
model II. random effects. 279-285. 289- regression coefficients, 438
291 PopUlation. 4. 29. 504-505
rejection of observations. 321-323 finite, 504-,505. 512 513
samples of une4ual sizes. 277-278 sampled. J 5, 30
Optimum allocation target. 30
in stratified sampling. 523-526 Power function. 280
in three-stage sampling. 533 Primary sampling units. 528
in two-stage sampling. 531- 533 Probabilit)
Ordered ti:lassifications simple rules. 199-202.219
methods of analysis. 243-246 Probability sampling. 508--509
Order statistics. 123 Proportional sub-<:iass numbers. method of.
Orthogonal comparisons. 309 478-4KJ
in analysis of factorial ex.periments. 346-~ Proportions. analysis of
-361 in one-wa} cI.Jssificl:ltions. 240--243
Orthogonal polynomials. 349- -'51. 460-464 test for a Imear trend. 246---24M
tables of coefficients (value!.), 351. 572 in tWO-Win t:lassifications. 4Y3
Outliers (suspiciously large deviations) in angular (arcsin) scale, 496
in analysis of variance. 321 in log.it sl.:al~. 497 503
in regression. 157 in original (p) scale. 495- 497
in setS of ~ x .2 tables. 153- 156

Paired !.amples. 91 Random digits (numbers). 12--13,30


comparison of means, 93--95. 97 -99 table. 543 546
comparison of proportions. 213-215 Random eft't!cls modcl
,,"aDditions suitable for pairing. 97 in factorial experiments. 364- 369
self-pairing. 91 in one-way classification. 279-294
versus independent samples. IU6-IOS Randomization. 110
Paraholic regression. 453--456 as precaution against bia~. 109 j II
Param.:ter.32 Randomization test (Fisher's). 133
Partial Randomized bloch. 299. Sl't' also Two-
correlation. 40() way classifications.
coefficient. 400 efficiency of blocking. 311
regression coefflcieni. 382 Random sampling. 10-11. 30
interpretation of. 393-397 stratified. II
standard. 39M with replacement. II
Pascal's triangle. 204 without replacement. II. 505
Percentages. analysis of. Set!' Proportions. Range. 39
analysis of. efficiency relatl\'e to standard deviation,
Percentiles. 125 46
estimation b) order statistics. 125 relation 10 standard deviation, 40
Perennial experiments. 377-379 StudentlJ_ed Range test. 272-273
Placebo. 425 I-test based on. 120
Planned comparisons. 268-270 table~, 55.'-,554
Point estimate. 5. 29 use in comparison of means. 275
Poisson distribution. 223-226 Rank correlation. 193-195
. fitting to data. 224--225 Ranks. 11K
formula for. 223 efficienc), relative to normal tests. 131
test of goodness of fit. 236-237 rank sum test. ]3()..-132
variance test of homogeneity. 232 -236 signed rank test. 128-130
Polynomial regression or response curve. Ratio
349-354 estimah:'s in sample surveys. 536-537
Pooling (combining) estimation of. 170
correlation coeflkients. 187 standard error of, 141. 515. 537
es~im<lted ditferences in 2 )( 2 tables. 254-- Rect .. n~uJar t uniform) distribution ~ I
591
Rel.:tifkatioJl.449 in analysis of vi.lriance. 3:!1-·323
Regression. 135 Relati\'C amount of information. 311
analysis of variance for. 160-163 Relatiw t!fficiency. 46
coefficient (s!ope). 136 of range. 46
interval estimate of. 153 Relative ratc of increase. 450
\'atuc in some simple cases. 147- L48 ReplicalloR\o,. 299
comparison of "between classes" and Residuals, 300-- 30 I. 305 307
"within classes" regression~. 436-·438 Response curve
comparison of regression lines. 432--436 polynomial. 349-351
confidence interval for slope. 153 Response surface, 346
deviations from. 138 example of fitting, 354-358
effects of errors in X. 164-166 Ridits,246
estimated regression line. 144-· 145 Rounding errors. 81
estimated rcsiduul variaru.·c. 145-146 effect on accuracy of X and s. 81
estimates in sample surveys. 5.H··538
equation. 136 Sample. 4, 29
historical origin orthe tcm1. 164 cluster. 511. 513-515
in one-way classification of frequencies. non-random, 509
234 probability, 508-509
line throuth ori~in, 166-169 random, to--l L 30, 505, 5 t t
linear regression of proportions, 246-248 stratified random. 507. 520--527
mathematical modd. 141-144 systematic. 519
multiple, 381 Sample mean. X. 39
computations in fitting, 383--393. 40}·· calculation from a frequency distribution.
412 ~0-83
deletion of an independent variable. 412 frequency dislrihution of. 51
dc\ iations mean square. 3~5-389 Sample standard deviation .\", 44
~ffec!s of omitted variables. 394-397 Sampling fraction, 512
importance of different X-variables, unequal, 507
398 400 Sampling unit. 509
interpn:talion of coefficients. 393-397 Scales with limited values. 132
partial regression coefficient. 382 Schetfe's test. 271
prediction of individual observation. Scores
392 assigned to ordered classific(:llions. 244
prediction of population line, 392 246
purposes. 381 Selection of candidates, 189
selection of variates for prediction. 412- Selection of variates-for prediction;!12-4f4
414 Self-pairiRg,. 91, 97 ," .
standard error of a deviation. 392 Self-weighting estimate. 521 -
stand<..lrd error... of regression cO(:ffi- Semi-logarithmic graph paper, 450
cient~. 391 Series of experiments. 375-377
testing a dt!\iation. 392-393 Sets of 2 x 2 tables. 253-256
tests of regrcs!.ion coefficients. 38tt---3MM Sheppard's correction. 83
nun-linl.!<..Ir in ~ome parameter!.. 465-471 Sign"iest.125-127
general method of fitting. 465-467 efficiency of, 127
parabolic. 45.3-456 table of significance levels. 554
prediction of i.ndi.vidual observation. '55- Signed rank test. \ 28
157 significance levels. 129. 555
prediction of the population line. 153 Significance
prediction of X from Y.IS9-160 level,27
relation to correlation, 175-177. 188-190 tests of (See Tests of significance.)
shortcut computation. 139 Simple random sampling. 505-S07
situation when X varies from sample to of cluster units. 513-515
sample. 149-150 properties of estimates. 511 ·515
testing a deviation, 157 ~158 size of sotmple. 516- 518
tests for linearity, 453-459 Size of !!ample
Rejection of observations for comparing two proponiofls, 221-222
592 5<I"i." W."
for estimating population mean, 5R equal within rows, 477
for tests of signi(icance when comparing proportional,418
means, 111_114 unequal, 472
in sampling finite: populations, 516-518 Sub·plots, 369
in two-stage (nested) sampling. 281 Sub-sampling. See Two·stage sampling.
within strata, 523-526 Sum
Skewness, n of products, 136
test of, 86 correction for means, 141
table, 252 of squares, 44-45
Smoothing, 447 correction for mean. 48-49
Spearman's rank correlation coefficient, 194 Systematic sampling, 519
Split-plot (nested) Jesign, 369
analysis of variance, 370-373 t (Student's t.distribution). 59
comparison with randomized blocks. 373 tablc, 549
reasons for use, j69-370 Tests of significance, 26--30
Square roots goodness of fit test,"f, 84-85
method of finding, 541 in analysis of covariance, 42J- 425
table, 573-575 in R x C contingency tables. 250--252
Square root transf(1nnation, 325-327 in 2 x C contingency tables. 238-243.
Standard deviation 246-249
of estimatl!s from data binomial proportion. 26-28, 211-213
adjust~ difference. 423 aU differences among means. 271·-275
difference, 100, 104, 106,115, 190 correlation coefficient. 184-188
01 for skewnesS, 86 difference between means of independent
91 for kurtosis. 87 samples, 100-105, 114-116
mean ofrandorn sample, 50, 512 difference between means of paired sam"
median, 124 pies, 93-95, 97-99
popUlation tottl.l, 51, 513 difference between two binomial propor'
regression coefficient, 138, 391 tions, 213-221
sample total, 51 equaHty of two correlated varianl'es, 195r
sum, 190 197
transformed correlation. 185 equality of two variances, 116
variance, 89 goodness of fit of distributions. 236-237
of population homogeneity of Poisson samples, 232-236
binornla),207-D homogeneity 01 varlances. 296-2913
normal, 32 linear trend in proportions. 246-148
Poisson. 225 linearity of regression. 453-460
Standard en . . .r. SO. See also Standard devia· mUltiple correlation coefficient. 402
tion, rank correlation coefficient, 194-
Standard normal deviatc, 36 single ctassifi.~ion with estimated fre-
Standard normal vilriate. 36 quencies, 136-238
Standard partial regression coefficient, 398 single classification with equal frequen-
Step up and step down metlwds, 413 cles,231-234
Stratified random sampling, I I, 507, 520 single. cia$Si.fk:ation with specified fre~
for attributes. 526---527 quencies,228-231
optimum allocation, 523-526 HeSt based on' range, 120
proportional allocation, 521-523 test of skewness, 86
reasons for use, 520 tes'5 of kurtosis, 86--88
Stream extension, 189 Three-stage sampling, 285-288, 533
Structural regression coefficient. 165 allocation of sample sizes, 533
Studentized Range test, 272-273 TfaOlforrnation, 277
shortcut computation using ranges, 275 logarithmic, 329-330
table. 568 logit, 494, 497-503
Student's r-distribution, 59 to remove non-additivity, 331-332
table, 549 to stabilize vaOanct. 325
Sub-class numbers angular (arcsin), 327-329
equal. 475 square root. 325-327
use in fitting non·linear relations, 448-453 Unbiased estimate. 45
Treatments, definition, 91 Uniform distribution. 51
Treatment combination, 340 . in relation to roundins errors, 81
Tukey's tests for additivity, 331-337 Unweighted means, method of. 475--471
Two-stage sampling. 528
reasons for use, 528
Variance. 53
with primary units of equal size. 529-533 analysis (See Analysis of variance.)
choice of sample and sub-sample sizes. comparison of two correlated variances,
531-533 195-197
with primary units of unequal sizes. 534,- comparison of two variances, 116
536 components (Set! Components of vari·
Two-way classifications. frequencies. 238-- anee.)
243 confidence interval for, 74
R x C tables. 250--253 ofdiffe,...." 100, 104, 106, 115, 190
sets of 2 x 2 tables. 253--257 of sum, 190
2 x C tables. 238-243. 246--250 ratio. F. 265
2 x 2 (fourfold) tables. 215---223 distribution under general hypothesis.
Two-way classifications. measurements 280'
additivity assumption, 302. 330-334 table. 560--567
analysis of variance. 299- 30 I test of homogeneity, 296--298
mathematical model. 302-307 Variation. coefficient of. 62
rejection of observations. 321-323
test of additivity, 331-334
with unequal numbers. 472 Weighted mean
comphcations involved. 412--415 i.n stratified, <;amplmg. 52 \
equal weights within rows. 477--418 of differences in proportions. 255
ieast squares analysis. R x C table. 488- of ratios, 170
493 of regression coefficients using estimated
method of proportional numberi. 478-- weights, 438
483 of transformed correlations. 187
R x 2 table. 484-487 Welch-Aspin test. 115
2 x 2 table. 483-484 Wilcoxon signed rank test. 128
unweighted analysis, .75--477
Two-way classifications, proportions Z or z. standard normal variate. 51
analysis in logit scale, 497-50) z-transformalion of a correlation co-
analysis in proportions scale, 495--497 efficient. 1.85
approaches to analysis. 493--495 tables, 55&-559

CAL I

You might also like