Foundation Studies General Mathematics B: Hypothesis Testing Notes
Hypothesis Testing
1 Hypotheses
(iv) In hypothesis testing, the hypothesis is not accepted or rejected with
absolute certainty, but with a definite level of confidence that the error
in the decision is small.
Hypotheses:
H0 : µ = µ 0
H1 : µ ≠ µ 0
H0 : µ = µ 0
H1 : µ > µ 0 , or ,
In this case:
H0 : µ = µ 0
H1 : µ < µ 0
(xi) Errors
(xiv) Summary
decision      H0 true          H0 false
reject H0     type I error     no error made
accept H0     no error made    type II error
table 1
Example 1
Example 2
(a) A type I error occurs when it is concluded that the
drug works when in fact it doesn't.
2 z-test Statistic
We will deal with the case of a single sample being chosen from
a population and the question of whether that particular sample
might be consistent with the rest of the population. Exactly
which test statistic is appropriate depends on the information
available. However, it is very important that the correct one is
used since the use of an incorrect test statistic can lead to an
incorrect conclusion.
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
The expression $\sigma / \sqrt{n}$ is referred to as the standard error of the
mean.
(i) The critical value is the value of the test statistic that marks the
boundary of significance. By significant we mean a value that leads
to the rejection of the null hypothesis.
There are four cases:
case 1: two-tailed test with α = 0.05
case 2: two-tailed test with α = 0.01
case 3: one-tailed test with α = 0.05
case 4: one-tailed test with α = 0.01
H0 : µ = µ 0
H1: µ ≠ µ 0 α = 0.05
figure 1
H0 : µ = µ 0
H1: µ ≠ µ 0 α = 0.01
figure 2 (normal curve on the z scale; critical values at -2.58 and 2.58)
(v) Case 3: One-tailed Test with α = 0.05
(a) H0 : µ = µ 0 (b) H0 : µ = µ 0
H1: µ < µ 0 or H1: µ > µ 0
with α = 0.05
The critical values are zc = - 1.645 for (a) or zc = 1.645 for (b) .
These values are obtained by considering the z-score when 95% of
the region under a normal curve is acceptable, (figure 3):
figure 3 (normal curves on the z scale: (a) critical value at -1.645;
(b) acceptance region (0.95) to the left of 1.645 and region of
rejection (0.05) to its right)
(a) H0 : µ = µ 0 (b) H0 : µ = µ 0
H1: µ < µ 0 or H1: µ > µ 0
with α = 0.01
The critical values are zc = -2.33 for (a) or zc = 2.33 for (b) .
These values are obtained by considering the z-score when 99% of
the region under a normal curve is acceptable, (figure 4):
figure 4 (normal curves on the z scale: (a) region of rejection (0.01)
to the left of zc = -2.33; (b) region of rejection (0.01) to the right
of zc = 2.33)
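The critical values quoted for the four cases can be checked numerically. The sketch below is only an illustration, assuming Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and does not come from these notes. The notes round 2.576 to 2.58 and 2.326 to 2.33.

```python
from statistics import NormalDist


def z_critical(alpha, two_tailed):
    """Return the positive critical z value for significance level alpha.

    For a two-tailed test alpha is split equally between both tails;
    for a one-tailed test the whole of alpha sits in one tail.
    """
    tail_area = alpha / 2 if two_tailed else alpha
    return NormalDist().inv_cdf(1 - tail_area)


cases = [
    ("case 1: two-tailed, alpha = 0.05", 0.05, True),
    ("case 2: two-tailed, alpha = 0.01", 0.01, True),
    ("case 3: one-tailed, alpha = 0.05", 0.05, False),
    ("case 4: one-tailed, alpha = 0.01", 0.01, False),
]

for label, alpha, two_tailed in cases:
    print(label, "->", round(z_critical(alpha, two_tailed), 3))
```

For a left-tailed test the critical value is the negative of the value returned.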
Example 3
(a) α = 0.05
(b) α = 0.01
Solution
(a) H 0 : µ = 150
H1 : µ ≠ 150 α = 0.05
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} $$
which gives:
$$ z = \frac{152.7 - 150}{12 / \sqrt{100}} = 2.25 $$
(b) H 0 : µ = 150
H1 : µ ≠ 150 α = 0.01
Since 2.25 lies between -2.58 and +2.58 (case 2), which is the region
of acceptance, H0 is not rejected. The sample does not provide enough
evidence to conclude that the population mean differs from 150 . The
difference between 152.7 and 150 can be attributed to the variation
due to sampling (chance). Based on the sample data we therefore do not
reject the null hypothesis, and we continue to work on the assumption
that it is true.
We did not reject the null hypothesis that the population mean
efficiency rating is 150 , based on sample evidence. However, we did
not prove beyond doubt that H0 is true. The only way to prove beyond
doubt that it is 150 is to check every efficiency rating in the
population - that is, to take a 100 percent sample, which is really a
census.
It should be noted that if the z-test statistic for our example had
produced a value less than -2.58 or greater than +2.58 (the critical
values), then the null hypothesis would have been rejected in favour of
the alternative hypothesis. Also remember that when the level of
significance changes, the outcome of the test can change too: 2.25 lies
outside ±1.96 (α = 0.05) but inside ±2.58 (α = 0.01).
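To see how the same test statistic can lead to different decisions at different significance levels, here is a minimal sketch using the figures from Example 3 (sample mean 152.7, µ0 = 150, σ = 12, n = 100). The function name `two_tailed_decision` is a hypothetical helper, and the critical values 1.96 and 2.58 are the rounded table values used in these notes.

```python
from math import sqrt


def two_tailed_decision(z, z_critical):
    """Reject H0 when z falls outside the interval (-z_critical, +z_critical)."""
    return "reject H0" if abs(z) > z_critical else "do not reject H0"


# Example 3: test statistic for a single sample with known sigma
z = (152.7 - 150) / (12 / sqrt(100))

decision_05 = two_tailed_decision(z, 1.96)  # alpha = 0.05 (case 1)
decision_01 = two_tailed_decision(z, 2.58)  # alpha = 0.01 (case 2)

print("z =", round(z, 2))
print("alpha = 0.05:", decision_05)
print("alpha = 0.01:", decision_01)
```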
Example 4
The Myer Department Store issues its own credit card (Myercard).
The finance manager of credit services wants to find out if the mean
monthly unpaid balance is still at $1000 as it was six months ago.
A random check of 172 unpaid balances revealed the sample mean to
be $1017.50 and the standard deviation of the sample $95 . Should
this finance manager conclude that the mean unpaid balance on
Myercards is greater than $1000 , or is it reasonable to assume that the
difference of $17.50 ($1017.50 - $1000 = $17.50) is due to
coincidence (or chance)?
Test the hypothesis that the mean unpaid balance is not different from
the usual amount at:
(a) α = 0.05
(b) α = 0.01
Solution
(a) H 0 : µ = $1000
H1 : µ > $1000 α = 0.05
x = 1017.5
s = 95 (Note, this is the sample standard deviation)
n = 172
µ 0 = 1000
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
which gives:
$$ z = \frac{\$1017.50 - \$1000}{\$95 / \sqrt{172}} = \frac{\$17.50}{\$7.2437} = 2.416 $$
(figure: normal curve on the z scale; region of rejection (0.05) to the
right of the critical value 1.645; test statistic at 2.42)
As the test statistic (z) of 2.42 lies in the region of rejection for
the null hypothesis (i.e. it is greater than the critical value (zc)
of 1.645 ), the null hypothesis (H0) is rejected and the
alternative hypothesis (H1) is accepted.
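The calculation in part (a) can be reproduced with a few lines of code. This is a minimal sketch, not part of the original notes: `one_sample_z` is a hypothetical helper name, and the critical value 1.645 is the one-tailed value for α = 0.05 quoted earlier.

```python
from math import sqrt


def one_sample_z(sample_mean, mu0, sd, n):
    """z = (sample mean - hypothesised mean) / (sd / sqrt(n))."""
    standard_error = sd / sqrt(n)
    return (sample_mean - mu0) / standard_error


# Example 4: mean unpaid balance, with s used as the standard deviation
z = one_sample_z(sample_mean=1017.50, mu0=1000, sd=95, n=172)

# Right-tailed test at alpha = 0.05: reject H0 when z exceeds 1.645
z_critical = 1.645
decision = "reject H0" if z > z_critical else "do not reject H0"

print("z =", round(z, 3))
print(decision)
```

Changing `z_critical` to 2.33 gives the corresponding check for part (b) at α = 0.01.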
(b) H 0 : µ = $1000
H1 : µ > $1000 α = 0.01
Example 5
Solution
H 0 : µ = 500
H1 : µ < 500
x = 497 µ 0 = 500
s = 20 n = 100
z-test statistic
$$ z = \frac{497 - 500}{20 / \sqrt{100}} = -1.5 $$
(figure: normal curve on the z scale; region of rejection (0.05) to the
left of zc = -1.645; test statistic at -1.5)
Example 6
Solution
H 0 : µ = 21.6
H1 : µ ≠ 21.6
x = 24.1
σ = 7.2
n = 25
µ 0 = 21.6
∴ z-test statistic
$$ z = \frac{\bar{x} - \mu_0}{\sigma / \sqrt{n}} = \frac{24.1 - 21.6}{7.2 / \sqrt{25}} = 1.74 $$
Since z = 1.74 lies within the region -1.96 < z < 1.96 (case 1),
H0 is accepted.
Example 7
Solution
H 0 : µ = 12.00
H1 : µ > 12.00
x = 13.30
s = 2.5
n = 30
µ 0 = 12
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{13.30 - 12.00}{2.50 / \sqrt{30}} = 2.85 $$
Exercise 1(a)
4 t-test Statistic
$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
It is important to remember that the critical value for a given level of
significance is greater for small samples than for larger samples. This is
because there is more variability in sample means computed from small
samples; we therefore have less confidence in the resulting estimates and
are less likely to reject the null hypothesis.
$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
(iii) Unlike the z-test statistic, the t-test statistic has associated with it a
quantity called degrees of freedom. In this case the degrees of freedom
are denoted by the Greek letter v and are defined by v = n -1.
v=n-1
(v) To find tc look down the left-hand side of the row with the appropriate
degrees of freedom, and across the top for the appropriate test (either
one-tailed or two-tailed) and the significance level used.
Example 8
Solution
H 0 : µ = 70
H1 : µ < 70
x = 66 n = 22 v = 21
s = 10 µ 0 = 70 α = 0.01
t-test statistic
$$ t = \frac{66 - 70}{10 / \sqrt{22}} = -1.876 $$
tc -critical value
One-tailed test
α = 0.01
v = 21 (degrees of freedom)
(figure: t distribution on the t scale; region of rejection (0.01) to
the left of tc = -2.52; test statistic at -1.876)
As the t-test statistic lies in the region of acceptance, we accept the null
hypothesis. Based on the sample results, there is not enough evidence that
the cost cutting measures have reduced the mean cost per claim to less
than $70 .
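The t-test statistic and its degrees of freedom for this example can be sketched in code as follows. The helper name `one_sample_t` is hypothetical; the critical value -2.52 is simply the table value quoted above, since the Python standard library has no t distribution to compute it from.

```python
from math import sqrt


def one_sample_t(sample_mean, mu0, s, n):
    """Return the t statistic and the degrees of freedom v = n - 1."""
    t = (sample_mean - mu0) / (s / sqrt(n))
    return t, n - 1


# Example 8: mean cost per claim
t, v = one_sample_t(sample_mean=66, mu0=70, s=10, n=22)

# Left-tailed test: reject H0 only when t falls below the critical value
t_critical = -2.52  # table value for a one-tailed test, alpha = 0.01, v = 21
decision = "reject H0" if t < t_critical else "do not reject H0"

print("t =", round(t, 3), "with v =", v)
print(decision)
```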
Example 9
49 50 51 46 48 45 52 47 48
Solution
H 0 : µ = 50
H1 : µ ≠ 50
x = 48.4 n=9 v =8
s = 2.298 µ 0 = 50 α = 0.05
t-test statistic
$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} = \frac{48.44 - 50}{2.298 / \sqrt{9}} = -2.04 $$
tc - critical value
α = 0.05
v=8 (degrees of freedom)
two-tailed test
tc = 2.306
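When the raw observations are available, the sample mean, sample standard deviation and t statistic can be computed directly. The sketch below assumes Python's standard-library `statistics` module and uses the nine observations listed for Example 9; the variable names are illustrative only. Small differences from a hand calculation come from rounding the mean and standard deviation before dividing.

```python
from math import sqrt
from statistics import mean, stdev

data = [49, 50, 51, 46, 48, 45, 52, 47, 48]
mu0 = 50

n = len(data)
x_bar = mean(data)   # sample mean
s = stdev(data)      # sample standard deviation (divisor n - 1)
v = n - 1            # degrees of freedom

t = (x_bar - mu0) / (s / sqrt(n))

# Two-tailed test: reject H0 when |t| exceeds the table value
t_critical = 2.306   # table value for alpha = 0.05, v = 8
decision = "reject H0" if abs(t) > t_critical else "do not reject H0"

print("mean =", round(x_bar, 2), " s =", round(s, 3), " t =", round(t, 2))
print(decision)
```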
Example 10
Solution
H 0 : µ = 105
H1 : µ ≠ 105
x = 108.6 n = 20 µ 0 = 105
s = 6.3 v = 19 α = 0.05
t-test statistic
$$ t = \frac{108.6 - 105}{6.3 / \sqrt{20}} = 2.556 $$
tc - critical value
As the t-test statistic lies in the region of rejection, we reject the null
hypothesis.
The components produced on the production line are of a different length
to normal.
(c) Use a decision rule (at the chosen level of significance) to test
the value of the test statistic.
(d) Compare the calculated z or t value with the critical z or t
value and decide whether to accept or reject the null hypothesis.
Exercise 1(b)
(ii) The symbols used to describe aspects of each sample are shown in the
table below (table 2). Note that the two samples are drawn independently
from the population:
                      sample 1    sample 2
size                  n1          n2
mean                  x̄1          x̄2
standard deviation    s1          s2
table 2
(iii) We wish to examine the difference between the means of the two
samples:
$$ \bar{x}_d = \bar{x}_1 - \bar{x}_2 $$
Generally speaking, when two sample means are different, we have two
hypotheses to explore. First, there is the null hypothesis that the two
populations from which the two samples originate have the same mean
( µ1 = µ2 ) . If this is the case, then the observed difference between the
two sample means is not significant and is attributed to chance or
random sampling fluctuations. The alternative hypothesis to be explored
is that the two samples are drawn from populations which have different
means. If this hypothesis is true, the observed difference between the
two sample means is deemed significant.
When two sample means are different, how can we decide whether or not
the difference between the two means is significant? The standard
procedure is to test the validity of the null hypothesis, which states that
µ1 = µ 2 , utilizing the information from the two samples. On the basis of
the evidence produced by the two samples, we will either accept or reject
the null hypothesis. If the null hypothesis is rejected, the observed
difference between the two sample means is significant. However, the
observed difference is not significant whenever the null hypothesis is
accepted.
Symbolically we write:
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
(iv) We will consider the situation when the sample size is large ( n ≥ 25 ) .
This requires the z-test statistic.
When two samples are large ( n1 , n2 ≥ 25 ) and the population standard
deviation, σ , is known, the standard error σd (where d indicates
"difference") of x̄d = x̄1 − x̄2 is given by the expression:
$$ \sigma_d = \sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
Note: the standard error of the mean for a single sample is given by:
$$ \frac{\sigma}{\sqrt{n}} $$
When two samples are large, ( n1 , n2 ≥ 25) and the population standard
deviation, σ , is not known, the standard error, sd , of
xd = x1 − x2 is given by the expression:
$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} $$
(vii) The z-statistic used for one sample hypothesis testing was given by:
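$$ z = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
For two samples we replace:
$\bar{x}$ by $\bar{x}_d = \bar{x}_1 - \bar{x}_2$
$s / \sqrt{n}$ by $s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$
$\mu_0$ by $\mu_d = \mu_1 - \mu_2$
which gives:
$$ z = \frac{\bar{x}_d - \mu_d}{s_d} $$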
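The two-sample z statistic for large samples with unknown σ can be sketched as below. The function name `two_sample_z` is a hypothetical helper, not something defined in these notes; the figures in the usage lines are those of Example 11 (n1 = n2 = 100, means 47 and 48, standard deviations 4 and 3).

```python
from math import sqrt


def two_sample_z(mean1, s1, n1, mean2, s2, n2, mu_d=0.0):
    """z = (x_d - mu_d) / s_d for two large independent samples.

    s_d = sqrt(s1^2 / n1 + s2^2 / n2) is the standard error of the
    difference between the sample means; mu_d is the hypothesised
    difference between the population means (0 under H0: mu1 = mu2).
    """
    x_d = mean1 - mean2
    s_d = sqrt(s1**2 / n1 + s2**2 / n2)
    return (x_d - mu_d) / s_d


z = two_sample_z(mean1=47, s1=4, n1=100, mean2=48, s2=3, n2=100)
print("z =", round(z, 3))
```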
Example 11
Solution
H0 : µ1 = µ2
H1 : µ1 ≠ µ2
n1 = 100 n2 = 100
x1 = 47 x2 = 48
s1 = 4 s2 = 3
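$\bar{x}_d = \bar{x}_1 - \bar{x}_2 = 47 - 48 = -1$ and $\mu_d = 0$ (i.e. H0 : µ1 = µ2 )
$$ s_d = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}} = \sqrt{\frac{4^2}{100} + \frac{3^2}{100}} = 0.5 $$
Now
$$ z = \frac{\bar{x}_d - \mu_d}{s_d} $$
∴ z-test statistic:
$$ z = \frac{-1 - 0}{0.5} = -2 $$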
That is, the difference between the means of the two samples is not
significant at the α = 0.01 level.
Example 12
Solution
H 0 : μ1 = μ2
H1 : µ1 ≠ µ 2
$\bar{x}_d = \bar{x}_1 - \bar{x}_2 = 82.5 - 77 = 5.5$ and $\mu_d = 0$
$$ z = \frac{\bar{x}_d - \mu_d}{s_d} $$
∴ z-test statistic:
$$ z = \frac{5.5 - 0}{1.763} = 3.12 $$
Now zc = 2.575
∴ we reject the null hypothesis. There is a significant difference at
α = 0.01 .
Example 13
standard deviation of 2.1 hours. At the .05 level of significance, does
the second drug provide a significantly shorter period of relief?
Solution
H 0 : µ1 = µ 2
H1 : µ1 > µ 2
sd = 0.302
Now the z-test statistic is:
$$ z = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{0.6 - 0}{0.302} = 1.98 $$
Exercise 2(a)
6 t-test Statistic – two samples
When two samples are small ( n1 , n2 < 25 ), the standard error,
sd , of x̄d = x̄1 − x̄2 is given by the expression:
$$ s_d = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
with degrees of freedom:
v = n1 + n2 − 2
(iii) The t-statistic used for one sample hypothesis testing was given by:
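$$ t = \frac{\bar{x} - \mu_0}{s / \sqrt{n}} $$
For two samples we replace:
$\bar{x}$ by $\bar{x}_d = \bar{x}_1 - \bar{x}_2$
$s / \sqrt{n}$ by $s_d = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$
$\mu_0$ by $\mu_d = \mu_1 - \mu_2$
which gives:
$$ t = \frac{\bar{x}_d - \mu_d}{s_d} $$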
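The pooled standard error and the two-sample t statistic can be sketched in code as follows. The helper names `pooled_standard_error` and `two_sample_t` are hypothetical. The usage lines take their figures from Examples 15 and 16 later in this section; the sample sizes 12 and 10 for Example 15 are read from its worked calculation.

```python
from math import sqrt


def pooled_standard_error(n1, s1, n2, s2):
    """Standard error of the difference in means for two small samples."""
    pooled_variance = ((n1 - 1) * s1**2 + (n2 - 1) * s2**2) / (n1 + n2 - 2)
    return sqrt(pooled_variance) * sqrt(1 / n1 + 1 / n2)


def two_sample_t(mean1, s1, n1, mean2, s2, n2, mu_d=0.0):
    """Return the t statistic and the degrees of freedom v = n1 + n2 - 2."""
    s_d = pooled_standard_error(n1, s1, n2, s2)
    t = (mean1 - mean2 - mu_d) / s_d
    return t, n1 + n2 - 2


# Example 15: means 74 and 70, standard deviations 8 and 10
sd15 = pooled_standard_error(12, 8, 10, 10)
t15, v15 = two_sample_t(74, 8, 12, 70, 10, 10)

# Example 16: means 27.2 and 32.1, standard deviations 3.8 and 4.3
t16, v16 = two_sample_t(27.2, 3.8, 12, 32.1, 4.3, 9)

print("Example 15: s_d =", round(sd15, 2), " t =", round(t15, 3), " v =", v15)
print("Example 16: t =", round(t16, 2), " v =", v16)
```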
Example 14
Solution
H 0 : µ1 = µ 2 (two-tailed test)
H1 : µ1 ≠ µ2
$$ s_d = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
$\bar{x}_d = 1000 - 900 = 100$
t-test statistic:
$$ t = \frac{\bar{x}_d - \mu_d}{s_d} $$
gives:
$$ t = \frac{100 - 0}{58.79} = 1.70 $$
Example 15
Anglo-American Mexican-American
x1 = 74 x2 = 70
s1 = 8 s2 = 10
Is the difference between the mean of the two groups significant at the
0.05 level?
Solution
H 0 : µ1 = µ2
H1 : µ1 ≠ µ 2
$$ s_d = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
$$ s_d = \sqrt{\frac{(12 - 1)(8)^2 + (10 - 1)(10)^2}{12 + 10 - 2}} \sqrt{\frac{1}{12} + \frac{1}{10}} = 3.83 $$
$\bar{x}_d = 74 - 70 = 4$
t-test statistic
$$ t = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{4}{3.83} = 1.043 $$
The difference between the means is not significant at the 0.05 level.
Example 16
Solution
H 0 : µ1 = µ2 (one tailed)
H1 : µ1 < µ 2 α = 0.01
n1 = 12 n2 = 9
x1 = 27.2 x2 = 32.1
s1 = 3.8 s2 = 4.3
xd = x1 − x2 = −4.9
$$ s_d = \sqrt{\frac{(n_1 - 1) s_1^2 + (n_2 - 1) s_2^2}{n_1 + n_2 - 2}} \sqrt{\frac{1}{n_1} + \frac{1}{n_2}} $$
sd = 1.77 , µd = 0
t-test statistic
$$ t = \frac{\bar{x}_d - \mu_d}{s_d} = \frac{-4.9 - 0}{1.77} = -2.77 $$
t-critical value
ν = 12 + 9 − 2 = 19 d.f.
tc = -2.539
∴ reject H0 .
Exercise 2(b)
7 Hypothesis Testing of Proportions
In each case we have dealt with large samples (z statistic) and small
samples (t statistic).
We will use this normal approximation to the binomial when dealing with
the hypothesis testing of proportions.
$$ \mu_p = p $$
$$ \sigma_p = \sqrt{\frac{pq}{n}} $$
Note: q=1-p
n = number of independent binomial trials
(v) Hypotheses
When dealing with testing a single proportion, the null hypothesis is that
the expected proportion equals the population proportion.
H0 : µp = p alternatively,
H1 : µp ≠ p
$$ z = \frac{\bar{p} - p}{\sigma_p} $$
p̄ = sample proportion
p = population proportion
σp = standard error
Example 17
Solution
H 0 : p = 0.8 alternatively,
H1 : p ≠ 0.8 at α = 0.05
We are to test the expected proportion of the sample against the actual
sample proportion.
The standard error:
$$ \sigma_p = \sqrt{\frac{0.8 \times 0.2}{150}} = 0.0327 $$
The z-test statistic:
$$ z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.7 - 0.8}{0.0327} = -3.058 $$
We reject the null hypothesis at α = 0.05 .
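The single-proportion test can be sketched as below. The function name `one_proportion_z` is a hypothetical helper. Note that the standard error is computed from the hypothesised population proportion, not from the sample proportion. Without intermediate rounding the statistic for Example 17 is about -3.06 rather than -3.058.

```python
from math import sqrt


def one_proportion_z(sample_p, p0, n):
    """z = (sample proportion - p0) / sqrt(p0 * q0 / n), with q0 = 1 - p0."""
    standard_error = sqrt(p0 * (1 - p0) / n)
    return (sample_p - p0) / standard_error


# Example 17: sample proportion 0.7, hypothesised p = 0.8, n = 150
z17 = one_proportion_z(sample_p=0.7, p0=0.8, n=150)

# Two-tailed test at alpha = 0.05: reject H0 when |z| exceeds 1.96
decision17 = "reject H0" if abs(z17) > 1.96 else "do not reject H0"

print("z =", round(z17, 2))
print(decision17)
```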
Example 18
Test the null hypothesis that 60% are complying with pollution
standards at the 1% level of significance.
Solution
H 0 : p = 0.6
H1 : p < 0.6
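$\bar{p} = \frac{33}{60} = 0.55$ , $\bar{q} = \frac{27}{60} = 0.45$ , n = 60
standard error:
$$ \sigma_p = \sqrt{\frac{pq}{n}} = \sqrt{\frac{0.6 \times 0.4}{60}} = 0.0632 $$
z-test statistic:
$$ z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.55 - 0.6}{0.0632} = -0.791 $$
(figure: normal curve on the z scale; region of rejection to the left
of the critical value zc = -2.33; z-test statistic z = -0.791 in the
region of acceptance)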
We accept the null hypothesis. Even though the actual sample proportion
is below the expected proportion, it is not significantly below this
figure at the 1% level of significance.
Example 19
The sponsor of a weekly television show would like the studio audience
to consist of an equal number of men and women. Out of 400 persons
attending the show on a given night, 220 are men. Using a level of
significance of 0.01 , can the sponsor conclude that the desired sex
composition of the audience is not properly maintained?
Solution
H 0 : p = 0.5
H1 : p ≠ 0.5
$\bar{p} = \frac{220}{400} = 0.55$ , n = 400 , $\bar{q} = 0.45$
standard error:
$$ \sigma_p = \sqrt{\frac{(0.5)(0.5)}{400}} = 0.025 $$
z-test statistic:
$$ z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.55 - 0.5}{0.025} = 2 $$
critical value: zc
Example 20
The Department of Health, Education and Welfare reports that only 10%
of all persons over 65 years old are covered by adequate private health
insurance. What would the Australian Medical Association (AMA)
conclude about the Department’s claim if, out of a random sample of 900
elderly persons, 99 possessed adequate private health insurance? Use a
level of significance of .05 .
Solution
H 0 : p = 0.1
H1 : p > 0.1
$\bar{p} = \frac{99}{900} = 0.11$ , n = 900 , $\bar{q} = 0.89$
standard error:
$$ \sigma_p = \sqrt{\frac{(0.1)(0.9)}{900}} = 0.01 $$
z-test statistic:
$$ z = \frac{\bar{p} - p}{\sigma_p} = \frac{0.11 - 0.1}{0.01} = 1 $$
Since z is 1.0 , which is less than 1.64 , the null hypothesis cannot be
rejected using the .05 level of significance. In other words, the AMA
does not have enough evidence to reject the claim made by the
Department of Health, Education, and Welfare.
Exercise 3(a)
8 Hypothesis Testing Between the Proportions
(i) In this section we will discuss the difference between the proportions of
two samples.
The mean or expected proportion for each respective sample equals their
population proportions.
μ p1 = p1
μ p 2 = p2
(iv) Hypotheses
H0 : p1 = p2
H1 : p1 ≠ p2 (two-tailed) , or
H1 : p1 > p2 or p1 < p2 (one-tailed)
pd = p1 − p2
Standard Error
The standard error (standard deviation) of the difference between the two
proportions p1 and p2 is given by:
$$ \sigma_d = \sigma_{p_1 - p_2} $$
$$ \sigma_d = \sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}} $$
$$ \hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$
(viii) The standard error of the difference between the two proportions using
the overall proportion, σˆ d , is given by:
$$ \hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} $$
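$$ z = \frac{\bar{p} - p}{\sigma_p} $$
For two samples we replace:
$\bar{p}$ by $\hat{p}_d$ , the difference between the two sample proportions
$\sigma_p$ by $\hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}}$
$p$ by $p_d = p_1 - p_2$ , the hypothesised difference between the population proportions
which gives:
$$ z = \frac{\hat{p}_d - p_d}{\hat{\sigma}_d} $$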
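The steps above (overall proportion, standard error, z statistic) can be sketched in code. The function name `two_proportion_z` is a hypothetical helper; the counts in the usage lines are those of Example 21 (71 out of 100 and 58 out of 90). Because the code does not round intermediate values, it gives z of about 0.966 where a hand calculation with rounded figures gives 0.973.

```python
from math import sqrt


def two_proportion_z(successes1, n1, successes2, n2, p_d=0.0):
    """Return (z, overall proportion, standard error) for two samples.

    The overall (pooled) proportion is (n1*p1 + n2*p2) / (n1 + n2) and the
    standard error is sqrt(p*q/n1 + p*q/n2) with q = 1 - p.
    """
    p1 = successes1 / n1
    p2 = successes2 / n2
    p_hat = (n1 * p1 + n2 * p2) / (n1 + n2)
    q_hat = 1 - p_hat
    se = sqrt(p_hat * q_hat / n1 + p_hat * q_hat / n2)
    z = ((p1 - p2) - p_d) / se
    return z, p_hat, se


z, p_hat, se = two_proportion_z(71, 100, 58, 90)
print("overall proportion =", round(p_hat, 4))
print("standard error =", round(se, 4))
print("z =", round(z, 3))
```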
Example 21
Solution
Group 1                          Group 2
p1 = 71/100 = 0.71               p2 = 58/90 = 0.644
q1 = 29/100 = 0.29               q2 = 32/90 = 0.356
n1 = 100                         n2 = 90
H0 : p1 = p2 with,
H1 : p1 ≠ p2 at α = 0.05
Two-tailed test
$$ \hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$
p̂ = 0.6789 , q̂ = 0.3211
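(b) Standard Error
$$ \hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} = \sqrt{\frac{(0.6789)(0.3211)}{100} + \frac{(0.6789)(0.3211)}{90}} = 0.0678 $$
$$ z = \frac{\hat{p}_d - p_d}{\hat{\sigma}_d} $$
with $\hat{p}_d = 0.71 - 0.644 = 0.066$ and $p_d = 0$ :
$$ z = \frac{0.066 - 0}{0.0678} = 0.973 $$
z-statistic = 0.973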
The difference between the two sample proportions lies within the
acceptance limits. Thus, we accept the null hypothesis and conclude that
these two drugs produce effects on blood pressure that are not
significantly different (at α = 0.05 ).
Example 22
Solution
Area A                           Area B
p1 = 20/200 = 0.1                p2 = 18/150 = 0.12
q1 = 180/200 = 0.9               q2 = 132/150 = 0.88
n1 = 200                         n2 = 150
H0 : p1 = p2
H1 : p1 ≠ p2 (two tailed)
$$ \hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{38}{350} = 0.109 $$
∴ q̂ = 1 − 0.109 = 0.891
(b) Standard Error
$$ \hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} = \sqrt{\frac{(0.109)(0.891)}{200} + \frac{(0.109)(0.891)}{150}} = 0.0337 $$
$$ z = \frac{\hat{p}_d - p_d}{\hat{\sigma}_d} = \frac{0.1 - 0.12}{0.0337} = -0.59 $$
z-test statistic
z = -0.59
Example 23
Solution
H 0 : p1 = p2
H1 : p1 < p2 (one tailed test at α = 0.01)
p1 = 0.68 p2 = 0.76
q1 = 0.32 q2 = 0.24
n1 = 200 n2 = 250
$$ \hat{p} = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} $$
p̂ = 0.724
q̂ = 0.276
$$ \hat{\sigma}_d = \sqrt{\frac{\hat{p}\hat{q}}{n_1} + \frac{\hat{p}\hat{q}}{n_2}} $$
σ̂d = 0.0424
(c) Critical Value
$$ z = \frac{\hat{p}_d - p_d}{\hat{\sigma}_d} , \quad p_d = 0 $$
$$ \therefore z = \frac{-0.08 - 0}{0.0424} = -1.89 $$
Exercise 3(b)
9 Chi-Square Analysis
(i) We have investigated hypothesis tests from either one or two samples.
We used one-sample tests to determine whether a mean or a proportion
was significantly different from a hypothesized value. In the case of
two-sample tests, we examined the difference between the two means or
two proportions, to decide whether this difference was significant.
Suppose that in four regions, the National Health Care Company samples
its hospital employees’ attitudes toward job performance reviews.
Respondents are given a choice between the present method and a proposed
new method.
The table below (table 3), which illustrates the response to this
question from the sample polled, is called a contingency table. A table
such as this is made up of rows and columns; rows run horizontally,
columns vertically. Notice that the four columns in table 3 provide one
basis of classification (geographical regions) and that the two rows
classify the information another way (preference for review methods).
Table 3 is called a “2×4 contingency table” because it consists of two
rows and four columns. We describe the dimensions of a contingency table
by first stating the number of rows and then the number of columns. The
“total” column and the “total” row are not counted as part of the
dimensions.
                      region
method     Northeast   Southeast   Central   Westcoast   total
present    68          75          57        79          279
new        32          45          33        31          141
total      100         120         90        110         420
table 3
(iv) Hypotheses
The observed frequencies, f0 , are the actual values obtained, which are
recorded on the original contingency table.
$$ f_e = \frac{RT \times CT}{n} $$
where:
RT = row total
CT = column total
n = total number of observations
For example, the f e value for someone who prefers the present method
in the Northeast region is given by:
$$ f_e = \frac{100 \times 279}{420} = 66.43 $$
The table below (table 4) gives a summary of the observed and
expected frequencies from table 3.
new    f0    32       45       33       31
       fe    33.57    40.29    30.21    36.93
table 4
$$ \chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} $$
f0    fe    f0 − fe    (f0 − fe)²    (f0 − fe)² / fe
table 5
$$ \chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} = 2.764 $$
v = (r − 1)(c − 1) ,
where r is the number of rows in the problem, and c is the number of
columns in the problem.
Since our contingency table for this problem (table 3) has two rows
and four columns, the appropriate number of degrees of freedom is:
v = (2 − 1)(4 − 1) = 3
The chi-square tables reveal that the chi-square critical value, with
α = 0.05 and v = 3 degrees of freedom equals 7.81 .
Thus the acceptance region for the null hypothesis in the figure below
(figure 5) goes from the left tail of the curve to the chi-square
critical value of 7.81.
figure 5 (chi-square distribution for 3 degrees of freedom; acceptance
region to the left of the critical value 7.81; sample chi-square value
of 2.764 inside it)
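The whole chi-square calculation for table 3 can be sketched in a few lines of code. The function names below are hypothetical helpers, and the critical value 7.81 is the table value quoted above. Because the code does not round the expected frequencies, it gives a statistic of about 2.76, marginally below the hand-calculated 2.764.

```python
def expected_frequencies(observed):
    """fe = (row total x column total) / n for every cell of the table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[rt * ct / n for ct in col_totals] for rt in row_totals]


def chi_square(observed):
    """Sum of (f0 - fe)^2 / fe over all cells."""
    expected = expected_frequencies(observed)
    return sum(
        (fo - fe) ** 2 / fe
        for obs_row, exp_row in zip(observed, expected)
        for fo, fe in zip(obs_row, exp_row)
    )


def degrees_of_freedom(observed):
    """v = (r - 1)(c - 1); total rows and columns are not included."""
    return (len(observed) - 1) * (len(observed[0]) - 1)


# table 3: rows are present / new, columns are the four regions
table_3 = [
    [68, 75, 57, 79],
    [32, 45, 33, 31],
]

statistic = chi_square(table_3)
v = degrees_of_freedom(table_3)
critical_value = 7.81  # table value for alpha = 0.05, v = 3
decision = "reject H0" if statistic > critical_value else "do not reject H0"

print("chi-square =", round(statistic, 3), " v =", v)
print(decision)
```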
Example 24
Random samples of 160 , 240 , and 200 persons were selected from
Melbourne, Sydney and Brisbane respectively. The persons selected
were asked “What type of television program do you like best: drama,
western, documentary, or comedy?” The responses are summarized
below:
Solution
(a) Hypotheses
Using the formula:
$$ f_e = \frac{RT \times CT}{n} $$
$$ \chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} $$
f0     fe     f0 − fe    (f0 − fe)²    (f0 − fe)² / fe
60     64     -4         16            0.25
100    96     4          16            0.167
80     80     0          0             0
30     24     6          36            1.5
30     36     -6         36            1
30     30     0          0             0
30     32     -2         4             0.125
40     48     -8         64            1.333
50     40     10         100           2.5
40     40     0          0             0
70     60     10         100           1.667
40     50     -10        100           2
total                                  10.542
∴ χ² = 10.542
v = (r − 1)(c − 1)
= (4 − 1)(3 − 1)
= 6
χ² 0.05 = 12.6
(figure: chi-square distribution; acceptance region to the left of the
critical value 12.6, with 0.05 of the area to its right; χ² statistic
10.542 inside the acceptance region)
Example 25
Solution
(a) Hypotheses
The table below sets out the observed and expected frequencies
(in brackets):
year     A          B          total
8        22 (25)    18 (15)    40
9        26 (25)    14 (15)    40
10       27 (25)    13 (15)    40
total    75         45         120
f0     fe     f0 − fe    (f0 − fe)² / fe
22     25     -3         0.36
18     15     3          0.60
26     25     1          0.04
14     15     -1         0.07
27     25     2          0.16
13     15     -2         0.27
total                    1.50
$$ \chi^2 = \sum \frac{(f_0 - f_e)^2}{f_e} = 1.50 $$
v = (r − 1)(c − 1)
= (3 − 1)(2 − 1)
= 2
Example 26
For random samples of 200 people contacted in each of six states, the
number who favoured Australia becoming a republic is recorded in the
table below:
Test the hypothesis that people in the six states are equally in favour at
the 5% level of significance.
Solution
(a) Hypotheses
The table below sets out the observed and expected frequencies
(in brackets):
                                      State
preference   A           B           C           D           E           F           total
yes          132 (120)   108 (120)   128 (120)   104 (120)   128 (120)   120 (120)   720
no           68 (80)     92 (80)     72 (80)     96 (80)     72 (80)     80 (80)     480
total        200         200         200         200         200         200         1200
f0     fe     f0 − fe    (f0 − fe)² / fe
132    120    12         1.2
108    120    -12        1.2
128    120    8          0.533
104    120    -16        2.133
128    120    8          0.533
120    120    0          0
68     80     -12        1.8
92     80     12         1.8
72     80     -8         0.8
96     80     16         3.2
72     80     -8         0.8
80     80     0          0
total                    14
∴ χ² = 14
v = (r − 1)(c − 1)
v = (2 − 1)(6 − 1)
v = 5
Exercise 4