Foundation in Science: Application of Statistics
Tutorial 4 - Hypothesis Testing
2. The standard deviation of the tube life for a particular brand of ultraviolet tube is known to
be 500 hr, and the operating life of the tubes is normally distributed. The manufacturer
claims that average tube life is at least 9,000 hr. Test this claim at the 5 percent level of
significance by designating it as the null hypothesis and given that for a sample of n = 15
tubes the mean operating life was $\bar{x}$ = 8,800 hr. (Test statistic = -1.549, Do not reject
H0)
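The quoted answer to Question 2 can be checked numerically. The following is a minimal Python sketch, assuming a lower-tailed z-test (H0: μ ≥ 9,000 against H1: μ < 9,000) because σ is known; the variable names are illustrative and not part of the tutorial.

```python
from math import sqrt
from statistics import NormalDist

# Values quoted in Question 2
mu0 = 9000      # mean tube life claimed under H0 (hr)
sigma = 500     # known population standard deviation (hr)
n = 15          # sample size
xbar = 8800     # sample mean (hr)
alpha = 0.05    # significance level

# Lower-tailed z-test: reject H0 when z falls below the critical value
z = (xbar - mu0) / (sigma / sqrt(n))
z_crit = NormalDist().inv_cdf(alpha)   # about -1.645
reject_h0 = z < z_crit

print(f"z = {z:.3f}, critical value = {z_crit:.3f}, reject H0: {reject_h0}")
```

Because z lies above the critical value, the sketch agrees with the stated decision not to reject H0.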
3. As a commercial buyer for a private supermarket brand, suppose that you take a random sample of
12 No. 303 cans of string beans at a canning plant. The average weight of the drained beans
in each can is found to be $\bar{x}$ = 15.97 g, with s = 0.15 g. The claimed minimum average net
weight of the drained beans per can is 16.0 g. Can this claim be rejected at the 10 percent
level of significance? (Test statistic = -0.693, Do not reject H0)
4. An automatic soft ice cream dispenser has been set to dispense 4.00 g per serving. For a
sample of n = 10 servings, the average amount of ice cream is $\bar{x}$ = 4.05 g with standard
deviation s = 0.10 g. The amounts being dispensed are assumed to be normally distributed.
Basing the null hypothesis on the assumption that the process is ‘in control’, should the
dispenser be reset as a result of a test at the 5 percent level of significance? (Test statistic
= 1.581, Do not reject H0)
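When σ is unknown and the sample is small, the statistic is referred to a t distribution. The sketch below checks the quoted value for Question 4, assuming a two-tailed test (H0: μ = 4.00) and taking the critical value 2.262 for 9 degrees of freedom from standard t tables, since the Python standard library has no t distribution.

```python
from math import sqrt

# Values quoted in Question 4
mu0 = 4.00   # dispenser setting under H0 (g)
n = 10       # number of servings sampled
xbar = 4.05  # sample mean (g)
s = 0.10     # sample standard deviation (g)

# One-sample t statistic with n - 1 degrees of freedom
t = (xbar - mu0) / (s / sqrt(n))
df = n - 1

# Assumption: two-tailed 5% critical value read from t tables for 9 df
t_crit = 2.262
reject_h0 = abs(t) > t_crit

print(f"t = {t:.3f} on {df} df, reject H0: {reject_h0}")
```

Since |t| is below the tabulated value, the process would be treated as in control and the dispenser left as it is.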
6. A salesman claims that on the average he obtains orders from at least 30 percent of his
prospects. From a random sample of 100 prospects he is able to obtain 20 orders. Can his
claim be rejected at the (a) 5% and (b) 1% level of significance? (Test statistic = -2.182,
(a) Reject H0, (b) Do not reject H0)
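The two decisions in Question 6 differ only in the critical value. A short sketch, assuming the usual normal approximation for a sample proportion with the standard error computed under H0 (p = 0.30), and illustrative variable names:

```python
from math import sqrt
from statistics import NormalDist

# Values quoted in Question 6
p0 = 0.30     # proportion claimed under H0
n = 100       # prospects sampled
orders = 20   # orders obtained

p_hat = orders / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)   # standard error uses p0

# Lower-tailed test at each significance level
decisions = {}
for alpha in (0.05, 0.01):
    z_crit = NormalDist().inv_cdf(alpha)
    decisions[alpha] = z < z_crit
    print(f"alpha = {alpha}: z = {z:.3f}, critical = {z_crit:.3f}, reject H0: {decisions[alpha]}")
```

The same statistic leads to rejection at 5% but not at 1%, which matches the answer given.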
7. The sponsor of a television ‘special’ expected that at least 40 percent of the viewing
audience would watch the show in a particular metropolitan area. For a random sample of
100 households with television sets turned on, 30 are viewing the ‘special’. Can the
sponsor’s assumption that at least 40 percent of the households would watch the program
be rejected at the (a) 10 percent and (b) 5 percent level of significance? (Test statistic =
-2.041, (a) Reject H0, (b) Reject H0)
8. The director of a college placement office claims that at least 50 percent of the graduating
seniors had finalized job arrangement by March 1. Suppose a random sample of n = 30
seniors are polled, and only 10 of the students indicate that they have concluded their job
arrangements by March 1. Can the placement director’s claim be rejected at the 5 percent
level of significance? (Test statistic = -1.826, Reject H0)
9. The average age of passenger cars in use in the US is 8.4 years. For a simple random sample
of 34 vehicles observed in the employee parking area of a large manufacturing plant, the
average age is 9.7 years, with a standard deviation of 3.1 years. At the 0.01 level of
significance, can we conclude that the average age of cars driven to work by the plant’s
employees is greater than the national average?
10. The average length of a flight by regional airlines in the US has been reported as 299 miles.
If a simple random sample of 30 flights by regional airlines were to have $\bar{x}$ = 314.6 miles
and s = 42.8 miles, would this tend to cast doubt on the reported average of 299 miles? Use
a two-tailed test and the 0.05 level of significance in arriving at your answer.
11. A simple random sample of 300 items is selected from a large shipment, and testing reveals
that 4% of the sampled items are defective. The supplier claims that no more than 2% of
the items in the shipment are defective. Carry out an appropriate hypothesis test and comment
on the credibility of the supplier’s claim.
12. The International Coffee Association has reported the mean daily coffee consumption for
US residents over the age of 10 as 1.76 cups. Assume that a sample of 38 people in this
age group from a North Carolina city consumed a mean of 1.95 cups of coffee per day with
a standard deviation of 0.85 cups. In a two-tailed test at the 5% level, could the residents
of this city be said to be significantly different from their counterparts across the nation?
13. The director of admissions at a large university says that 15% of high school juniors to
whom she sends university literature eventually apply for admission. In a sample of 300
persons to whom materials were sent, 30 students applied for admission. In a two-tailed
test at the 0.05 level of significance, should we reject the director’s claim?
14. According to the human resources director of a plant, no more than 5% of employees hired
in the past year have violated their preemployment agreement not to use any of five illegal
drugs. The agreement specified that random urine checks could be carried out to ascertain
compliance. In a random sample of 400 employees, screening detected at least one of these
drugs in the systems of 8% of those tested. At the 0.025 level, is the human resources
director’s claim credible?
15. Taxco, a firm specializing in the preparation of income tax returns, claims the mean refund
for customers who received refunds last year was $150. For a random sample of 12
customers who received refunds last year, the mean amount was found to be $125 with a
standard deviation of $43. Assuming that the population is approximately normally
distributed and using the 0.10 level in a two-tailed test, do these results suggest that Taxco’s
assertion may be accurate?
16. The new director of a local YMCA has been told by his predecessors that the average
member has belonged for 8.7 years. Examining a random sample of 15 membership files,
he finds the mean length of membership to be 7.2 years, with a standard deviation of 2.5
years. Assuming the population is approximately normally distributed and using the 0.05
level, does this result suggest that the actual mean length of membership may be some
value other than 8.7 years?
17. In the past, 44% of those taking a public accounting qualifying exam have passed the exam
on their first try. Lately, the availability of exam preparation books and tutoring sessions
may have improved the likelihood of an individual’s passing on his or her first try. In a
sample of 250 recent applications, 130 passed on their first attempt. At the 0.05 level of
significance, can we conclude that the proportion of passing on the first try has increased?
18. A scrap metal dealer claims that the mean of his cash sales is “no more than $80”, but an
Internal Revenue Service agent believes the dealer is untruthful. Observing a sample of 20
cash customers, the agent finds the mean purchase to be $91 with a standard deviation of
$21. Assuming the population is approximately normally distributed and using the 0.05
level of significance, is the agent’s suspicion confirmed?
19. During 2002, college work-study students earned a mean of $1252. Assume that a sample
consisting of 45 of the work-study students at a large university was found to have earned
a mean of $1277 during that year with a standard deviation of $210. Would a one-tailed
test at the 0.05 level suggest the average earnings of this university’s work-study students
were significantly higher than the national mean?
20. It has been claimed that 65% of homeowners would prefer to heat with electricity instead
of gas. A study finds that 60% of 200 homeowners prefer electric heating to gas. At
the 0.05 level of significance, can we conclude that the percentage who prefer electric
heating may differ from 65%?
21. Opinion Research has said that 49% of US adults have purchased life insurance. Suppose
that for a random sample of 50 adults from a given US city, a researcher finds that only
38% of them have purchased life insurance. At the 0.05 level, is this sample finding
significantly lower than the 49% reported by Opinion Research?
22. According to the New York Stock Exchange, the mean portfolio value for US senior
citizens who are shareholders is $183,000. Suppose a simple random sample of 50 senior
citizen shareholders in a certain region of the US is found to have a mean portfolio value
of $198,700 with a standard deviation of $65,000. From these sample results and using the
0.05 level of significance, comment on whether the mean portfolio value for all senior
citizen shareholders in this region might not be the same as the mean value reported for
their counterparts across the nation.
23. It has been reported that the average life for halogen light bulbs is 4000 hours. Learning of
this figure, a plant manager would like to find out if the vibration and temperature
conditions that the facility’s bulbs encounter might be having an adverse effect on the
service life of bulbs in her plant. In a test involving 15 halogen bulbs installed in various
locations around the plant, she finds the average life for bulbs in the sample is 3882 hours,
with a standard deviation of 200 hours. Assuming the population of halogen bulb lifetimes
to be approximately normally distributed and using the 0.025 level of significance, do the
test results tend to support the manager’s suspicion that adverse conditions might be
detrimental to the operating lifespan of halogen light bulbs used in her plant?
24. According to the National Association of Home Builders, 62% of new single-family homes
built during 1996 had a fireplace. Suppose a nationwide homebuilder has claimed that its
homes are “a cross section of America,” but a simple random sample of 600 of its single-
family homes built during that year included only 57.5% that had a fireplace. Using the
0.05 level of significance, examine whether the percentage of sample homes having a
fireplace could have differed from 62% simply by chance.
25. Before the hiring of an efficiency expert, the mean productivity of a firm’s employees was
45.4 units per hour with a standard deviation of 4.5 units per hour. After incorporating the
changes recommended by the expert, it was found that a sample of 30 workers produced a
mean of 47.5 units per hour. Using the 0.01 level of significance, can we conclude that the
mean productivity has increased?
26. A state transportation official claims that the mean waiting time at exit booths from a toll
road near the capital is no more than 0.40 minutes. For a sample of 35 motorists exiting the
toll road, it was found that the mean waiting time was 0.46 minutes, with a standard
deviation of 0.16 minutes. At the 0.05 level of significance, can we reject the official’s
claim?
27. During 2000, 3.2% of all US households were burglary victims. For a simple random
sample of 300 households from a certain region, suppose that 18 households were
victimized by burglary during that year. Apply an appropriate hypothesis test and the 0.05
level of significance in determining whether the region should be considered as having a
burglary problem greater than that for the nation as a whole.
28. An exterminator claims that no more than 10% of the homes he treats have termite
problems within 1 year after treatment. In a sample of 100 homes, local officials find that
14 had termites less than 1 year after being treated. At the 0.05 level of significance,
evaluate the credibility of the exterminator’s statement.
29. A national chain of health clubs says the mean amount of weight lost by members during
the past month was at least 5 pounds. Skeptical of this claim, a consumer advocate believes
the chain’s assertion is an exaggeration. She interviews a random sample of 40 members,
finding their mean weight loss to be 4.6 pounds with a standard deviation of 1.5 pounds.
At the 0.01 level of significance, evaluate the health club’s contention.
30. The masses of components produced by a certain machine are normally distributed with
mean 16.5g and standard deviation 3.5g. The setting on the machine is altered, following
which a random sample of 49 components is found to have a mean mass of 16.0g. Does
this provide evidence, at the 5% level, of a reduction in the mean mass of components
produced by this machine? Assume that the standard deviation is not altered.
32. A variable with known variance of 36 is thought to have a mean of 63. A random sample
of 100 independent observations of the variable gives a mean of 65. Is there sufficient
evidence that the mean is not 63,
(a) at the 10% level, (b) at the 5% level, (c) at the 1% level?
33. A machine is supposed to produce steel pipes of length 2m. A sample of 10 pipes was taken
and their lengths measured in m. The following results were obtained:
1.98 1.96 1.95 2.00 1.99 1.97 2.01 1.97 1.98 1.96
Assuming that the lengths are normally distributed, test at the 1% level of significance,
whether the machine is in good working order.
34. An athlete finds that his times for running the 100m race follow a normal distribution with
mean 9.89 seconds. He trains intensively for a week and then runs 100m on each of 6
consecutive days. His times (measured in seconds) were 9.79, 10.21, 10.06, 9.93, 10.12,
10.54. Is there evidence, at the 5% level, that the training has improved his times?
35. A test of mental ability has been constructed such that, for adults in Malaysia, the test score
is normally distributed with mean 120 and standard deviation 12. A doctor wishes to test
whether sufferers from a particular disease differ from the general population in their
performance on this test. He chooses a random sample of 10 from his patients. Their scores
on the test are
119 131 95 107 125 123 128 89 103 105
What would you conclude?
36. A theory predicts that the probability of an event is 0.6. The theory is tested experimentally
and in 500 independent trials the event occurred 240 times. Is the number of occurrences
significantly less than that predicted by the theory? Test at the 1% level.
37. It is thought that the proportion of defective items produced by a particular machine is 0.1.
A random sample of 100 items is inspected and found to contain 12 defective items. Does
this provide evidence, at the 5% level, that the machine is producing more defective items
than expected?
38. A government report states that one-quarter of teenagers in Malaysia belong to a youth
organization. A survey conducted among a random sample of 1500 teenagers from a certain
city, revealed that 458 belonged to a youth organization. Does this provide significant
evidence, at the 2% level, that the proportion of teenagers who belong to a youth
organization is greater in this city than the national average?
Based on the results of this sample, calculate a 95% confidence interval for the proportion
of teenagers in this city who belong to a youth organization.
39. A manufacturer claims that his cassettes, advertised as having a playing time of 90 minutes,
actually have a mean playing time of 92 minutes, with standard deviation 1.8 minutes. 36
tapes are selected at random and tested. The investigator rejects the manufacturer’s claim,
at the 5% level, saying that the mean playing time of the tapes is less than 92 minutes. What
can be said about the value of the sample mean obtained for this decision to be taken?
40. ‘Family’ packs of bacon slices are sold in 1.5 kg packs. A sample of 12 packs was selected
at random and the masses, measured in kg, noted. The following results were obtained
$\sum x = 17.81$ and $\sum x^2 = 26.4357$.
Assuming that the masses of the packs follow a normal distribution, with variance $\sigma^2$, test
at the 1% level whether the packs are significantly underweight
(a) if $\sigma^2$ is unknown
(b) if $\sigma^2 = 0.0003$.
41. A machine packs flour into bags. A random sample of eleven filled bags was taken and the
masses of the bags to the nearest 0.1 g were:
1506.7 1507.2 1506.9 1506.8 1506.6 1506.8
1506.6 1507.0 1507.5 1506.3 1506.4
Obtain the mean and the variance of this sample showing your working clearly. Filled bags
are supposed to have a mass of 1506.5 g. Assuming that the mass of a bag has a normal
distribution with variance 0.16 g$^2$, test whether the sample provides significant evidence at
the 5% level that the machine produces overweight bags. Give the 99% confidence interval
for the mass of a filled bag.
42. The masses of loaves from a certain bakery are normally distributed with mean 500 g and
standard deviation 20 g.
(a) Determine what percentage of the output would fall below 475 g and what
percentage would be above 530 g.
(b) The bakery produces 1000 loaves daily at a cost of $0.08 per loaf and can sell all
those above 475 g for $0.20 each but is not allowed to sell the rest. Calculate the
expected daily profit.
(c) A sample of 25 loaves yielded a mean mass of 490 g. Does this provide evidence
of a reduced population mean? Use the 5% level of significance and state whether
the test is one-tailed or two-tailed.
43. The probability that an oyster larva will develop in unpolluted water is 0.9, while in
polluted water this probability is less than 0.9. Given that 20 oyster larvae are placed in
unpolluted water, find the probabilities, each to two decimal places, that the number that
will develop is
(a) at least 17 (b) exactly 17.
An oyster breeder put 20 larvae in a sample of water and observed that only 16 of them
developed. Use a 10% significance level to determine whether the breeder would be
justified in concluding that the water is polluted.
44. A tetrahedral die is thrown 120 times and the number on which it lands is noted.
Number 1 2 3 4 Total
Frequency 35 32 25 28 120
Test, at the 5% level whether the die is fair.
45. From a list of 500 digits, the occurrence of each digit is noted.
Digit 0 1 2 3 4 5 6 7 8 9
Frequency 40 58 49 53 38 56 61 53 60 32
Test, at the 1% level, whether the sequence is a random sample from a uniform distribution.
46. The outcomes, A, B and C, of a certain experiment are thought to occur in the ratio 1:2:1.
The experiment is performed 200 times and the observed frequencies of A, B and C are 36,
115 and 49 respectively. Is the difference in the observed and expected results significant?
Test at the 5% level.
47. According to genetic theory the number of colour strains red, yellow, blue and white in a
certain flower should appear in proportions 4:12:5:4. Observed frequencies of red, yellow,
blue and white strains amongst 800 plants were 110, 410, 150, 130 respectively. Are these
differences from the expected frequencies significant at the 5% level?
If the number of plants had been 1600 and the observed frequencies 220, 820, 300, 260,
would the difference have been significant at the 5% level?
48. It is thought that each of the 8 outcomes of an experiment is equally likely to occur. When
the experiment is performed 400 times, the observed frequencies are 45, 42, 55, 53, 40, 62,
47 and 56. Perform a test at the 1% level to investigate the validity of the theory.
49. In a particular subject students are set multiple choice questions, each of which contains 5
alternatives A, B, C, D and E. A teacher suggests that when students do not know the
correct answer they are twice as likely to choose one of B, C or D as to choose A or E.
For 160 questions where it was known that the student answered without knowing the
correct answer, A, B, C, D, E were chosen 23, 45, 36, 43 and 13 times respectively. Is there
evidence, at the 5% level, to support the teacher’s theory?
50. For a given set of data the observed and expected frequencies are shown.
Result 1 2 3 4 5
Observed frequency 30 31 42 40 57
Expected frequency 38 45 36 36 45
Are the differences between the observed and expected frequencies significant at the 1%
level? [$\chi^2_{\text{calc}} = 10.68$]
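The goodness-of-fit statistic quoted for Question 50 is the sum of (O − E)²/E over the five results. A minimal sketch, assuming 4 degrees of freedom (no parameters estimated from the data) and a 1% critical value of 13.277 taken from chi-square tables:

```python
# Frequencies given in Question 50
observed = [30, 31, 42, 40, 57]
expected = [38, 45, 36, 36, 45]

# Pearson chi-square statistic: sum of (O - E)^2 / E
chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

# Assumption: upper 1% point of chi-square with 4 df, read from tables
chi2_crit = 13.277
significant = chi2 > chi2_crit

print(f"chi-square = {chi2:.2f} on {df} df, significant at 1%: {significant}")
```

The statistic reproduces the quoted 10.68, which falls short of the tabulated value, so the differences would not be judged significant at the 1% level.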
51. Suppose that a response can fall into one of 5 categories and 300 responses produced these
category counts:
Category 1 2 3 4 5
Observed Count 47 63 74 51 65
(a) Are the five categories equally likely to occur? How would you test this hypothesis?
(b) If you were to test this hypothesis using the chi-square statistic, how many degrees of
freedom would the test have?
(c) Find the critical value of $\chi^2$ that defines the rejection region with α = 0.05.
(d) Calculate the expected value of the test statistic.
(e) Conduct the test and state your conclusions.
52. You surveyed 200 working people who had recently had heart attacks and recorded the day
on which their heart attacks occurred:
53. The table below shows the number of employees absent for a single day during a particular
period of time.
Find the frequencies expected under the hypothesis that the number of absentees is
independent of the day of the week. (E = 100)
Test at the 5% level whether the differences between the observed and expected data are
significant. ($\chi^2_{\text{calc}} = 10.56$)
54. 300 employees of a company were selected at random and asked whether they were in
favour of a scheme to introduce flexible working hours. The following table shows the
opinion and the departments of the employees.
                         OPINION
Department       In favour   Uncertain   Against
Production           89          42          9
Sales                53          36         11
Administration       38          12         10
Test whether there is evidence of a significant association between opinion and department.
($\chi^2_{\text{calc}} = 8.692$)
55. A survey was carried out in a firm of the smoking habits of men and women employees
with the following results:
Men Women
Smokers 48 27
Non-smokers 58 57
It is required to test whether, at the 5% level of significance, the survey reveals any
difference in the smoking habits of men and women.
56. A random sample of employees of a large company was selected and the employees were
asked to complete a questionnaire. One question asked whether the employees were in
favour of the introduction of flexible working hours. The following table classifies the
employees by their response and gender i.e. male or female.
                     Gender
RESPONSE          Male   Female
In favour          57      83
Not in favour      33      27
Test whether there is evidence of a significant association between the response and
gender.
57. In an investigation into eye colour and left or right handedness the following results were
obtained.
                      Handedness
                      Left   Right
Eye Colour   Blue      15     85
             Brown     20     80
Is there evidence, at the 5% level, of an association between eye colour and left or right
handedness? [$\chi^2_{\text{calc}} = 0.58$, no]
58. An investigation into colourblindness and the sex of a person gave the following results:
                  Colourblindness
                  Yes    No
Sex   Male         36    964
      Female       19    981
Is there evidence, at the 5% level, of an association between the sex of a person and whether
or not they are colourblind? [$\chi^2_{\text{calc}} = 4.79$, Yes]
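For a 2 × 2 table, the quoted value for Question 58 appears to include Yates' continuity correction, in which 0.5 is subtracted from each |O − E| before squaring. The sketch below works under that assumption, with the 5% critical value 3.841 for 1 degree of freedom taken from chi-square tables.

```python
# Observed counts from Question 58: rows are Male, Female; columns are Yes, No
table = [[36, 964],
         [19, 981]]

row_totals = [sum(row) for row in table]
col_totals = [sum(col) for col in zip(*table)]
total = sum(row_totals)

# Chi-square with Yates' continuity correction (assumed, for a 2 x 2 table)
chi2 = 0.0
for i, row in enumerate(table):
    for j, obs in enumerate(row):
        exp = row_totals[i] * col_totals[j] / total   # expected count under independence
        chi2 += (abs(obs - exp) - 0.5) ** 2 / exp

# Assumption: upper 5% point of chi-square with 1 df, read from tables
chi2_crit = 3.841
association = chi2 > chi2_crit

print(f"chi-square (Yates) = {chi2:.2f}, evidence of association: {association}")
```

With the correction applied the statistic rounds to the quoted 4.79, which exceeds 3.841.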
59. The results obtained by 200 students in chemistry and biology are shown in the table. Test,
at the 5% level, whether the performances in both subjects are related. [$\chi^2_{\text{calc}} = 13.3$, Related]
                    Chemistry
                    Pass   Fail
Biology   Pass      102     45
          Fail       21     32
60. A thousand households are taken at random and divided into three groups A, B and C,
according to the total weekly income. The following table shows the numbers in each group
having a colour television receiver, a black and white receiver or no television at all.
                       A      B      C
Colour television      56     51     93
Black and white       118    207    375
None                   26     42     32
Calculate the expected frequencies if there is no association between total income and
television ownership.
Apply a test to find whether the observed frequencies suggest that there is such an
association. [$\chi^2_{\text{calc}} = 26.6$, Yes]
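Under the hypothesis of no association, each expected frequency is (row total × column total) / grand total. The following sketch applies that rule to the table in Question 60. The question does not state a significance level, so the 5% critical value 9.488 for 4 degrees of freedom is an assumption taken from chi-square tables.

```python
# Observed counts from Question 60: columns are income groups A, B, C
observed = [[56, 51, 93],      # colour television
            [118, 207, 375],   # black and white
            [26, 42, 32]]      # none

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
total = sum(row_totals)

# Expected counts under independence
expected = [[r * c / total for c in col_totals] for r in row_totals]

# Pearson chi-square statistic over all nine cells
chi2 = sum((o - e) ** 2 / e
           for obs_row, exp_row in zip(observed, expected)
           for o, e in zip(obs_row, exp_row))
df = (len(observed) - 1) * (len(observed[0]) - 1)

# Assumption: 5% level, critical value for 4 df from tables
chi2_crit = 9.488
association = chi2 > chi2_crit

for row in expected:
    print(row)
print(f"chi-square = {chi2:.1f} on {df} df, association: {association}")
```

The statistic agrees with the quoted 26.6 and is well above the tabulated value.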
61. The following table shows the numbers of students passed and failed by three examiners
A, B and C.
              Examiners
         A     B     C    Totals
Pass     51    48    58    157
Fail      4    14     7     25
Total    55    62    65    182
Test the hypothesis that the three examiners fail equal proportions of students by applying
$\chi^2$ tests with and without Yates’ correction. Comment on the results. [$\chi^2_{\text{calc}} = 6.57$ (without c.c.), reject hypothesis; $\chi^2_{\text{calc}} = 6.57$ (with c.c.), accept hypothesis]
62. At St. Trinian’s College for Young Ladies there are 1000 pupils. Of these 75 have
represented the College at both hockey and netball, 10 have represented the College at
hockey but do not play netball, 35 have represented the College at netball but do not play
hockey, and 100 do not play games at all. In all 100 girls have represented the College at
hockey, and 150 at netball. The number who do not play hockey is 200 and the number
who do not play netball is 125.
Arrange the above data in the form of a 3 × 3 contingency table, and state how many pupils
play both hockey and netball but have not represented the College either.
Apply the $\chi^2$ test to your 3 × 3 table, and state the hypothesis which it tests. [$\chi^2_{\text{calc}} = 694$, performance in both sports not independent]
63. The following are data on 150 chickens, divided into two groups according to breed, and
into three groups according to yield of eggs:
                          Yield
                     High   Medium   Low
Rhode Island Red      46      29      28
Leghorn               27      14       6
Are these data consistent with the hypothesis that the yield is not affected by the type of
breed? [$\chi^2_{\text{calc}} = 4$, Yes]
64. The random variable 𝑅 denotes the outcome of a trial. It is claimed that 𝑅 can be modelled
by the probability distribution
$$P(R = r) = \begin{cases} \dfrac{\binom{6}{r}\binom{6}{4-r}}{495} & r = 0, 1, 2, 3, 4 \\ 0 & \text{otherwise} \end{cases}$$
To investigate this claim, the value of 𝑅 was recorded for a sample of 99 independent trials,
with the following results.
𝑟 0 1 2 3 4
Frequency 2 18 41 32 6
Use a $\chi^2$ goodness-of-fit test and the 10% level of significance to test the claim that 𝑅 can
be modelled by the probability distribution given above. [$\chi^2_{\text{calc}} = 6.6$, not supported]
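The first step in Question 64 is to turn the model into expected frequencies for 99 trials. The sketch below does only that step, using exact fractions, and reads the model as P(R = r) = C(6, r) C(6, 4 − r) / 495. Any pooling of small classes before computing the statistic is left to the reader.

```python
from fractions import Fraction
from math import comb

n_trials = 99

# Model probabilities P(R = r) = C(6, r) * C(6, 4 - r) / 495 for r = 0..4
probs = [Fraction(comb(6, r) * comb(6, 4 - r), 495) for r in range(5)]

# Expected frequencies for 99 independent trials
expected = [p * n_trials for p in probs]

for r, (p, e) in enumerate(zip(probs, expected)):
    print(f"r = {r}: P = {p}, expected frequency = {e}")
```

The probabilities sum to 1, and the expected frequencies for r = 0 and r = 4 are both below 5, which is the usual reason for combining adjacent classes in a chi-square test.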
65. A car manufacturer keeps a record of how many of the new cars that it has sold experience
mechanical problems during the first year. The manufacturer also records whether the cars
have petrol engine or a diesel engine. Data for a random sample of 250 cars are shown in
the table.
experience more mechanical problems. Based on your answer to part (a), state, with
a reason, the advice that you would give to Arisa.
66. Gerald is a scientist who studies sand lizards. He believes that sand lizards on islands are,
on average, shorter than those on the mainland. The population of sand lizards on the
mainland has a mean length of 18.2 cm and a standard deviation of 1.8 cm.
Gerald visited three islands, A, B, C and measured the length, 𝑋 centimetres, of each of a
sample of 𝑛 sand lizards on each island. The samples may be regarded as random. The data
are shown in the table:
Island ∑𝒙 𝒏
A 1384.5 78
B 116.9 7
C 394.6 20
(a) Carry out a hypothesis test to investigate whether the data from Island A provide
support for Gerald’s belief at the 2% significance level. Assume that the standard
deviation for the lengths of sand lizards on Island A is 1.8 cm. [Test statistic =
−2.208, reject H0]
(b) For Island B, it is also given that
∑(𝑥 − 𝑥̅ )2 = 22.64
(i) Construct a 95% confidence interval for 𝜇𝐵 , where 𝜇𝐵 centimetres is the
mean length of sand lizards on Island B. Assume that the lengths of sand
lizards on Island B are normally distributed with unknown standard
deviation. (14.9, 18.5)
(ii) Comment on whether your confidence interval provides support for
Gerald’s belief.
(c) Comment on whether the data from Island C provide support for Gerald’s belief.
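The two quoted answers for Question 66 can be reproduced as follows. This is a sketch under stated assumptions: part (a) is treated as a lower-tailed z-test with σ = 1.8, and part (b)(i) uses the value 2.447 for t with 6 degrees of freedom from t tables, because the standard library offers no t distribution.

```python
from math import sqrt
from statistics import NormalDist

mu0 = 18.2    # mainland mean length (cm)
sigma = 1.8   # standard deviation assumed for Island A (cm)

# (a) Island A: lower-tailed z-test at the 2% level
sum_a, n_a = 1384.5, 78
xbar_a = sum_a / n_a
z_a = (xbar_a - mu0) / (sigma / sqrt(n_a))
z_crit = NormalDist().inv_cdf(0.02)   # about -2.054
reject_a = z_a < z_crit

# (b)(i) Island B: 95% t interval from the summary statistics
sum_b, n_b, ss_b = 116.9, 7, 22.64    # ss_b is the sum of squared deviations
xbar_b = sum_b / n_b
s_b = sqrt(ss_b / (n_b - 1))
t_crit = 2.447                        # assumption: t tables, 6 df, 2.5% upper tail
half_width = t_crit * s_b / sqrt(n_b)
ci_b = (xbar_b - half_width, xbar_b + half_width)

print(f"(a) z = {z_a:.3f}, reject H0: {reject_a}")
print(f"(b)(i) 95% CI = ({ci_b[0]:.1f}, {ci_b[1]:.1f})")
```

The interval for Island B contains 18.2, which is relevant when commenting in part (b)(ii).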
67. Wellgrove village has a main road running through it that has a 40 mph speed limit. The
villagers were concerned that many vehicles travelled too fast through the village, and so
they set up a device for measuring the speed of vehicles on this main road. The device
indicated that the mean speed of vehicles travelling through Wellgrove was 44.1 mph.
In an attempt to reduce the mean speed of vehicles travelling through Wellgrove, life-size
photographs of a police officer were erected next to the road on the approaches to the
village. The speed, 𝑋 mph, of a sample of 100 vehicles was then measured and the
following data obtained.
∑ 𝑥 = 4327.0, ∑(𝑥 − 𝑥̅ )2 = 925.71
(a) State an assumption that must be made about the sample in order to carry out a
hypothesis test to investigate whether the desired reduction in mean speed had
occurred.
(b) Given that the assumption that you stated in part (a) is valid, carry out such a test,
using the 1% level of significance. [Test statistic = −2.71, reject H0]
(c) Explain, in the context of this question, the meaning of:
(i) a Type I error (ii) a Type II error.
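For part (b) of Question 67 the population standard deviation is unknown, but with n = 100 the sample standard deviation can stand in for it in a z-test. The sketch below makes that large-sample assumption and uses a lower-tailed test of H0: μ = 44.1.

```python
from math import sqrt
from statistics import NormalDist

# Summary statistics quoted in Question 67
mu0 = 44.1       # mean speed before the photographs were erected (mph)
n = 100
sum_x = 4327.0
ss = 925.71      # sum of squared deviations from the sample mean

xbar = sum_x / n
s = sqrt(ss / (n - 1))   # unbiased estimate of the standard deviation

# Lower-tailed large-sample z-test at the 1% level
z = (xbar - mu0) / (s / sqrt(n))
z_crit = NormalDist().inv_cdf(0.01)   # about -2.326
reject_h0 = z < z_crit

print(f"mean = {xbar:.2f}, s = {s:.3f}, z = {z:.2f}, reject H0: {reject_h0}")
```

The result agrees with the quoted statistic of −2.71 and the decision to reject H0.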
68. In a particular town, a survey was conducted on a sample of 200 residents aged 41 years to
50 years. The survey questioned these residents to discover the age at which they had left
full-time education and the greatest rate of income tax that they were paying at the time of
the survey. The summarised data obtained from the survey are shown in the table.
69. A large survey in the USA establishes that 60% of its residents own a smartphone. A survey
of 250 UK residents reveals that 164 of them own a smartphone. Assuming that these 250
UK residents may be regarded as a random sample, investigate the claim that the
percentage of UK residents owning a smartphone is the same as that in the USA. Use the
5% level of significance. [Test statistic = 1.807, do not reject H0]
70. (a) A random sample of 40 residents in a market town reveals that 5 of them own a 4G
mobile phone. Test, at the 5% level of significance, the belief that fewer than 25%
of the town’s residents own a 4G mobile phone. [Test statistic =
−1.826, reject H0]
(b) A marketing company needs to estimate the proportion of residents in a large city
who own a 4G mobile phone. It wishes to estimate this proportion to within 0.05
with a confidence of 98%. Given that the proportion is known to be at most 30%,
estimate the sample size necessary in order to meet the company’s need. [455]
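Both parts of Question 70 rest on the normal approximation for a proportion. The sketch below checks the two quoted answers. It assumes a lower-tailed test in part (a) and, in part (b), the usual bound n ≥ z²p(1 − p)/E² evaluated at the largest permitted p = 0.30.

```python
from math import ceil, sqrt
from statistics import NormalDist

# (a) Lower-tailed test of H0: p = 0.25 against H1: p < 0.25
p0, n, owners = 0.25, 40, 5
p_hat = owners / n
z = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)
reject_h0 = z < NormalDist().inv_cdf(0.05)

# (b) Sample size for a margin of 0.05 with 98% confidence, given p <= 0.30
margin, confidence, p_max = 0.05, 0.98, 0.30
z_star = NormalDist().inv_cdf(1 - (1 - confidence) / 2)   # about 2.326
n_required = ceil(z_star ** 2 * p_max * (1 - p_max) / margin ** 2)

print(f"(a) z = {z:.3f}, reject H0: {reject_h0}")
print(f"(b) required sample size = {n_required}")
```

Using p = 0.30 rather than 0.5 is what keeps the required sample at 455. The product p(1 − p) increases with p up to 0.5, so 0.30 gives the largest value allowed by the stated bound.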
71. It is suggested that the difference, 𝐷 minutes, between the time that a patient is actually
seen by an osteopath and the patient’s scheduled appointment time can be modelled by
𝐷 = 5 + 𝑋, where 𝑋 has the following probability density function.
$$f(x) = \begin{cases} \dfrac{1}{18}x^2 & 0 \le x \le 3 \\ \dfrac{1}{4}(5 - x) & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$
(a) Complete the table of exact probabilities below:
(b) The results of a random sample of 540 observations of D gave the frequencies
shown in the following table.
72. A large multinational company recruits employees from all four countries in the UK. For
a sample of 250 recruits, the percentages of males and females from each of the countries
are shown in Table 1.
Table 1
          England   Scotland   Wales   Northern Ireland
Male        22.8      17.6     10.8          6.8
Female      15.6      17.2      7.6          1.6
(a) Add the frequencies to the contingency table, Table 2, below.
England Scotland Wales Northern Ireland Total
Male
Female
Total
(b) Carry out a $\chi^2$ test at the 10% significance level to investigate whether there is
an association between country and gender of recruits. [$\chi^2_{\text{calc}} = 6.59$, there is
an association between country and gender.]
(c) By comparing observed and expected values, make one comment about the
distribution of female recruits.
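Parts (a) and (b) of Question 72 can be sketched together: the percentages in Table 1 are converted to frequencies out of 250, and the usual chi-square statistic for independence is then computed. The 10% critical value 6.251 for 3 degrees of freedom is an assumption taken from chi-square tables.

```python
# Percentages from Table 1: columns are England, Scotland, Wales, Northern Ireland
percentages = {
    "Male":   [22.8, 17.6, 10.8, 6.8],
    "Female": [15.6, 17.2, 7.6, 1.6],
}
n_recruits = 250

# (a) Convert percentages of the whole sample into observed frequencies
observed = {g: [round(p * n_recruits / 100) for p in row]
            for g, row in percentages.items()}
row_totals = {g: sum(row) for g, row in observed.items()}
col_totals = [sum(col) for col in zip(*observed.values())]

# (b) Pearson chi-square statistic for association
chi2 = 0.0
for g, row in observed.items():
    for j, obs in enumerate(row):
        exp = row_totals[g] * col_totals[j] / n_recruits
        chi2 += (obs - exp) ** 2 / exp

chi2_crit = 6.251   # assumption: upper 10% point, 3 df, from tables
association = chi2 > chi2_crit

print(observed, row_totals, col_totals)
print(f"chi-square = {chi2:.2f}, association at 10%: {association}")
```

Inspecting the individual (O − E)²/E terms shows which cells contribute most, which is a natural starting point for part (c).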
73. South Riding Alarms (SRA) maintains household burglar-alarm systems. The company
aims to carry out an annual service of a system in a mean time of 20 minutes. Technicians
who carry out an annual service must record the times at which they start and finish the
service.
(a) Gary is employed as a technician by SRA and his manager, Rajul, calculates the
times taken for 8 annual services carried out by Gary. The results, in minutes, are
as follows.
24, 25, 29, 16, 18, 27, 19, 23
Assume that these times may be regarded as a random sample from a normal
distribution. Carry out a hypothesis test, at the 10% significance level, to examine
whether the mean time for an annual service carried out by Gary is 20 minutes.
[Test statistic = 1.626, do not reject H0]
(b) Rajul suspects that Gary may be taking longer than 20 minutes on average to carry
out an annual service. Rajul therefore calculates the time taken for 100 annual
services carried out by Gary. Assume that these times may also be regarded as a
random sample from a normal distribution but with a standard deviation of 4.6
minutes. Find the highest value of the sample mean which would not support
Rajul’s suspicion at the 5% significance level. Give your answer to two decimal
places. [To not support suspicion need 𝑥̅ ≤ 20.75]
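Both parts of Question 73 can be checked numerically. The sketch below assumes a two-tailed t test in part (a), with the critical value 1.895 taken from tables for 7 degrees of freedom, and a one-tailed z test in part (b) because σ = 4.6 is given. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev

times = [24, 25, 29, 16, 18, 27, 19, 23]
mu0 = 20

# Part (a): sigma is unknown, so use the sample standard deviation (n - 1 divisor).
n = len(times)
t_calc = (mean(times) - mu0) / (stdev(times) / sqrt(n))
T_CRITICAL = 1.895  # assumed: two-tailed 10% value for 7 degrees of freedom, from tables

# Part (b): sigma = 4.6 is given, so the one-tailed 5% boundary uses the normal distribution.
z_crit = NormalDist().inv_cdf(0.95)
boundary = mu0 + z_crit * 4.6 / sqrt(100)

print(round(t_calc, 3), abs(t_calc) > T_CRITICAL)
print(round(boundary, 4))
```

The boundary lies between 20.75 and 20.76, so 20.75 is the largest two-decimal sample mean that stays outside the rejection region.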
74. Coloured plastic clips are sold in packets of 12 clips. It is suggested that the number of blue
clips in a packet can be modelled by a binomial distribution. In order to investigate this
suggestion, 100 packets of clips are randomly chosen. The number of blue clips in each
packet is counted with the following summarised results.
Number of blue clips    0    1    2    3    4    5    6   ≥7
Number of packets       0    6   14   28   27   16    9    0
(a) Show that an estimate of 𝑝, the probability that a randomly chosen clip is blue, is
0.3.
(b) Test, at the 10% level of significance, whether a binomial distribution is an
appropriate model for the number of blue clips in a packet. [χ²_calc = 3.190,
binomial is a suitable model.]
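A possible numerical check of Question 74 is sketched below. The pooling of classes (0 with 1, and 6 with ≥7) is an assumption made so that every expected frequency is at least 5; the question does not prescribe it. Unrounded expected frequencies are used, so the statistic may differ slightly in the third decimal place from the bracketed answer.

```python
from math import comb

observed = [0, 6, 14, 28, 27, 16, 9, 0]  # packets with 0, 1, ..., 6, >=7 blue clips
n_packets = sum(observed)
clips_per_packet = 12

# Part (a): mean number of blue clips per packet divided by 12.
# The ">=7" class is empty, so its exact value does not affect the mean.
mean_blue = sum(k * f for k, f in enumerate(observed)) / n_packets
p_hat = mean_blue / clips_per_packet

# Part (b): expected frequencies under B(12, p_hat); the last class is the upper tail.
probs = [
    comb(clips_per_packet, k) * p_hat**k * (1 - p_hat) ** (clips_per_packet - k)
    for k in range(7)
]
probs.append(1 - sum(probs))
expected = [n_packets * pr for pr in probs]

# Assumed pooling: merge 0 with 1 and 6 with >=7 so every expected frequency is at least 5.
obs_pooled = [observed[0] + observed[1], *observed[2:6], observed[6] + observed[7]]
exp_pooled = [expected[0] + expected[1], *expected[2:6], expected[6] + expected[7]]

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(obs_pooled, exp_pooled))
df = len(obs_pooled) - 1 - 1  # one extra degree of freedom is lost for estimating p
print(round(p_hat, 3), round(chi2_calc, 2), df)
```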
75. A town council wanted residents to apply for grants that were available for home insulation.
In a trial, a random sample of 200 residents was encouraged, either in a letter or by a phone
call, to apply for the grants. The outcomes are shown in the table.
76. A supermarket buys pears from a local supplier. The supermarket requires the mean weight
of the pears to be at least 175 grams. William, the fresh produce manager at the
supermarket, suspects that the latest batch of pears delivered does not meet this
requirement.
(a) William weighs a random sample of 6 pears, obtaining the following weights, in
grams.
160.6, 155.4, 181.3, 176.2, 162.3, 172.8
Previous batches of pears have had weights that could be modelled by a normal
distribution with standard deviation 9.4 grams. Assuming that this still applies,
show that a hypothesis test at the 5% level of significance supports William’s
suspicion. [Test statistics = −1.798, reject H0 ]
(b) William then weighs a random sample of 20 pears. The mean of this sample is 169.4
77. A large estate agency would like all the properties that it handles to be sold within three
months. A manager wants to know whether the type of property affects the time taken to
sell it. The data for a random sample of properties sold are tabulated below.
                                          Type of property
                                 Flat   Terraced   Semi-detached   Detached   Total
Sold within three months            4         34              28         18      84
Sold in more than three months      9         18               8          6      41
Total                              13         52              36         24     125
(a) Conduct a χ² test, at the 10% level of significance, to determine whether there is
an association between the type of property and the time taken to sell it. Explain
why it is necessary to combine two columns before carrying out this test. [χ²_calc =
4.7418, reject 𝐻0.]
(b) The manager plans to spend extra money on advertising for one type of property in
an attempt to increase the number sold within three months. Explain why the
manager might choose:
(i) terraced properties (ii) flats
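The need to combine columns in part (a) can be seen by computing the expected frequencies first. In the sketch below, merging Flat with Terraced is an assumption (the question leaves the choice open, though this choice reproduces the bracketed statistic), the helper function names are illustrative, and the critical value 4.605 is assumed to come from a χ² table for 2 degrees of freedom.

```python
# Observed frequencies in the order Flat, Terraced, Semi-detached, Detached.
within = [4, 34, 28, 18]
longer = [9, 18, 8, 6]


def expected_table(rows):
    """Expected frequencies: row total * column total / grand total."""
    col_totals = [sum(col) for col in zip(*rows)]
    grand = sum(col_totals)
    return [[sum(row) * c / grand for c in col_totals] for row in rows]


def chi_squared(rows):
    """Sum of (O - E)^2 / E over every cell of the table."""
    exp = expected_table(rows)
    return sum((o - e) ** 2 / e for r_o, r_e in zip(rows, exp) for o, e in zip(r_o, r_e))


# The smallest expected frequency in the 2 x 4 table shows why a merge is needed.
smallest = min(min(row) for row in expected_table([within, longer]))

# Assumption: Flat is merged with Terraced.
merged = [[within[0] + within[1], *within[2:]], [longer[0] + longer[1], *longer[2:]]]
chi2_calc = chi_squared(merged)
df = (2 - 1) * (3 - 1)
CRITICAL_10_PERCENT_DF2 = 4.605  # assumed: read from a chi-squared table

print(round(smallest, 3))
print(round(chi2_calc, 4), chi2_calc > CRITICAL_10_PERCENT_DF2)
```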
78. The times taken to complete a round of golf at Slowpace Golf Club may be modelled by a
random variable with mean 𝜇 hours and standard deviation 1.1 hours. Julian claims that,
on average, the time taken to complete a round of golf at Slowpace Golf Club is greater
than 4 hours.
The times of 40 randomly selected completed rounds of golf at Slowpace Golf Club result
in a mean of 4.2 hours.
(a) Investigate Julian’s claim at the 5% level of significance. [Test statistics =
1.15, do not reject H0 ]
(b) If the actual mean time taken to complete a round of golf at Slowpace Golf Club is
4.5 hours, determine whether a Type I error, a Type II error or neither was made in
the test conducted in part (a). Give a reason for your answer. [Type II error]
79. Fiona, a lecturer in a school of engineering, believes that there is an association between
the class of degree obtained by her students and the grades that they had achieved in A-
level Mathematics. In order to investigate her belief, she collected the relevant data on the
performances of a random sample of 200 recent graduates who had achieved grades A or
B in A-level Mathematics. These data are tabulated below.
80. (a) A particular bowling club has a large number of members. Their ages may be
modelled by a normal random variable, X, with standard deviation 7.5 years. On
30 June 2010, Ted, the club secretary, concerned about the ageing membership,
selected a random sample of 16 members and calculated their mean age to be 65.0
years.
(i) Carry out a hypothesis test, at the 5% level of significance, to determine
whether the mean age of the club’s members has changed from its value of
61.4 years on 30 June 2000. [Test statistics = 1.92, do not reject H0 ]
(ii) Comment on the likely number of members who were under the age of 25
on 30 June 2010, giving a numerical reason for your answer.
(b) During 2011, in an attempt to encourage greater participation in the sport, the club
ran a recruitment drive. After the recruitment drive, the ages of members of the
bowling club may be modelled by a normal random variable, Y years, with mean
𝜇 and standard deviation 𝜎. The ages, 𝑦 years, of a random sample of 12 such
members are summarised below.
∑ 𝑦 = 702 and ∑(𝑦 − 𝑦̅)² = 88.25
(i) Construct a 90% confidence interval for 𝜇, giving the limits to one decimal
place. [CI = (57.0, 60.0)]
(ii) Use your confidence interval to state, with a reason, whether the recruitment
drive lowered the average age of the club’s members.
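The interval in part (b)(i) can be reproduced as follows. This sketch reads the second summary statistic as the sum of squared deviations, ∑(𝑦 − 𝑦̅)² = 88.25, since a plain sum of deviations from the mean is always zero. The t value 1.796 is assumed to come from tables for 11 degrees of freedom.

```python
from math import sqrt

n = 12
sum_y = 702
sum_sq_dev = 88.25  # assumed to be the sum of squared deviations from the sample mean

y_bar = sum_y / n
s = sqrt(sum_sq_dev / (n - 1))  # unbiased estimate of sigma
T_CRITICAL = 1.796  # assumed: t value for 11 degrees of freedom, 5% in each tail, from tables

half_width = T_CRITICAL * s / sqrt(n)
lower, upper = y_bar - half_width, y_bar + half_width
print(round(lower, 1), round(upper, 1))
```

For part (b)(ii), the interval can then be compared with the sample mean of 65.0 years found in 2010 and the earlier value of 61.4 years.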
81. (a) The table below contains the observed frequencies 𝑎, 𝑏, 𝑐 and 𝑑, relating to the two
attributes, 𝑋 and 𝑌, required to perform a χ² test.
𝑌 Not 𝑌 Total
𝑋 𝑎 𝑏 𝑚
Not 𝑋 𝑐 𝑑 𝑛
Total 𝑝 𝑞 𝑁
(i) Write down, in terms of 𝑚, 𝑛, 𝑝, 𝑞 and 𝑁, expressions for the 4 expected
frequencies corresponding to 𝑎, 𝑏, 𝑐 and 𝑑.
(ii) Hence, prove that the sum of the expected frequencies is 𝑁.
(b) Andy, a tennis player, wishes to investigate the possible effect of wind conditions
on the results of his matches. The results of his matches for the 2011 season are
represented in the following table.
82. Emily believed that the performances of 16-year-old students in their GCSEs are associated
with the schools that they attend. To investigate her belief, Emily collected data on the
GCSE results for 2010 from four schools in her area. The table shows Emily’s collected
data, denoted by 𝑂𝑖, together with the corresponding expected frequencies, 𝐸𝑖, necessary
for a χ² test.
                                 ≥ 5 GCSEs      1 ≤ GCSEs < 5      No GCSEs
                                 𝑂𝑖      𝐸𝑖      𝑂𝑖      𝐸𝑖      𝑂𝑖      𝐸𝑖
Jolliffe College for the Arts   187  193.15      93   90.62      30   26.23
Volpe Science Academy           175  184.43      97   86.52      24   25.05
Radok Music School              183  183.81      78   86.23      34   24.96
Bailey Language School          265  248.61     112  116.63      22   33.76
Emily used these values to correctly conduct a χ² test at the 1% level of significance.
(a) State the null hypothesis that Emily used.
(b) Find the value of the test statistic, χ², giving your answer to one decimal place.
[χ²_calc = 12.0]
(c) State, in context, the conclusion that Emily should reach based on the results of her
χ² test. [Do not reject 𝐻0.]
(d) Make one comment on the GCSE performances of 16-year-old students attending
Bailey Language School.
(e) Emily’s friend, Joanna, used the same data to correctly conduct a χ² test using the
10% level of significance. State, with justification, the conclusion that Joanna
should reach. [Reject 𝐻0 .]
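Because the expected frequencies are already supplied, the statistic in part (b) is just the sum of (𝑂𝑖 − 𝐸𝑖)²/𝐸𝑖 over the twelve cells. The sketch below computes it and compares it with critical values for 6 degrees of freedom, which are assumed to have been read from a χ² table. The dictionary layout is illustrative.

```python
# (observed, expected) pairs per school, in the column order of the table.
table = {
    "Jolliffe College for the Arts": [(187, 193.15), (93, 90.62), (30, 26.23)],
    "Volpe Science Academy": [(175, 184.43), (97, 86.52), (24, 25.05)],
    "Radok Music School": [(183, 183.81), (78, 86.23), (34, 24.96)],
    "Bailey Language School": [(265, 248.61), (112, 116.63), (22, 33.76)],
}

# Per-cell contributions (O - E)^2 / E, kept by school so they can be inspected.
contributions = {
    school: [(o - e) ** 2 / e for o, e in cells] for school, cells in table.items()
}
chi2_calc = sum(sum(row) for row in contributions.values())
df = (len(table) - 1) * (3 - 1)

# Assumed critical values for 6 degrees of freedom, read from a chi-squared table.
CRITICAL = {"1%": 16.812, "10%": 10.645}

print(round(chi2_calc, 1), df)
for level, value in CRITICAL.items():
    print(level, "reject H0" if chi2_calc > value else "do not reject H0")
```

The per-school contributions are also a convenient starting point for part (d), since the largest values show where observed and expected frequencies differ most.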
83. (a) The lifetime of a new 16-watt energy-saving light bulb may be modelled by a
normal random variable with standard deviation 640 hours. A random sample of 25
bulbs, taken by the manufacturer from this distribution, has a mean lifetime of
19700 hours. Carry out a hypothesis test, at the 1% level of significance, to
determine whether the mean lifetime has changed from 20 000 hours.
[Test statistics = −2.344, do not reject H0 ]
(b) The lifetime of a new 11-watt energy-saving light bulb may be modelled by a
normal random variable with mean 𝜇 hours and standard deviation 𝜎 hours. The
manufacturer claims that the mean lifetime of these energy-saving bulbs is 10000
hours. Christine, from a consumer organisation, believes that this is an
overestimate. To investigate her belief, she carries out a hypothesis test at the 5%
level of significance based on the null hypothesis 𝐻0 : 𝜇 = 10000.
(i) State the alternative hypothesis that should be used by Christine in this test.
(ii) From the lifetimes of a random sample of 16 bulbs, Christine finds that 𝑠 =
500 hours. Determine the range of values for the sample mean which would
lead Christine not to reject her null hypothesis. [CI =
(9780.9, 10219.1), 𝑥̅ > 9780]
(iii) It was later revealed that 𝜇 = 10000. State which type of error, if any, was
made by Christine if she concluded that her null hypothesis should not be
rejected. [No error]
84. A consumer report claimed that more than 25 per cent of visitors to a theme park were
dissatisfied with the catering facilities provided. In a survey, 375 visitors who had used the
catering facilities were interviewed independently, and 108 of them stated that they were
dissatisfied with the catering facilities provided.
(a) Test, at the 2% level of significance, the consumer report’s claim.
[Test statistics = 1.70, do not reject H0 ]
(b) State an assumption about the 375 visitors that was necessary in order for the
hypothesis test in part (a) to be valid.
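The test in part (a) can be sketched as a one-tailed z test for a proportion, assuming the normal approximation to the binomial distribution is acceptable for n = 375. The variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

n, dissatisfied = 375, 108
p0 = 0.25  # H0: p = 0.25, H1: p > 0.25

p_hat = dissatisfied / n
z_calc = (p_hat - p0) / sqrt(p0 * (1 - p0) / n)  # normal approximation to the binomial
z_crit = NormalDist().inv_cdf(1 - 0.02)  # one-tailed 2% critical value

print(round(z_calc, 2), round(z_crit, 3), z_calc > z_crit)
```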
85. In 2001, the mean height of students at the end of their final year at Bright Hope Secondary
School was 165 centimetres. In 2010, David and James selected a random sample of 100
students who were at the end of their final year at this school. They recorded these students’
heights, 𝑥 centimetres, and found that 𝑥̅ = 167.1 and 𝑠² = 101.2.
To investigate the claim that the mean height had increased since 2001, David and James
each correctly conducted a hypothesis test. They used the same null hypothesis and the
same alternative hypothesis. However, David used a 5% level of significance whilst James
used a 1% level of significance.
(a) (i) Write down the null and alternative hypotheses that both David and James
used.
(ii) Determine the outcome of each of the two hypothesis tests, giving each
conclusion in context.
[Test statistics = 2.09, David: reject H0 and James: do not reject H0 ]
(iii) State why both David and James made use of the Central Limit Theorem in
their hypothesis tests.
(b) It was later found that, in 2010, the mean height of students at the end of their final
year at Bright Hope Secondary School was actually 165 centimetres. Giving a
reason for your answer in each case, determine whether a Type I error or a Type II
error or neither was made in the hypothesis test conducted by:
(i) David; [Type I error] (ii) James. [No error]
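The two outcomes in Question 85 differ only in the critical value used. The sketch below assumes a one-tailed z test with 𝐻1: 𝜇 > 165 and uses the sample variance in place of the unknown population variance. The dictionary `outcomes` is an illustrative name.

```python
from math import sqrt
from statistics import NormalDist

n, x_bar, s_squared, mu0 = 100, 167.1, 101.2, 165

# n is large, so the sample mean is treated as approximately normal (Central Limit Theorem).
z_calc = (x_bar - mu0) / sqrt(s_squared / n)

outcomes = {}
for name, alpha in [("David", 0.05), ("James", 0.01)]:
    z_crit = NormalDist().inv_cdf(1 - alpha)  # one-tailed test, H1: mu > 165
    reject = z_calc > z_crit
    # Part (b): the true mean is 165, so H0 is true and rejecting it is a Type I error.
    outcomes[name] = ("reject H0", "Type I error") if reject else ("do not reject H0", "no error")

print(round(z_calc, 2))
print(outcomes)
```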
86. Judith, the village postmistress, believes that, since moving the post office counter into the
local pharmacy, the mean daily number of customers that she serves has increased from
79. In order to investigate her belief, she counts the number of customers that she serves
on 12 randomly selected days, with the following results.
88 81 84 89 90 77 72 80 82 81 75 85
Stating a necessary distributional assumption, test Judith’s belief at the 5% level of
significance. [Test statistics = 1.86, Reject H0 ]
87. It is claimed that a new drug is effective in the prevention of sickness in holiday-makers.
A sample of 100 holiday-makers was surveyed, with the following results.
Sickness No sickness Total
Drug taken 24 56 80
No drug taken 11 9 20
Total 35 65 100
Assuming that the 100 holiday-makers are a random sample, use a χ² test, at the 5% level
of significance, to investigate the claim. [χ²_calc = 3.3654, do not reject 𝐻0.]
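The bracketed value for Question 87 is reproduced when Yates' continuity correction is applied to the 2 × 2 table, which is what the sketch below assumes. The critical value 3.841 is assumed to come from a χ² table for 1 degree of freedom.

```python
observed = [[24, 56], [11, 9]]  # rows: drug taken, no drug taken

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand = sum(row_totals)

chi2_yates = 0.0
for i, row in enumerate(observed):
    for j, o in enumerate(row):
        e = row_totals[i] * col_totals[j] / grand
        chi2_yates += (abs(o - e) - 0.5) ** 2 / e  # Yates' continuity correction

CRITICAL_5_PERCENT_DF1 = 3.841  # assumed: read from a chi-squared table
print(round(chi2_yates, 4), chi2_yates > CRITICAL_5_PERCENT_DF1)
```

Without the 0.5 adjustment the same table gives a larger statistic, so whether the correction is applied matters for the conclusion here.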