Chapter 6: Tests of Significance for Small Samples

6.1 Introduction
For small samples (size < 30), the tests proposed for large samples do not hold good, as
the sampling distribution cannot be assumed to be normal for small samples. This led to
the search for new approaches to deal with small samples. It should be noted that the
methods and theory applicable to small samples can also be used for large samples, but
the converse is not true.
After the hypothesis is formulated, the choice of test can be baffling unless it is specified
which test to use. The following suggestions should be kept in mind while choosing a test
of significance for any hypothesis.

• Size of sample: If the sample size is greater than thirty, use any of the applicable
large-sample tests or the Chi-Square test, depending upon the applicability.
• Variance of population: If the population variance is known and the underlying
distribution is normal, the z-test should be used. With known variance, the z-test can
also be used when the distribution is not normal, provided the sample size is large.
If the variance is unknown, the t-test should be used.
Degrees of freedom ($\nu$): The degrees of freedom are the number of independent
quantities that can be assigned arbitrarily to a statistical distribution. Suppose we are
required to choose 5 numbers whose sum is 30; then we have a free choice of only 4
numbers, since the fifth is fixed by the sum. This suggests that data of size $n$ have
$(n - 1)$ degrees of freedom in general. In the case of two restrictions, the degrees of
freedom will be $(n - 2)$.

6.2 Student’s t-Distribution


Gosset, who wrote under the pen-name ‘Student’, derived a theoretical distribution
known as Student’s t-distribution, which is used to test a hypothesis when the sample
size is small and the population variance is not known. As $n$ (the size of the sample)
increases, the t-distribution tends to the normal distribution. Compared with the normal
distribution, the t-distribution has proportionally larger area in its tails.
If $x_1, x_2, \ldots, x_n$ is a random sample of size $n$ from a normal distribution with
mean $\mu$ and variance $\sigma^2$ (not known), and $\bar{x}$ and $s$ are the sample mean
and standard deviation of the sample,

then $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$, where $\bar{x} = \dfrac{1}{n}\sum_{i=1}^{n} x_i$, $s^2 = \dfrac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2$

The degrees of freedom are given by $\nu = n - 1$.

The statistic $t$ is used to test the significance of:
1. Sample mean
2. Difference between two sample means
3. Sample coefficient of correlation
4. Sample Regression Coefficient
Result Testing significance of sample mean
Set up the hypothesis $H_0$: the sample is drawn from the given population.
To test the significance of the sample mean at the 5% level, calculate the statistic

$t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}}$

and compare it with the table value of $t$ at $(n - 1)$ degrees of freedom.

Let the tabulated value of $t$ be $t_{0.05}$. If $|t| < t_{0.05}$, $H_0$ is accepted;
otherwise the value of $t$ is considered significant and we reject the hypothesis $H_0$.
Fiducial (Confidence) limits of population mean
At 5% significance level (95% confidence level):

$\bar{x} \pm t_{0.05}\,\dfrac{s}{\sqrt{n}}$, where $t_{0.05}$ is the table value of $t$ at the given degrees of freedom.

At 1% significance level (99% confidence level):

$\bar{x} \pm t_{0.01}\,\dfrac{s}{\sqrt{n}}$, where $t_{0.01}$ is the table value of $t$ at the given degrees of freedom.

Example 1 The mean lifetime of a sample of 16 fluorescent light bulbs produced by a
company is computed to be 1550 hours with a standard deviation of 100 hours. The
company claims that the average life of each bulb is 1600 hours. Using a level of
significance of 0.05, is the claim accepted? Also find the confidence limits for $\mu$.
Solution: Let $H_0$: the average life of each bulb is 1600 hours.

Population mean ($\mu$) = 1600 hours, sample size ($n$) = 16

Sample mean ($\bar{x}$) = 1550 hours, sample standard deviation ($s$) = 100 hours

Statistic $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{1550 - 1600}{100/\sqrt{16}} = -2$, so $|t| = 2$

Table value of $t$ at 5% level of significance for 15 degrees of freedom is 2.13.

Calculated value of $|t|$ < table value of $t$, hence the hypothesis ($H_0$) is accepted that
the average life of each bulb is 1600 hours.

Also the 95% confidence limits for $\mu$ are $\bar{x} \pm t_{0.05}\,\dfrac{s}{\sqrt{n}} = 1550 \pm 2.13 \times 25$, i.e.

1496.75 to 1603.25
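The arithmetic of Example 1 can be checked with a few lines of Python. This is only a minimal sketch using the standard library: the function names `one_sample_t` and `confidence_limits` are illustrative choices and do not come from the text, and the table value 2.13 is copied from the example rather than computed.

```python
import math


def one_sample_t(sample_mean, mu, s, n):
    """Return t = (x_bar - mu) / (s / sqrt(n)); it has n - 1 degrees of freedom."""
    return (sample_mean - mu) / (s / math.sqrt(n))


def confidence_limits(sample_mean, s, n, t_table):
    """Return (lower, upper) limits x_bar -/+ t_table * s / sqrt(n)."""
    half_width = t_table * s / math.sqrt(n)
    return sample_mean - half_width, sample_mean + half_width


# Figures from Example 1; 2.13 is the table value of t for 15 degrees of freedom.
T_TABLE = 2.13
t_stat = one_sample_t(sample_mean=1550, mu=1600, s=100, n=16)
lower, upper = confidence_limits(sample_mean=1550, s=100, n=16, t_table=T_TABLE)
accept_h0 = abs(t_stat) < T_TABLE

print(f"t = {t_stat:.2f}, limits = ({lower:.2f}, {upper:.2f}), accept H0: {accept_h0}")
```

The decision is made on $|t|$, so a negative statistic such as the one obtained here is compared with the table value through its absolute value.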
Example 2 Ten individuals are chosen at random from a population and their heights are
found in inches as: 63, 63, 64, 65, 66, 69, 69, 70, 70, and 71. Discuss the suggestion that
the mean height of the universe is 65. Value of $t$ at 5% level of significance for 9 degrees
of freedom is 2.262.
Solution: Let $H_0$: the sample is drawn from the given population.

Population mean ($\mu$) = 65, sample size ($n$) = 10

Calculating sample mean and standard deviation:

$x$     $x - \bar{x}$     $(x - \bar{x})^2$
63      -4      16
63      -4      16
64      -3       9
65      -2       4
66      -1       1
69      +2       4
69      +2       4
70      +3       9
70      +3       9
71      +4      16
$\sum x = 670$          $\sum (x - \bar{x})^2 = 88$

$\bar{x} = \dfrac{\sum x}{n} = \dfrac{670}{10} = 67$

Sample standard deviation: $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n - 1}} = \sqrt{\dfrac{88}{9}} = 3.13$

Statistic $t = \dfrac{\bar{x} - \mu}{s/\sqrt{n}} = \dfrac{67 - 65}{3.13/\sqrt{10}} = 2.02$

Calculated value of $t$ is less than the tabulated value at 9 degrees of freedom.

Hence the hypothesis ($H_0$) is accepted that the sample is drawn from the given
population, i.e. the mean height of the universe is 65 inches.
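When the raw observations are available, as in Example 2, the sample mean and standard deviation can be computed directly. The sketch below assumes the standard deviation uses the divisor $n - 1$ (which is what Python's `statistics.stdev` does); the variable names are illustrative, and the table value 2.262 is the one quoted in the example.

```python
import math
import statistics

# Heights (inches) from Example 2 and the hypothesised population mean.
heights = [63, 63, 64, 65, 66, 69, 69, 70, 70, 71]
mu = 65
T_TABLE = 2.262  # table value of t at the 5% level for 9 degrees of freedom

n = len(heights)
x_bar = statistics.mean(heights)
s = statistics.stdev(heights)          # sample standard deviation, divisor n - 1
t_stat = (x_bar - mu) / (s / math.sqrt(n))
accept_h0 = abs(t_stat) < T_TABLE

print(f"mean = {x_bar}, s = {s:.3f}, t = {t_stat:.3f}, accept H0: {accept_h0}")
```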

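Result Testing difference between means of two small samples

Let two independent samples ($x_1, x_2, \ldots, x_{n_1}$); ($y_1, y_2, \ldots, y_{n_2}$) of sizes
$n_1$; $n_2$ with means $\bar{x}$; $\bar{y}$ and standard deviations $s_1$; $s_2$ be drawn from two
normal populations with means $\mu_1$ and $\mu_2$ having the same variance $\sigma^2$. Then, to test
whether the two population means are the same:

Calculate $t = \dfrac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$; where $S^2 = \dfrac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2}$ or $S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}$

and compare with the table value of $t$ (say $t_{0.05}$) at $(n_1 + n_2 - 2)$ degrees of freedom.

The hypothesis $H_0: \mu_1 = \mu_2$ is accepted if the calculated value of $|t| < t_{0.05}$.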
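Example 3 Two independent samples showing weights in ounces of eight and seven items
are given below:
Sample I: 10 12 13 11 15 9 12 14
Sample II: 9 11 10 13 9 8 10
Is the difference between the means of the samples significant?
Value of $t$ at 5% level of significance for 13 degrees of freedom is 2.16.
Solution: Let the null hypothesis be $H_0: \mu_1 = \mu_2$
Here $n_1 = 8$ and $n_2 = 7$, calculating sample means and standard deviations

$x$    $(x - \bar{x})$   $(x - \bar{x})^2$      $y$    $(y - \bar{y})$   $(y - \bar{y})^2$
10     -2      4              9     -1      1
12      0      0             11      1      1
13      1      1             10      0      0
11     -1      1             13      3      9
15      3      9              9     -1      1
 9     -3      9              8     -2      4
12      0      0             10      0      0
14      2      4
$\sum x = 96$   $\sum (x - \bar{x})^2 = 28$      $\sum y = 70$   $\sum (y - \bar{y})^2 = 16$

$\bar{x} = \dfrac{96}{8} = 12$, $\bar{y} = \dfrac{70}{7} = 10$

$S^2 = \dfrac{28 + 16}{8 + 7 - 2} = 3.38$, $S = 1.84$

$t = \dfrac{12 - 10}{1.84\sqrt{\dfrac{1}{8} + \dfrac{1}{7}}} = 2.10$

Table value of $t$ at 5% level of significance for 13 degrees of freedom is 2.16

Calculated value of $|t|$ < table value of $t$
Hence the hypothesis ($H_0$) is accepted that $\mu_1 = \mu_2$, i.e. the difference between the
means of the two samples is not significant.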
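The pooled two-sample calculation of Example 3 can be sketched in Python as follows. The helper name `pooled_t` is hypothetical, the code uses only the standard library, and the table value 2.16 is the one quoted in the example for 13 degrees of freedom.

```python
import math


def pooled_t(x, y):
    """Two-sample t with pooled variance; returns (t, degrees of freedom)."""
    n1, n2 = len(x), len(y)
    x_bar, y_bar = sum(x) / n1, sum(y) / n2
    # Sum of squared deviations of each sample about its own mean.
    ss = sum((v - x_bar) ** 2 for v in x) + sum((v - y_bar) ** 2 for v in y)
    df = n1 + n2 - 2
    pooled_sd = math.sqrt(ss / df)
    t = (x_bar - y_bar) / (pooled_sd * math.sqrt(1 / n1 + 1 / n2))
    return t, df


sample_1 = [10, 12, 13, 11, 15, 9, 12, 14]
sample_2 = [9, 11, 10, 13, 9, 8, 10]
t_stat, df = pooled_t(sample_1, sample_2)
accept_h0 = abs(t_stat) < 2.16  # table value at 5% for 13 degrees of freedom

print(f"t = {t_stat:.3f} on {df} degrees of freedom, accept H0: {accept_h0}")
```

The degrees of freedom returned by the function, $n_1 + n_2 - 2 = 13$, determine which row of the t-table to consult.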
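Example 4 Two types of batteries were tested for their mean life length and the following
results were obtained. Is there a significant difference in the two batteries?

             Sample size    Mean life    Variance
Battery A        10         500 hours      100
Battery B         8         540 hours       81

Solution: Let the null hypothesis be $H_0: \mu_1 = \mu_2$, i.e. there is no significant
difference in the two batteries. Here $n_1 = 10$ and $n_2 = 8$, $\bar{x} = 500$ and $\bar{y} = 540$,
$s_1^2 = 100$, $s_2^2 = 81$

$S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{9(100) + 7(81)}{16} = 91.69$, $S = 9.58$

$t = \dfrac{500 - 540}{9.58\sqrt{\dfrac{1}{10} + \dfrac{1}{8}}} = -8.81$, so $|t| = 8.81$

Table value of $t$ at 5% level of significance for 16 degrees of freedom is 2.12.

Calculated value of $|t|$ > table value of $t$, hence the hypothesis ($H_0$) that
$\mu_1 = \mu_2$ is rejected, i.e. the difference between the two batteries is highly significant.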
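Example 5 A set of 15 observations has mean 68.57 and standard deviation 2.4; another set
of 7 observations gives mean 64.14 and standard deviation 2.7.
Use the t-test to find whether the two sets of data are drawn from populations with the same
mean, if the standard deviations of the two populations are assumed to be equal.
Solution: Let the null hypothesis be $H_0: \mu_1 = \mu_2$,
i.e. there is no significant difference between the two population means.
Here $n_1 = 15$ and $n_2 = 7$, $\bar{x} = 68.57$ and $\bar{y} = 64.14$,
$s_1 = 2.4$, $s_2 = 2.7$

$S^2 = \dfrac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2} = \dfrac{14(5.76) + 6(7.29)}{20} = 6.22$, $S = 2.49$

$t = \dfrac{68.57 - 64.14}{2.49\sqrt{\dfrac{1}{15} + \dfrac{1}{7}}} = 3.88$

Table value of $t$ at 5% level of significance for 20 degrees of freedom is 2.086

Calculated value of $|t|$ > table value of $t$, hence the hypothesis ($H_0$) that
$\mu_1 = \mu_2$ is rejected, i.e. the difference between the two means is significant.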
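Examples 4 and 5 start from summary statistics rather than raw data. The sketch below assumes that the quoted variances and standard deviations use the divisor $n - 1$, so that the pooled variance is $\left[(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2\right]/(n_1 + n_2 - 2)$; this convention is an assumption, and the function name `pooled_t_from_summary` is hypothetical. If the figures were instead computed with divisor $n$, the numerator would be $n_1 s_1^2 + n_2 s_2^2$ and the value of $t$ would change slightly.

```python
import math


def pooled_t_from_summary(mean1, var1, n1, mean2, var2, n2):
    """Pooled two-sample t from means, variances (divisor n - 1 assumed) and sizes."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * var1 + (n2 - 1) * var2) / df
    t = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return t, df


# Example 4: batteries A and B (variances given directly).
t_batteries, df_batteries = pooled_t_from_summary(500, 100, 10, 540, 81, 8)

# Example 5: standard deviations 2.4 and 2.7 are squared to obtain variances.
t_sets, df_sets = pooled_t_from_summary(68.57, 2.4 ** 2, 15, 64.14, 2.7 ** 2, 7)

reject_batteries = abs(t_batteries) > 2.12   # table value for 16 degrees of freedom
reject_sets = abs(t_sets) > 2.086            # table value for 20 degrees of freedom

print(f"Example 4: t = {t_batteries:.2f} (df {df_batteries}), reject H0: {reject_batteries}")
print(f"Example 5: t = {t_sets:.2f} (df {df_sets}), reject H0: {reject_sets}")
```

In both examples the statistic lies far beyond the table value, so the decision to reject does not hinge on which divisor convention is used.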
Result Testing difference between two dependent samples or paired observations
A t-test can be used efficiently to compare two samples from the same population for
some treatment effect; for instance, to compare the efficacy of two drugs on a population
or to study the effect of coaching on some students.

Compute the statistic $t = \dfrac{\bar{d} - \mu}{s/\sqrt{n}}$; where $\mu$ denotes the targeted value and is zero for testing

equal means, and $s^2 = \dfrac{\sum (d - \bar{d})^2}{n - 1}$, where $(d - \bar{d})$ is the deviation from the mean difference.

Example 6 A dietitian opts to try out a new type of diet program on ten overweight girls
for 2 months. He aims to make them lose 6 kg on average, and records their weights
before and after the diet program. Use the 0.05 significance level to test whether this
special diet program helped or not.

Before weights   65  77  99  86  84  93  59  72  69  103
After weights    63  72  92  80  80  87  57  67  64   95

Table value of $t$ at 5% level of significance for 9 degrees of freedom is 2.26.
Solution: Let $H_0$: the average weight loss caused by the diet program is 6 kg.
Calculating deviations from the mean weight loss:
Girls   Weight difference ($d$) = after - before   $d - \bar{d}$   $(d - \bar{d})^2$
1       -2      +3      9
2       -5       0      0
3       -7      -2      4
4       -6      -1      1
5       -4      +1      1
6       -6      -1      1
7       -2      +3      9
8       -5       0      0
9       -5       0      0
10      -8      -3      9
$\sum d = -50$, $\bar{d} = \dfrac{-50}{10} = -5$, $\sum (d - \bar{d})^2 = 34$

Standard deviation of the differences $s = \sqrt{\dfrac{\sum (d - \bar{d})^2}{n - 1}} = \sqrt{\dfrac{34}{9}} = 1.94$

Statistic $t = \dfrac{\bar{d} - \mu}{s/\sqrt{n}} = \dfrac{-5 - (-6)}{1.94/\sqrt{10}} = 1.63$, where targeted weight loss ($\mu$) $= -6$ kg

Calculated value of $|t|$ < table value of $t$

Hence the hypothesis ($H_0$) is accepted that the average weight loss caused by the diet
program is 6 kg.
Example 7 Two laboratories carried out independent estimates of lead content (in mg) in
noodles of a certain brand. A sample was taken from each batch and halved, and the separate
halves were tested in the two laboratories to obtain the following results:
Batch Number   1  2  3  4  5  6  7  8  9  10
Lab A          9  8  8  4  7  7  9  6  6   6
Lab B          7  8  7  3  8  6  9  4  7   8
Does the testing suggest the same average lead content in the brand?
Solution: $H_0$: the average lead contents found by the two laboratories are equal.
Calculating deviations from the mean difference:

Batch   Difference ($d$) = Lab A - Lab B   $d - \bar{d}$   $(d - \bar{d})^2$
1        2       1.7     2.89
2        0      -0.3     0.09
3        1       0.7     0.49
4        1       0.7     0.49
5       -1      -1.3     1.69
6        1       0.7     0.49
7        0      -0.3     0.09
8        2       1.7     2.89
9       -1      -1.3     1.69
10      -2      -2.3     5.29
$\sum d = 3$, $\bar{d} = \dfrac{3}{10} = 0.3$, $\sum (d - \bar{d})^2 = 16.1$

Standard deviation of the differences $s = \sqrt{\dfrac{16.1}{9}} = 1.34$

Statistic $t = \dfrac{\bar{d} - \mu}{s/\sqrt{n}} = \dfrac{0.3 - 0}{1.34/\sqrt{10}} = 0.71$, with $\mu = 0$ as we are testing for equal means of the two lab tests

Table value of $t$ at 5% level of significance for 9 degrees of freedom is 2.26.

Calculated value of $|t|$ < table value of $t$ at 9 degrees of freedom at 5% level of
significance, hence the hypothesis ($H_0$) is accepted that the average lead contents of the
two samples are equal.
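The paired calculations of Examples 6 and 7 follow the same pattern and can be sketched with one small function. The name `paired_t` and its `target` parameter are illustrative assumptions; differences are taken as second minus first, and the standard deviation of the differences uses the divisor $n - 1$.

```python
import math


def paired_t(first, second, target=0.0):
    """Paired t for differences d = second - first against a targeted mean difference."""
    d = [b - a for a, b in zip(first, second)]
    n = len(d)
    d_bar = sum(d) / n
    s = math.sqrt(sum((v - d_bar) ** 2 for v in d) / (n - 1))
    return (d_bar - target) / (s / math.sqrt(n)), n - 1


# Example 6: weights before and after the diet; the target is a change of -6 kg.
before = [65, 77, 99, 86, 84, 93, 59, 72, 69, 103]
after = [63, 72, 92, 80, 80, 87, 57, 67, 64, 95]
t_diet, df_diet = paired_t(before, after, target=-6)

# Example 7: d = Lab A - Lab B, tested against a mean difference of zero.
lab_a = [9, 8, 8, 4, 7, 7, 9, 6, 6, 6]
lab_b = [7, 8, 7, 3, 8, 6, 9, 4, 7, 8]
t_lead, df_lead = paired_t(lab_b, lab_a)

T_TABLE = 2.26  # table value at 5% for 9 degrees of freedom
print(f"diet: t = {t_diet:.3f}, accept H0: {abs(t_diet) < T_TABLE}")
print(f"lead: t = {t_lead:.3f}, accept H0: {abs(t_lead) < T_TABLE}")
```

Because the observations are paired, the degrees of freedom are $n - 1 = 9$ (the number of pairs minus one), not $n_1 + n_2 - 2$.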
Result Testing significance of population correlation coefficient from sample
coefficient of correlation
Let $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$, having coefficient of correlation $r$, be a random
sample from a bivariate frequency distribution having individual means $\mu_1$; $\mu_2$ and
standard deviations $\sigma_1$; $\sigma_2$. Then, to test the hypothesis that the population
correlation coefficient ($\rho$) is zero:

Compute the statistic $t = \dfrac{r\sqrt{n - 2}}{\sqrt{1 - r^2}}$, where $n$ is the sample size

The hypothesis is accepted if the calculated value of $|t|$ is less than the tabulated value
of $t$ at $(n - 2)$ degrees of freedom for the specified significance level.
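As a rough illustration of this test, the sketch below evaluates $t = r\sqrt{n - 2}/\sqrt{1 - r^2}$ and also inverts the formula to find the smallest $|r|$ that would be judged significant for a given table value. The value $r = 0.5$ is a purely hypothetical input chosen for illustration, not a figure from the text; the sample size 18 and the table value 2.12 (16 degrees of freedom) are assumptions chosen for the illustration, and the function names are illustrative.

```python
import math


def correlation_t(r, n):
    """Return (t, degrees of freedom) for testing rho = 0 from a sample correlation r."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r), n - 2


def critical_r(t_table, n):
    """Smallest |r| significant at the level of t_table: solve t = r*sqrt(n-2)/sqrt(1-r^2)."""
    return t_table / math.sqrt(t_table ** 2 + n - 2)


n = 18
T_TABLE = 2.12                            # table value of t for 16 degrees of freedom
t_stat, df = correlation_t(0.5, n)        # r = 0.5 is a hypothetical sample value
r_min = critical_r(T_TABLE, n)

print(f"t = {t_stat:.3f} on {df} df; |r| must exceed {r_min:.3f} to be significant")
```

The second function makes the dependence on sample size explicit: with only 18 pairs, a sample correlation has to be fairly large in magnitude before the hypothesis $\rho = 0$ is rejected.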
Example8 A random sample of size 18 from a bivariate population gave correlation
coefficient . Does this indicate the existence of correlation in the population?
Solution: Let $H_0$: the population correlation coefficient ($\rho$) is zero, i.e. there is no
correlation between the population variables.

Table value of $t$ at 16 degrees of freedom is 2.12

Calculated value of $t$ is less than the tabulated value of $t$ at 16 degrees of freedom,
hence the hypothesis is accepted that there is no correlation between the population
variables.
6.3 Snedecor’s F-test for Testing Equality of Two Population Variances
An F-test is used to test whether the variances of two populations are equal. It can be a
one-tailed or a two-tailed test. The one-tailed version tests in only one direction, i.e.
whether the variance of the first population is greater than (or, alternatively, less than)
that of the second. The two-tailed version tests the hypothesis that the variances are not
equal, without specifying which is the greater.

Let two independent random samples ($x_1, x_2, \ldots, x_{n_1}$); ($y_1, y_2, \ldots, y_{n_2}$) having standard

deviations $s_1$; $s_2$ be drawn from two normal populations.

Snedecor defined the statistic $F = \dfrac{s_1^2}{s_2^2}$, where $s_1^2 = \dfrac{\sum (x - \bar{x})^2}{n_1 - 1}$ and $s_2^2 = \dfrac{\sum (y - \bar{y})^2}{n_2 - 1}$, for testing equality of two population

variances. Here the greater of the two variances $s_1^2$ and $s_2^2$ is to be taken in the numerator,
and if $n_1$ corresponds to the greater variance, then the degrees of freedom are $(n_1 - 1, n_2 - 1)$.
If the calculated value of $F$ is less than the table value of $F$ with $(n_1 - 1, n_2 - 1)$ degrees
of freedom at the given level of significance, the null hypothesis that ‘the two samples might
have been drawn from two normal populations with the same variance’ is accepted.
Example 9 Two samples of sizes 9 and 8 give the sums of squares of deviations from their
respective means equal to 160 square inches and 91 square inches respectively. Can
they be regarded as drawn from two normal populations with the same variance?
(Value of $F$ for 8 and 7 degrees of freedom is 3.73)
Solution: Let $H_0$: the two samples have been drawn from two normal populations with the
same variance.

Given $n_1 = 9$, $n_2 = 8$, $\sum (x - \bar{x})^2 = 160$, $\sum (y - \bar{y})^2 = 91$

$s_1^2 = \dfrac{160}{8} = 20$, $s_2^2 = \dfrac{91}{7} = 13$, $F = \dfrac{s_1^2}{s_2^2} = \dfrac{20}{13} = 1.54$

Table value of $F$ for (8, 7) degrees of freedom is 3.73. The calculated value of $F$ is
less than the table value of $F$ at (8, 7) degrees of freedom. Hence $H_0$ is
accepted, i.e. the two samples may be regarded as drawn from two normal populations
with the same variance.
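The variance-ratio calculation of Example 9 can be sketched as below. The function name `f_statistic` is hypothetical; it takes the sums of squared deviations and the sample sizes, places the larger variance estimate in the numerator, and returns the matching pair of degrees of freedom. The table value 3.73 is the one quoted in the example.

```python
def f_statistic(ss1, n1, ss2, n2):
    """Return (F, (numerator df, denominator df)) with the larger variance on top."""
    var1 = ss1 / (n1 - 1)
    var2 = ss2 / (n2 - 1)
    if var1 >= var2:
        return var1 / var2, (n1 - 1, n2 - 1)
    return var2 / var1, (n2 - 1, n1 - 1)


# Example 9: sums of squares 160 and 91 from samples of sizes 9 and 8.
f_value, df = f_statistic(160, 9, 91, 8)
accept_h0 = f_value < 3.73  # table value of F for (8, 7) degrees of freedom

print(f"F = {f_value:.3f} with df {df}, accept H0: {accept_h0}")
```

Returning the degrees of freedom together with $F$ guards against looking up the table with the two values in the wrong order when the second sample has the larger variance.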
Example 10 Show how we can use Student’s t-test and Snedecor’s F-test to decide
whether the following two samples have been drawn from the same normal
population. Which of the two tests would you apply first and why?

             Size    Mean    Sum of squares of deviations from the mean
Sample I       9      68      36
Sample II     10      69      42

Given that $F_{0.05}(9, 8) = 3.39$, $t_{0.05}(17) = 2.11$
Solution: $H_0$: the two samples have been drawn from the same normal population.
To test $H_0$ using Student’s t-test, the population variances should be the same; we can test
that the two samples have been drawn from two normal populations with the same variance
using Snedecor’s F-test. Snedecor’s F-test should therefore be applied first.
For the two samples I and II, say ($x_1, x_2, \ldots, x_{n_1}$); ($y_1, y_2, \ldots, y_{n_2}$), with $n_1 = 9$ and $n_2 = 10$:

$s_1^2 = \dfrac{36}{8} = 4.5$, $s_2^2 = \dfrac{42}{9} = 4.67$, $F = \dfrac{s_2^2}{s_1^2} = \dfrac{4.67}{4.5} = 1.04$ (since $s_2^2 > s_1^2$)

Table value of $F$ for (9, 8) degrees of freedom is 3.39

Calculated value of $F$ is much less than the table value of $F$ at (9, 8) degrees of
freedom.
Hence the two samples may be regarded as drawn from two normal populations
with the same variance.
Again, to test whether the two population means are the same using the t-test:

Calculate $t = \dfrac{\bar{x} - \bar{y}}{S\sqrt{\dfrac{1}{n_1} + \dfrac{1}{n_2}}}$; where $S^2 = \dfrac{\sum (x - \bar{x})^2 + \sum (y - \bar{y})^2}{n_1 + n_2 - 2} = \dfrac{36 + 42}{17} = 4.59$, $S = 2.14$

$t = \dfrac{68 - 69}{2.14\sqrt{\dfrac{1}{9} + \dfrac{1}{10}}} = -1.02$, so $|t| = 1.02$

Table value of $t$ at 5% level of significance for 17 degrees of freedom is 2.11.

Calculated value of $|t|$ < table value of $t$, hence the hypothesis that the two population
means are the same is accepted.
Thus, using Snedecor’s F-test, we can say that the two samples have been drawn from
two normal populations with the same variance, and using the t-test that the two population
means are the same; hence we can conclude that the hypothesis that the two samples have
been drawn from the same normal population may be accepted.
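The two-stage procedure of Example 10 (F-test for equal variances first, then the pooled t-test for equal means) can be sketched as follows. The function `same_population_check` and its argument names are hypothetical, and the table values 3.39 and 2.11 are those quoted in the example rather than computed.

```python
import math


def same_population_check(n1, mean1, ss1, n2, mean2, ss2, f_table, t_table):
    """Run the F-test, then the pooled t-test; return (F, t, accepted)."""
    var1, var2 = ss1 / (n1 - 1), ss2 / (n2 - 1)
    f_value = max(var1, var2) / min(var1, var2)    # larger variance in the numerator
    pooled_var = (ss1 + ss2) / (n1 + n2 - 2)
    t_value = (mean1 - mean2) / math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    accepted = f_value < f_table and abs(t_value) < t_table
    return f_value, t_value, accepted


# Example 10: Sample I (9, 68, 36) and Sample II (10, 69, 42).
f_value, t_value, accepted = same_population_check(
    9, 68, 36, 10, 69, 42, f_table=3.39, t_table=2.11
)
print(f"F = {f_value:.3f}, t = {t_value:.3f}, same population accepted: {accepted}")
```

The order matters: the pooled t-test assumes a common variance, so its result is only meaningful once the F-test has given no reason to doubt that assumption.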
6.4 Fisher’s Z Test for Testing Significance of Correlation Coefficient for Small
Samples
If $r$ is the correlation coefficient of a sample and $\rho$ is the population coefficient of
correlation, then to test the hypothesis that the given sample has been drawn from the
population whose coefficient of correlation is $\rho$:

Compute the statistic $Z = \dfrac{z - \xi}{1/\sqrt{n - 3}}$, where $z = \dfrac{1}{2}\log_e\dfrac{1 + r}{1 - r}$ and $\xi = \dfrac{1}{2}\log_e\dfrac{1 + \rho}{1 - \rho}$

If the value of $|Z| < 1.96$, the hypothesis is accepted at the 5% level of significance.
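A minimal sketch of the Fisher transformation test is given below. The inputs $r = 0.6$, $\rho = 0.5$ and $n = 20$ are hypothetical values chosen only to exercise the formula and are not taken from any example in the text; the function name `fisher_z_statistic` is likewise an illustrative choice. The transformation $\frac{1}{2}\log_e\frac{1 + r}{1 - r}$ is the inverse hyperbolic tangent, available as `math.atanh`.

```python
import math


def fisher_z_statistic(r, rho, n):
    """Return Z = (z - xi) * sqrt(n - 3), with z = atanh(r) and xi = atanh(rho)."""
    z = math.atanh(r)        # (1/2) * ln((1 + r) / (1 - r))
    xi = math.atanh(rho)     # (1/2) * ln((1 + rho) / (1 - rho))
    return (z - xi) * math.sqrt(n - 3)


# Hypothetical inputs for illustration only.
z_stat = fisher_z_statistic(r=0.6, rho=0.5, n=20)
accept_h0 = abs(z_stat) < 1.96   # 5% level, two-sided normal critical value

print(f"Z = {z_stat:.3f}, accept H0: {accept_h0}")
```

The statistic is compared with 1.96 because, after the transformation, $z$ is approximately normally distributed with standard error $1/\sqrt{n - 3}$.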

Example11 Test the significance of the correlation for a sample of size 20


against hypothetical population coefficient of correlation
Solution: Let : correlation coefficient of population is

Here

Now

Hence the sample may be regarded as coming from population with coefficient of
correlation

Exercise 6

1 A factory makes a machine part with axle diameter of 0.7 inch. A random sample
of 10 parts shows a mean diameter of 0.742 inch with a standard deviation of 0.04
inch. On the basis of this sample would you say that the work is inferior?
Value of $t$ at 5% level of significance for 9 degrees of freedom is 2.262.
2 A random sample of 10 boys had the I.Q. levels: 70, 120, 110, 101, 88, 83, 95, 98,
107 and 100. Does this data support the assumption of population mean I.Q. of 100
at 5% level of significance?
3 In a school the heights of six randomly chosen girls are: 63, 65, 68, 69, 71 and 72
inches and those of nine randomly chosen boys are 61, 62, 65, 66, 69, 70, 71, 72
and 73 inches. Discuss the hypothesis that the girls are taller than boys.
Value of $t$ at 5% level of significance for 13 degrees of freedom is 1.77.
4 A random sample of size 16 has mean 53. The sum of squares of deviations from
the mean is 135. Can the sample be regarded as taken from a population having
mean as 56?
5 A new medicine is given to 12 patients whose B.P. increases by 5, 2, 8, -1, 3, 0, -2,
1, 5, 0, 4, 6 units. Can we conclude that the medicine results in increased blood
pressure?
6 Two samples of different brands were tested for average life; a sample from first
brand of size 7 shows a mean life of 1036 hours with a standard deviation of 40
hours and a sample of size 8 shows a mean life of 1234 hours with a standard
deviation of 36 hours. Is the difference in the two sample means significant to
conclude that the second brand has more life than first brand?
7 A researcher hypothesizes that people who are allowed to sleep for only four hours
will score significantly lower in an objective skills test than people who are allowed
to sleep for eight hours. He selects sixteen participants and randomly assigns them
to one of two groups. In one group he makes participants sleep for eight hours and
in the other group he allows them to sleep only for four hours. The next morning he
administers the skill test to all participants. Scores range from 1-9 with high scores
representing better performance.

Test scores
8 hours sleep group 5 7 5 3 5 3 3 9
4 hours sleep group 8 1 4 6 6 4 1 2
Test the hypothesis, given that
8 A group of 10 rats fed on diet A and another group of 8 rats fed on diet B
recorded the following increase in weights in a week:
Weight gains (grams)
Diet A   5 6 8 1 12 4 3 9 6 10
Diet B   2 3 6 8 10 1 2 8 - -
Does it show superiority of Diet A over that of Diet B?
9 Test runs with 6 models of an experimental engine showed that they operated for
24, 28, 21, 23, 32 and 22 minutes with a gallon of fuel. If the probability of a Type I
error is at most 0.01, is this an evidence against the hypothesis that on average this
kind of engine will operate for at least 27 minutes per gallon on the same fuel?
10 Test whether the following two samples have been drawn from the same normal
population.
             Size    Mean    Sum of squares of deviations from the mean
Sample I      10      15      90
Sample II     12      14      108
Given that ,

Answers
1. which is greater than the table value at 5% significance level, work can
be considered to be inferior.
2. given data supports the mean I.Q. as 100.
3. there is no significant difference between the sample means
4. No
5. No
6. significantly different to conclude that the second brand has
more life than first brand
7. there is no significant difference between the performances of two
groups.
8. No
9. No
10. Yes, the two samples can be considered to be drawn from the same normal
population.
