PSCV Unit-Iii Digital Notes

Confidence limits for the difference of two population proportions
• 95% confidence limits are $(p_1 - p_2) \pm 1.96$ (S.E. of $p_1 - p_2$)
• 99.73% confidence limits are $(p_1 - p_2) \pm 3$ (S.E. of $p_1 - p_2$)
Determination of proper sample size

Sample size for estimating population mean:
$n = \left( \frac{z_{\alpha} \sigma}{E} \right)^2$
where $z_{\alpha}$ – critical value of z at the $\alpha$ level of significance,
$\sigma$ – standard deviation of the population, and
$E$ – maximum sampling error $= |\bar{x} - \mu|$
Sample size for estimating population proportion:
$n = \frac{z_{\alpha}^2 \, P Q}{E^2}$
where $z_{\alpha}$ – critical value of z at the $\alpha$ level of significance,
$P$ – population proportion,
$Q = 1 - P$, and
$E$ – maximum sampling error $= |p - P|$
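The two sample-size formulas above can be sketched in Python as follows. The function names are illustrative, and rounding the result up to the next whole number is an assumption made here so that the maximum error is not exceeded.

```python
import math


def sample_size_for_mean(z, sigma, max_error):
    """Smallest n satisfying z * sigma / sqrt(n) <= max_error."""
    return math.ceil((z * sigma / max_error) ** 2)


def sample_size_for_proportion(z, p, max_error):
    """Smallest n satisfying z * sqrt(p * (1 - p) / n) <= max_error."""
    return math.ceil(z ** 2 * p * (1 - p) / max_error ** 2)


# Example inputs: z = 1.645 (90% confidence), sigma = 60, E = 10
print(sample_size_for_mean(1.645, 60, 10))
# Example inputs: z = 1.96 (95% confidence), P = 0.2, E = 0.05
print(sample_size_for_proportion(1.96, 0.2, 0.05))
```

When P is unknown, a common conservative choice is p = 0.5, which maximises PQ.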
Testing of Hypothesis:
A hypothesis is an assumption or supposition about a population. The decision-making
procedure used to decide whether to accept or reject that assumption is called hypothesis testing.
Def: Statistical Hypothesis: To arrive at a decision about a population on the basis of
sample information, we make assumptions about the population parameters involved. Such an
assumption is called a statistical hypothesis.
Procedure for testing a hypothesis:
Test of Hypothesis involves the following steps:
Step1: Statement of hypothesis :

There are two types of hypothesis:
• Null hypothesis: A definite statement about the population parameter. A null
hypothesis is usually written as a statement of no difference and is denoted by $H_0$.
Ex. $H_0 : \mu = \mu_0$
• Alternative hypothesis: A statement which contradicts the null hypothesis is called the
alternative hypothesis. An alternative hypothesis is usually written as a statement of some
difference and is denoted by $H_1$.
DEPARTMENT OF HUMANITIES & SCIENCES

The form of the alternative hypothesis decides whether the test is two-tailed or
one-tailed, and it depends on the question being asked.
Ex. $H_1 : \mu \neq \mu_0$ (Two-tailed test)
or
$H_1 : \mu > \mu_0$ (Right one-tailed test)
or
$H_1 : \mu < \mu_0$ (Left one-tailed test)
Step 2: Specification of level of significance :

The level of significance (LOS), denoted by $\alpha$, is the probability of rejecting the null
hypothesis $H_0$ when it is in fact true. It is specified before the test is carried out and is
usually taken as 5% (0.05), 1% or 10%. A 5% LOS means that there are about 5 chances in 100
of rejecting $H_0$ when it is true, i.e., we are about 95% confident of making the right
decision when $H_0$ is true. The same interpretation applies to the other levels of significance.
Step 3 : Identification of the test Statistic :

There are several tests of significance, such as z, t, F, etc. Depending on the nature of the
information given in the problem, we select the appropriate test, construct the test
statistic and identify its probability distribution.
Step 4: Critical Region:

The critical region is the set of values of the test statistic for which $H_0$ is rejected.
• Two-tailed test: The critical region under the curve is divided equally between
both tails of the distribution.
If $H_1$ has the ≠ sign, the critical region is divided equally on both sides of the
distribution.

• One-tailed test: The critical region under the curve lies entirely on one side of
the mean.
Left one-tailed test: If $H_1$ has the < sign, the critical region is taken on the left side of the
distribution.
Right one-tailed test: If $H_1$ has the > sign, the critical region is taken on the right side of the
distribution.
Step 5 : Making decision:

The decision to accept or reject $H_0$ is taken by comparing the computed value of the test
statistic with the critical value.
If |calculated value| ≤ critical value, we accept $H_0$; otherwise we reject $H_0$.
Errors of Sampling :
While drawing conclusions about population parameters on the basis of sample results, we
can make two types of errors.
• Type I error: Rejecting $H_0$ when it is true, i.e., the null hypothesis $H_0$ is true but it is
rejected by the test procedure.
• Type II error: Accepting $H_0$ when it is false, i.e., the null hypothesis $H_0$ is false but it is
accepted by the test procedure.

DECISION TABLE
                  H0 is accepted      H0 is rejected
H0 is true        Correct decision    Type I error
H0 is false       Type II error       Correct decision
Problems:
1.If the population is 3,6,9,15,27
a) List all possible samples of size 3 that can be taken without replacement
from finite population
b) Calculate the mean of each of the sampling distribution of means
c) Find the standard deviation of sampling distribution of means
Sol: Mean of the population, $\mu = \frac{3+6+9+15+27}{5} = \frac{60}{5} = 12$
Standard deviation of the population,
$\sigma = \sqrt{\frac{(3-12)^2 + (6-12)^2 + (9-12)^2 + (15-12)^2 + (27-12)^2}{5}}$
$= \sqrt{\frac{81+36+9+9+225}{5}} = \sqrt{\frac{360}{5}} = 8.4853$
a) Sampling without replacement :

The total number of samples without replacement is 𝑁𝐶𝑛 = 5𝐶3 =10
The 10 samples are (3,6,9), (3,6,15), (3,9,15), (3,6,27), (3,9,27), (3,15,27),
(6,9,15), (6,9,27), (6,15,27), (9,15,27)
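b) The means of the 10 samples are 6, 8, 9, 12, 13, 15, 10, 14, 16, 17.
Mean of the sampling distribution of means is
$\mu_{\bar{x}} = \frac{6+8+9+10+12+13+14+15+16+17}{10} = \frac{120}{10} = 12$
c) $\sigma_{\bar{x}}^2 = \frac{(6-12)^2+(8-12)^2+(9-12)^2+(10-12)^2+(12-12)^2+(13-12)^2+(14-12)^2+(15-12)^2+(16-12)^2+(17-12)^2}{10}$
$= \frac{36+16+9+4+0+1+4+9+16+25}{10} = \frac{120}{10} = 12$
∴ $\sigma_{\bar{x}} = \sqrt{12} = 3.464$
Check: $\sigma_{\bar{x}}^2 = \frac{\sigma^2}{n} \cdot \frac{N-n}{N-1} = \frac{72}{3} \cdot \frac{2}{4} = 12$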
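As a cross-check of Problem 1, the sampling distribution can be enumerated directly. The following is a minimal sketch using only the Python standard library; the variable names are illustrative. It also compares the result with the finite population correction formula σ²/n · (N − n)/(N − 1).

```python
import math
from itertools import combinations
from statistics import mean, pvariance

population = [3, 6, 9, 15, 27]
n = 3

# All samples of size 3 drawn without replacement (order ignored)
samples = list(combinations(population, n))
sample_means = [mean(s) for s in samples]

# Mean and variance of the sampling distribution of means
mu_xbar = mean(sample_means)
var_xbar = pvariance(sample_means)
sd_xbar = math.sqrt(var_xbar)

# Finite population correction: sigma^2 / n * (N - n) / (N - 1)
N = len(population)
var_fpc = pvariance(population) / n * (N - n) / (N - 1)

print(len(samples), mu_xbar, var_xbar, round(sd_xbar, 3), var_fpc)
```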
2.A population consist of five numbers 2,3,6,8 and 11. Consider all possible samples of
size two which can be drawn with replacement from this population .Find
a) The mean of the population

b) The standard deviation of the population
c) The mean of the sampling distribution of means and
d) The standard deviation of the sampling distribution of means
Sol: a) Mean of the population is given by
$\mu = \frac{2+3+6+8+11}{5} = \frac{30}{5} = 6$
b) Variance of the population is given by
$\sigma^2 = \sum \frac{(x_i - \mu)^2}{N}$
$= \frac{(2-6)^2 + (3-6)^2 + (6-6)^2 + (8-6)^2 + (11-6)^2}{5}$
$= \frac{16+9+0+4+25}{5} = \frac{54}{5} = 10.8$  ∴ $\sigma = 3.29$
c) Sampling with replacement
The total number of samples with replacement is $N^n = 5^2 = 25$
∴ The list of all possible samples with replacement is
(2,2), (2,3), (2,6), (2,8), (2,11),
(3,2), (3,3), (3,6), (3,8), (3,11),
(6,2), (6,3), (6,6), (6,8), (6,11),
(8,2), (8,3), (8,6), (8,8), (8,11),
(11,2), (11,3), (11,6), (11,8), (11,11)
Now compute the arithmetic mean of each of these 25 samples. This gives the
distribution of the sample means, known as the sampling distribution of means.
The sample means are
2,   2.5, 4,   5,   6.5
2.5, 3,   4.5, 5.5, 7
4,   4.5, 6,   7,   8.5
5,   5.5, 7,   8,   9.5
6.5, 7,   8.5, 9.5, 11
The mean of the sampling distribution of means is the mean of these 25 means:
$\mu_{\bar{x}} = \frac{\text{sum of all the above sample means}}{25} = \frac{150}{25} = 6$
d) The variance of the sampling distribution of means is obtained by subtracting the
mean 6 from each number in the sampling distribution of means, squaring the result,
adding all 25 numbers thus obtained and dividing by 25.
$\sigma_{\bar{x}}^2 = \frac{(2-6)^2 + (2.5-6)^2 + (4-6)^2 + (5-6)^2 + \dots + (11-6)^2}{25} = \frac{135}{25} = 5.4$
∴ $\sigma_{\bar{x}} = \sqrt{5.4} = 2.32$
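The 25 samples of Problem 2 can also be generated programmatically. This sketch (standard library only, illustrative variable names) enumerates ordered samples drawn with replacement and checks that the variance of the sample means agrees with σ²/n.

```python
import math
from itertools import product
from statistics import mean, pvariance

population = [2, 3, 6, 8, 11]
n = 2

# All ordered samples of size 2 drawn with replacement
samples = list(product(population, repeat=n))
sample_means = [mean(s) for s in samples]

mu_xbar = mean(sample_means)          # mean of the sampling distribution
var_xbar = pvariance(sample_means)    # variance of the sampling distribution
sd_xbar = math.sqrt(var_xbar)

# For sampling with replacement the variance should equal sigma^2 / n
var_theory = pvariance(population) / n

print(len(samples), mu_xbar, round(var_xbar, 4), round(sd_xbar, 2))
```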
3. When a sample is taken from an infinite population, what happens to the standard
error of the mean if the sample size is decreased from 800 to 200?
Sol: The standard error of the mean $= \frac{\sigma}{\sqrt{n}}$
Let the initial sample size be $n_1 = 800$.
Then $\text{S.E.}_1 = \frac{\sigma}{\sqrt{800}} = \frac{\sigma}{20\sqrt{2}}$
When the sample size is reduced to $n_2 = 200$,
$\text{S.E.}_2 = \frac{\sigma}{\sqrt{200}} = \frac{\sigma}{10\sqrt{2}}$
∴ $\text{S.E.}_2 = \frac{\sigma}{10\sqrt{2}} = 2 \left( \frac{\sigma}{20\sqrt{2}} \right) = 2 \, (\text{S.E.}_1)$
Hence, if the sample size is reduced from 800 to 200, the S.E. of the mean is multiplied by 2.
4.The variance of a population is 2 . The size of the sample collected from the
population is 169. What is the standard error of mean
Sol: n = size of the sample = 169
$\sigma$ = S.D. of the population $= \sqrt{\text{Variance}} = \sqrt{2}$
Standard error of the mean $= \frac{\sigma}{\sqrt{n}} = \frac{\sqrt{2}}{\sqrt{169}} = \frac{1.414}{13} = 0.109$
5.The mean height of students in a college is 155cms and standard deviation is 15 . What
is the probability that the mean height of 36 students is less than 157 cms.
Sol: $\mu$ = mean of the population
= mean height of students of the college = 155 cms
$\sigma$ = S.D. of the population = 15 cms
n = sample size = 36
$\bar{x}$ = mean of the sample = 157 cms
Now $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{157 - 155}{15/\sqrt{36}} = \frac{2 \times 6}{15} = \frac{12}{15} = 0.8$
∴ $P(\bar{x} \le 157) = P(z < 0.8) = 0.5 + P(0 \le z \le 0.8)$
= 0.5 + 0.2881 = 0.7881
Thus the probability that the mean height of 36 students is less than 157 cms is 0.7881.
6. A random sample of size 100 is taken from a population with $\sigma$ = 5.1. Given that the
sample mean is $\bar{x}$ = 21.6, construct 95% confidence limits for the population mean.
Sol: Given $\bar{x}$ = 21.6, $z_{\alpha/2}$ = 1.96, n = 100, $\sigma$ = 5.1
∴ Confidence interval $= \left( \bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} ,\; \bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$
$\bar{x} - z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 21.6 - \frac{1.96 \times 5.1}{10} = 20.6$
$\bar{x} + z_{\alpha/2} \frac{\sigma}{\sqrt{n}} = 21.6 + \frac{1.96 \times 5.1}{10} = 22.6$
Hence (20.6, 22.6) is the 95% confidence interval for the population mean $\mu$.
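The interval in Problem 6 can be reproduced with a short Python sketch. The helper name `mean_confidence_interval` is illustrative; it uses `statistics.NormalDist` from the standard library to obtain the critical value instead of a printed table, so the last digits may differ slightly from table-based working.

```python
import math
from statistics import NormalDist


def mean_confidence_interval(xbar, sigma, n, confidence=0.95):
    """Large-sample confidence interval for the mean, sigma known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)  # about 1.96 for 95%
    margin = z * sigma / math.sqrt(n)               # maximum error E
    return xbar - margin, xbar + margin


low, high = mean_confidence_interval(21.6, 5.1, 100)
print(round(low, 1), round(high, 1))
```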
7. It is desired to estimate the mean time of continuous use until an answering machine
will first require service. If it can be assumed that $\sigma$ = 60 days, how large a sample is
needed so that one will be able to assert with 90% confidence that the sample mean is
off by at most 10 days?
Sol: We have maximum error E = 10 days, $\sigma$ = 60 days and $z_{\alpha/2}$ = 1.645
∴ $n = \left[ \frac{z_{\alpha/2} \, \sigma}{E} \right]^2 = \left[ \frac{1.645 \times 60}{10} \right]^2 = (9.87)^2 = 97.4$
Rounding up to the next whole number so that the error does not exceed 10 days, n = 98.
8.A random sample of size 64 is taken from a normal population with 𝝁 = 𝟓𝟏. 𝟒 and 𝝈 =
6.8.What is the probability that the mean of the sample will a) exceed 52.9 b) fall
between 50.5 and 52.3 c) be less than 50.6
Sol: Given n = the size of the sample = 64
𝜇 = the mean of the population = 51.4
𝜎 = the S.D of the population = 6.8
a) $P(\bar{x} \text{ exceeds } 52.9) = P(\bar{x} > 52.9)$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{52.9 - 51.4}{6.8/\sqrt{64}} = \frac{1.5}{0.85} = 1.76$
∴ $P(\bar{x} > 52.9) = P(z > 1.76)$
= 0.5 – P(0 < z < 1.76)
= 0.5 – 0.4608 = 0.0392
b) $P(\bar{x} \text{ falls between } 50.5 \text{ and } 52.3)$,
i.e., $P(50.5 < \bar{x} < 52.3) = P(\bar{x}_1 < \bar{x} < \bar{x}_2)$
$z_1 = \frac{\bar{x}_1 - \mu}{\sigma/\sqrt{n}} = \frac{50.5 - 51.4}{0.85} = -1.06$
$z_2 = \frac{\bar{x}_2 - \mu}{\sigma/\sqrt{n}} = \frac{52.3 - 51.4}{0.85} = 1.06$
$P(50.5 < \bar{x} < 52.3)$ = P(-1.06 < z < 1.06)
= P(-1.06 < z < 0) + P(0 < z < 1.06)
= P(0 < z < 1.06) + P(0 < z < 1.06)
= 2(0.3554) = 0.7108
c) $P(\bar{x} \text{ will be less than } 50.6) = P(\bar{x} < 50.6)$
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{50.6 - 51.4}{6.8/\sqrt{64}} = \frac{-0.8}{0.85} = -0.94$
∴ P(z < -0.94) = 0.5 - P(-0.94 < z < 0)
= 0.5 - P(0 < z < 0.94) = 0.50 - 0.3264
= 0.1736
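The three probabilities of Problem 8 can be checked without a normal table. The sketch below models the sample mean as a normal variable with mean μ and standard deviation σ/√n using `statistics.NormalDist`; the helper name is illustrative. Because z is not rounded to two decimals here, the results may differ from the table-based answers in the third decimal place.

```python
import math
from statistics import NormalDist


def sampling_distribution(mu, sigma, n):
    """Distribution of the sample mean: N(mu, sigma / sqrt(n))."""
    return NormalDist(mu, sigma / math.sqrt(n))


xbar = sampling_distribution(mu=51.4, sigma=6.8, n=64)

p_exceeds = 1 - xbar.cdf(52.9)               # a) P(xbar > 52.9)
p_between = xbar.cdf(52.3) - xbar.cdf(50.5)  # b) P(50.5 < xbar < 52.3)
p_below = xbar.cdf(50.6)                     # c) P(xbar < 50.6)

print(round(p_exceeds, 4), round(p_between, 4), round(p_below, 4))
```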

9. The mean of a certain normal population is equal to the standard error of the mean of
the samples of size 64 from that distribution. Find the probability that the mean of a
sample of size 36 will be negative.
Sol: The standard error of the mean $= \frac{\sigma}{\sqrt{n}}$
For samples of size n = 64,
given mean $\mu$ = standard error of the mean of the samples, so
$\mu = \frac{\sigma}{\sqrt{64}} = \frac{\sigma}{8}$
For a sample of size n = 36 we have
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - \sigma/8}{\sigma/6} = \frac{6\bar{x}}{\sigma} - \frac{3}{4}$
$\bar{x}$ is negative when z < -0.75.
$P(\bar{x} < 0) = P(z < -0.75) = 0.50 - P(0 < z < 0.75)$
$= 0.50 - \int_0^{0.75} \phi(z)\,dz = 0.50 - 0.2734$
= 0.2266
10. The guaranteed average life of a certain type of electric bulb is 1500 hrs with a S.D.
of 120 hrs. It is decided to sample the output so as to ensure that 95% of bulbs do not fall
short of the guaranteed average by more than 2%. What will be the minimum sample
size?
Sol: Let n be the size of the sample.
The guaranteed mean is 1500 hrs.
We do not want the mean of the sample to fall short of the guaranteed mean by more than
2% of 1500, i.e., 30 hrs.
So 1500 – 30 = 1470
∴ $\bar{x} > 1470$
∴ $|z| = \left| \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} \right| = \left| \frac{1470 - 1500}{120/\sqrt{n}} \right| = \frac{\sqrt{n}}{4}$
From the given condition, the area under the normal curve to the left of $\frac{\sqrt{n}}{4}$
should be 0.95.
∴ The area between 0 and $\frac{\sqrt{n}}{4}$ is 0.45.
(Bulbs whose life is above the guaranteed life are not of concern here.)
∴ $\frac{\sqrt{n}}{4} = 1.65$, i.e., $\sqrt{n} = 6.6$, so n = 43.56
∴ n = 44 (rounded up)

11.A normal population has a mean of 0.1 and standard deviation of 2.1 . Find the
probability that mean of a sample of size 900 will be negative .
Sol: Given $\mu$ = 0.1, $\sigma$ = 2.1 and n = 900
The standard normal variate is
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 0.1}{2.1/\sqrt{900}} = \frac{\bar{x} - 0.1}{0.07}$
∴ $\bar{x} = 0.1 + 0.07 z$ where z ~ N(0, 1)
∴ The required probability that the sample mean is negative is given by
$P(\bar{x} < 0) = P(0.1 + 0.07 z < 0)$
= P(0.07 z < -0.1)
= P(z < -0.1/0.07)
= P(z < -1.43)
= 0.50 – P(0 < z < 1.43)
= 0.50 – 0.4236 = 0.0764
12. In a study of automobile insurance, a random sample of 80 body repair costs had a
mean of Rs 472.36 and a S.D. of Rs 62.35. If $\bar{x}$ is used as a point estimate of the true
average repair cost, with what confidence can we assert that the maximum error
doesn't exceed Rs 10?
Sol: Size of the random sample, n = 80
Mean of the random sample, $\bar{x}$ = Rs 472.36
Standard deviation, $\sigma$ = Rs 62.35
Maximum error of estimate, $E_{max}$ = Rs 10
We have $E_{max} = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
i.e., $Z_{\alpha/2} = \frac{E_{max} \sqrt{n}}{\sigma} = \frac{10 \sqrt{80}}{62.35} = \frac{89.4427}{62.35} = 1.4345$
∴ $Z_{\alpha/2}$ = 1.43
The area between z = 0 and z = 1.43 from the tables is 0.4236.
∴ $\frac{1 - \alpha}{2} = 0.4236$, i.e., $1 - \alpha = 0.8472$
∴ confidence = $(1 - \alpha)\,100\%$ = 84.72%
Hence we are 84.72% confident that the maximum error does not exceed Rs 10.
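The reasoning of Problem 12 (going from a maximum error back to a confidence level) can be sketched as below. The function name is illustrative. The confidence is the central area between −z and +z under the standard normal curve, i.e. 2Φ(z) − 1; since z is not rounded to 1.43, the result can differ slightly from the table-based figure.

```python
import math
from statistics import NormalDist


def confidence_for_max_error(max_error, sigma, n):
    """Confidence level (1 - alpha) with which |xbar - mu| <= max_error."""
    z = max_error * math.sqrt(n) / sigma   # z_{alpha/2} implied by E
    return 2 * NormalDist().cdf(z) - 1     # area between -z and +z


conf = confidence_for_max_error(max_error=10, sigma=62.35, n=80)
print(f"{conf:.2%}")
```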
13.If we can assert with 95% that the maximum error is 0.05 and P = 0.2 find the size of
the sample.
Sol: Given P = 0.2, E = 0.05
We have Q = 0.8 and $Z_{\alpha/2}$ = 1.96 (5% LOS)
We know that the maximum error is $E = Z_{\alpha/2} \sqrt{\frac{PQ}{n}}$
⇒ $0.05 = 1.96 \sqrt{\frac{0.2 \times 0.8}{n}}$
⇒ Sample size, $n = \frac{0.2 \times 0.8 \times (1.96)^2}{(0.05)^2} = 245.86$, i.e., n = 246
14. The mean and standard deviation of a population are 11,795 and 14,054 respectively.
What can one assert with 95% confidence about the maximum error if $\bar{x}$ = 11,795 and
n = 50? Also construct a 95% confidence interval for the true mean.
Sol: Here mean of the population, $\mu$ = 11795
S.D. of the population, $\sigma$ = 14054
$\bar{x}$ = 11795
n = sample size = 50, maximum error $= Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}}$
$Z_{\alpha/2}$ for 95% confidence = 1.96
Max. error, $E = Z_{\alpha/2} \cdot \frac{\sigma}{\sqrt{n}} = 1.96 \cdot \frac{14054}{\sqrt{50}} = 3896$ (approximately)
∴ Confidence interval $= \left( \bar{x} - Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} ,\; \bar{x} + Z_{\alpha/2} \frac{\sigma}{\sqrt{n}} \right)$
= (11795 - 3896, 11795 + 3896)
= (7899, 15691)
15.Find 95% confidence limits for the mean of a normally distributed population from
which the following sample was taken 15, 17 , 10 ,18 ,16 ,9, 7, 11, 13 ,14.
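Sol: We have $\bar{x} = \frac{15+17+10+18+16+9+7+11+13+14}{10} = \frac{130}{10} = 13$
$S^2 = \sum \frac{(x_i - \bar{x})^2}{n - 1}$
$= \frac{1}{9} [(15-13)^2 + (17-13)^2 + (10-13)^2 + (18-13)^2 + (16-13)^2 +
(9-13)^2 + (7-13)^2 + (11-13)^2 + (13-13)^2 + (14-13)^2]$
$= \frac{1}{9} [4 + 16 + 9 + 25 + 9 + 16 + 36 + 4 + 0 + 1] = \frac{120}{9} = \frac{40}{3}$
Since $Z_{\alpha/2}$ = 1.96, we have
$Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = 1.96 \cdot \frac{\sqrt{40}}{\sqrt{10} \cdot \sqrt{3}} = 2.26$
∴ Confidence limits are $\bar{x} \pm Z_{\alpha/2} \cdot \frac{s}{\sqrt{n}} = 13 \pm 2.26$, i.e., (10.74, 15.26)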
16. A random sample of 100 teachers in a large metropolitan area revealed a mean weekly
salary of Rs. 487 with a standard deviation of Rs. 48. With what degree of confidence can
we assert that the average weekly salary of all teachers in the metropolitan area is between
472 and 502?
Sol: Given mean = 487, $\sigma$ = 48, n = 100
$Z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{\bar{x} - 487}{48/\sqrt{100}} = \frac{\bar{x} - 487}{4.8}$
The standard variable corresponding to Rs. 472 is
$Z_1 = \frac{472 - 487}{4.8} = -3.125$
The standard variable corresponding to Rs. 502 is
$Z_2 = \frac{502 - 487}{4.8} = 3.125$
Let $\bar{x}$ be the mean salary of the teachers. Then
$P(472 < \bar{x} < 502)$ = P(-3.125 < z < 3.125)
= 2 P(0 < z < 3.125)
$= 2 \int_0^{3.125} \phi(z)\,dz$
= 2(0.4991) = 0.9982
Thus we can make the assertion with 99.82% confidence.

TUTORIAL QUESTIONS
1.If the population is 3,6,9,15,27

a)List all possible samples of size 3 that can be taken without replacement from finite
population
b)Calculate the mean of each of the sampling distribution of means
c)Find the standard deviation of sampling distribution of means
2. A population consist of five numbers 2,3,6,8 and 11. Consider all possible samples of size
two which can be drawn with replacement from this population .Find
a)The mean of the population
b)The standard deviation of the population
c)The mean of the sampling distribution of means and
d) The standard deviation of the sampling distribution of means
3.A random sample of size 100 is taken from a population with 𝜎 = 5.1 . Given that the
sample mean is 𝑥̅ = 21.6 Construct a 95% confidence limits for the population mean .
4.A normal population has a mean of 0.1 and standard deviation of 2.1 . Find the probability
that mean of a sample of size 900 will be negative .
5.A random sample of size 64 is taken from a normal population with 𝜇 = 51.4
and 𝜎 = 6.8.What is the probability that the mean of the sample will
a) exceed 52.9
b) fall between 50.5 and 52.3 c) be less than 50.6.

ASSIGNMENT QUESTIONS
1. A manufacturer claimed that at least 95% of the equipment which he supplied to a factory
conformed to specifications. An examination of a sample of 200 pieces of equipment
revealed that 180 were faulty. Test his claim at 5% and 1% LOS.
2. Write about i) critical region ii) one tailed and two tailed test
3. Define sample. Explain the different methods that are involved in selecting the sample.
4. Explain about i) Type I error ii) Type II error
5.a)Explain the five step procedure for testing of hypothesis
b) Explain about i) point estimation ii) interval estimation

UNIT 5
STATISTICAL INFERENCES

OBJECTIVE
To make inferences about a population from sample data
(large and small samples) using probability theory
OUTCOME
Draw statistical inference using samples of a given size which is
taken from a population and to apply statistical methods for analyzing
experimental data.

STATISTICAL INFERENCES
Large Samples: A random sample of size n > 30 is called a large sample.
Applications of Large Samples
Test of Significance of a Single Mean

Let a random sample of size n be drawn, and let $\bar{x}$ be the mean of the sample and $\mu$ the
population mean.
1. Null hypothesis $H_0$: There is no significant difference between the population mean and
the given value $\mu_0$,
i.e., $H_0 : \mu = \mu_0$
2. Alternative hypothesis $H_1$: There is a significant difference between the population
mean and the given value,
i.e.,
a) $H_1 : \mu \neq \mu_0$ (Two-tailed)
b) $H_1 : \mu > \mu_0$ (Right one-tailed)
c) $H_1 : \mu < \mu_0$ (Left one-tailed)
3. Level of significance: Set the LOS $\alpha$
4. Test statistic: $z_{cal} = \frac{\bar{x} - \mu_0}{\sigma/\sqrt{n}}$ (or) $z_{cal} = \frac{\bar{x} - \mu_0}{s/\sqrt{n}}$ when $\sigma$ is unknown
5. Decision/conclusion: If $|z_{cal}| < z_{\alpha}$, accept $H_0$; otherwise reject $H_0$
CRITICAL VALUES OF Z (rejection regions)
LOS α         1%             5%             10%
μ ≠ μ0        |Z| > 2.58     |Z| > 1.96     |Z| > 1.645
μ > μ0        Z > 2.33       Z > 1.645      Z > 1.28
μ < μ0        Z < -2.33      Z < -1.645     Z < -1.28
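NOTE: Confidence limits for the mean of the population corresponding to the given sample:
$\mu = \bar{X} \pm Z_{\alpha/2}$ (S.E. of $\bar{X}$), i.e.,
$\mu = \bar{X} \pm Z_{\alpha/2} \left( \frac{\sigma}{\sqrt{n}} \right)$ (or) $\mu = \bar{X} \pm Z_{\alpha/2} \left( \frac{s}{\sqrt{n}} \right)$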
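The five-step test for a single mean can be sketched in Python as follows. The function name and the `tail` argument are illustrative choices, not part of the notes. Critical values come from `statistics.NormalDist().inv_cdf` rather than the table, so they match the tabulated 1.96, 1.645, 2.33, etc. only up to rounding.

```python
import math
from statistics import NormalDist


def z_test_single_mean(xbar, mu0, sd, n, alpha=0.05, tail="two"):
    """Return (z_cal, reject_H0) for the large-sample test of H0: mu = mu0.

    tail: "two" for H1: mu != mu0, "right" for mu > mu0, "left" for mu < mu0.
    """
    z_cal = (xbar - mu0) / (sd / math.sqrt(n))
    std_normal = NormalDist()
    if tail == "two":
        reject = abs(z_cal) > std_normal.inv_cdf(1 - alpha / 2)
    elif tail == "right":
        reject = z_cal > std_normal.inv_cdf(1 - alpha)
    elif tail == "left":
        reject = z_cal < std_normal.inv_cdf(alpha)
    else:
        raise ValueError("tail must be 'two', 'right' or 'left'")
    return z_cal, reject


# Example inputs: n = 60, xbar = 33.8, s = 6.1, H0: mu = 32.6, H1: mu > 32.6
z_cal, reject = z_test_single_mean(33.8, 32.6, 6.1, 60, alpha=0.05, tail="right")
print(round(z_cal, 4), "reject H0" if reject else "accept H0")
```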
2. Test of Significance for Difference of Means of Two Large Samples
Let $\bar{x}_1$ and $\bar{x}_2$ be the means of two random samples of sizes $n_1$ and $n_2$ drawn from two
populations having means $\mu_1$ and $\mu_2$ and S.D.s $\sigma_1$ and $\sigma_2$.
i) Null hypothesis: $H_0 : \mu_1 = \mu_2$
ii) Alternative hypothesis: a) $H_1 : \mu_1 \neq \mu_2$ (Two-tailed)
b) $H_1 : \mu_1 < \mu_2$ (Left one-tailed)
c) $H_1 : \mu_1 > \mu_2$ (Right one-tailed)
iii) Level of significance: Set the LOS $\alpha$
iv) Test statistic: $Z_{cal} = \frac{(\bar{x}_1 - \bar{x}_2) - \delta}{\text{S.E. of } (\bar{x}_1 - \bar{x}_2)} = \frac{(\bar{x}_1 - \bar{x}_2) - \delta}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$
where $\delta = \mu_1 - \mu_2$ (when it is a given constant);
otherwise $\delta = \mu_1 - \mu_2 = 0$.
If $\sigma_1^2 = \sigma_2^2 = \sigma^2$, then $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$
The critical value $Z_{tab}$ is read from the normal table at the LOS $\alpha$.
v) Decision: If $|Z_{cal}| < Z_{tab}$, accept $H_0$; otherwise reject $H_0$
LOS α         1%             5%             10%
μ1 ≠ μ2       |Z| > 2.58     |Z| > 1.96     |Z| > 1.645
μ1 > μ2       Z > 2.33       Z > 1.645      Z > 1.28
μ1 < μ2       Z < -2.33      Z < -1.645     Z < -1.28
NOTE: Confidence limits for the difference of means
$\mu_1 - \mu_2 = (\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2}$ [S.E. of $(\bar{X}_1 - \bar{X}_2)$]
$= (\bar{X}_1 - \bar{X}_2) \pm z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
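A minimal Python sketch of the two-sample statistic and the matching confidence limits is given below. Function and parameter names are illustrative; `delta` defaults to 0, which corresponds to H0: μ1 = μ2.

```python
import math
from statistics import NormalDist


def z_two_means(x1, x2, sd1, sd2, n1, n2, delta=0.0):
    """Z statistic for the difference of two large-sample means."""
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    return (x1 - x2 - delta) / se


def diff_means_confidence_interval(x1, x2, sd1, sd2, n1, n2, confidence=0.95):
    """Confidence limits for mu1 - mu2."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    se = math.sqrt(sd1 ** 2 / n1 + sd2 ** 2 / n2)
    diff = x1 - x2
    return diff - z * se, diff + z * se


# Example inputs: means 55 and 57, S.D.s 10 and 15, sample sizes 400 and 100
print(round(z_two_means(55, 57, 10, 15, 400, 100), 3))
```

Note that the sign of the interval follows the order of subtraction: if the first sample mean is the smaller one, both limits for μ1 − μ2 can be negative.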
3. Test of Significance for Single Proportions
Suppose a random sample of size n has a sample proportion p of members possessing
a certain attribute (proportion of successes). We wish to test the hypothesis that the
proportion P in the population has a specified value $P_0$.
i) Null hypothesis: $H_0 : P = P_0$
ii) Alternative hypothesis: a) $H_1 : P \neq P_0$ (Two-tailed test)
b) $H_1 : P < P_0$ (Left one-tailed)
c) $H_1 : P > P_0$ (Right one-tailed)
iii) Test statistic: $Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}}$ where P is the population proportion and $Q = 1 - P$
iv) At the specified LOS $\alpha$, find the critical value of Z
v) Decision: If $|Z_{cal}| < Z_{tab}$, accept $H_0$; otherwise reject $H_0$
LOS α         1%             5%             10%
P ≠ P0        |Z| > 2.58     |Z| > 1.96     |Z| > 1.645
P > P0        Z > 2.33       Z > 1.645      Z > 1.28
P < P0        Z < -2.33      Z < -1.645     Z < -1.28
NOTE: Confidence limits for the population proportion
$P = p \pm Z_{\alpha/2}$ (S.E. of p)
$= p \pm Z_{\alpha/2} \sqrt{\frac{pq}{n}}$
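The single-proportion test and its confidence limits can be sketched as below. The function names are illustrative. The test uses the hypothesised P0 in the standard error, while the confidence interval uses the sample proportion p, as in the formulas above.

```python
import math
from statistics import NormalDist


def z_single_proportion(x, n, p0):
    """Z statistic for H0: P = p0, given x successes in n trials."""
    p_hat = x / n
    se = math.sqrt(p0 * (1 - p0) / n)   # S.E. under the null hypothesis
    return (p_hat - p0) / se


def proportion_confidence_interval(x, n, confidence=0.95):
    """Large-sample confidence limits p +/- z * sqrt(p q / n)."""
    p_hat = x / n
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Example inputs: 68 successes out of 125, H0: P = 0.5
print(round(z_single_proportion(68, 125, 0.5), 4))
```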
4. Test for Equality of Two Proportions (Populations)
Let $p_1$ and $p_2$ be the sample proportions in two large random samples of sizes $n_1$ and $n_2$
drawn from two populations having proportions $P_1$ and $P_2$.
i) Null hypothesis: $H_0 : P_1 = P_2$
ii) Alternative hypothesis: a) $H_1 : P_1 \neq P_2$ (Two-tailed)
b) $H_1 : P_1 < P_2$ (Left one-tailed)
c) $H_1 : P_1 > P_2$ (Right one-tailed)
iii) Test statistic: $Z_{cal} = \frac{(p_1 - p_2) - (P_1 - P_2)}{\sqrt{\frac{P_1 Q_1}{n_1} + \frac{P_2 Q_2}{n_2}}}$ if $(P_1 - P_2)$ is given.
If only the sample proportions are given, then
$Z_{cal} = \frac{p_1 - p_2}{\sqrt{\frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}}}$ where $p_1 = \frac{x_1}{n_1}$ and $p_2 = \frac{x_2}{n_2}$
OR
$Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$ where $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2}$ and $q = 1 - p$
iv) At the specified LOS $\alpha$, find the critical value of Z
v) Decision: If $|Z_{cal}| < Z_{tab}$, accept $H_0$; otherwise reject $H_0$
LOS α         1%             5%             10%
P1 ≠ P2       |Z| > 2.58     |Z| > 1.96     |Z| > 1.645
P1 > P2       Z > 2.33       Z > 1.645      Z > 1.28
P1 < P2       Z < -2.33      Z < -1.645     Z < -1.28
NOTE: Confidence limits for the difference of population proportions
$P_1 - P_2 = (p_1 - p_2) \pm Z_{\alpha/2}$ (S.E. of $p_1 - p_2$)
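The pooled form of the two-proportion statistic can be sketched as follows. The function name is illustrative, and the inputs are the success counts x1, x2 and sample sizes n1, n2.

```python
import math


def z_two_proportions_pooled(x1, n1, x2, n2):
    """Z statistic for H0: P1 = P2 using the pooled proportion p."""
    p1 = x1 / n1
    p2 = x2 / n2
    p = (x1 + x2) / (n1 + n2)      # pooled estimate under H0
    q = 1 - p
    se = math.sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2) / se


# Example inputs: 200 of 400 in the first sample, 40 of 200 in the second
print(round(z_two_proportions_pooled(200, 400, 40, 200), 2))
```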
Problems:
1. A sample of 64 students has a mean weight of 70 kgs. Can this be regarded as
a sample from a population with mean weight 56 kgs and standard
deviation 25 kgs?
Sol: Given $\bar{x}$ = mean of the sample = 70 kgs
$\mu$ = mean of the population = 56 kgs
$\sigma$ = S.D. of the population = 25 kgs
and n = sample size = 64
i) Null Hypothesis $H_0$: The sample of 64 students with mean weight 70 kgs can be
regarded as a sample from a population with mean weight 56 kgs and standard
deviation 25 kgs, i.e., $H_0 : \mu = 56$ kgs
ii) Alternative Hypothesis $H_1$: The sample cannot be regarded as one coming from the
population, i.e., $H_1 : \mu \neq 56$ kgs (Two-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{70 - 56}{25/\sqrt{64}} = 4.48$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$
∴ The sample cannot be regarded as one coming from the population.
2. In a random sample of 60 workers, the average time taken by them to get to
work is 33.8 minutes with a standard deviation of 6.1 minutes. Can we reject the
null hypothesis $\mu$ = 32.6 in favor of the alternative hypothesis $\mu$ > 32.6 at the $\alpha$ =
0.05 LOS?
Sol: Given n = 60, $\bar{x}$ = 33.8, $\mu$ = 32.6 and $\sigma$ = 6.1
i) Null Hypothesis $H_0 : \mu = 32.6$
ii) Alternative Hypothesis $H_1 : \mu > 32.6$ (Right one-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{33.8 - 32.6}{6.1/\sqrt{60}} = \frac{1.2}{0.7875} = 1.5238$
v) Conclusion: Since $Z_{cal}$ value < $Z_{\alpha}$ value, we accept $H_0$
3. A sample of 400 items is taken from a population whose standard deviation is 10.
The mean of the sample is 40. Test whether the sample has come from a
population with mean 38. Also calculate 95% confidence limits for the
population mean.
Sol: Given n = 400, $\bar{x}$ = 40, $\mu$ = 38 and $\sigma$ = 10
i) Null Hypothesis $H_0 : \mu = 38$
ii) Alternative Hypothesis $H_1 : \mu \neq 38$ (Two-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{40 - 38}{10/\sqrt{400}} = \frac{2}{0.5} = 4$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$,
i.e., the sample is not from the population whose mean is 38.
∴ The 95% confidence interval is $\left( \bar{x} - 1.96 \frac{\sigma}{\sqrt{n}} ,\; \bar{x} + 1.96 \frac{\sigma}{\sqrt{n}} \right)$
i.e., $\left( 40 - \frac{1.96(10)}{\sqrt{400}} ,\; 40 + \frac{1.96(10)}{\sqrt{400}} \right)$
$= \left( 40 - \frac{1.96(10)}{20} ,\; 40 + \frac{1.96(10)}{20} \right)$
= (40 – 0.98, 40 + 0.98)
= (39.02, 40.98)

4. An insurance agent has claimed that the average age of policy holders who insure
through him is less than the average for all agents, which is 30.5. A random
sample of 100 policy holders who had insured through him gave the following age
distribution.
Age              16-20   21-25   26-30   31-35   36-40
No. of persons    12      22      20      30      16
Calculate the arithmetic mean and standard deviation of this distribution and
use these values to test his claim at 5% LOS.
Sol: Take the class mid-values $x_i$ = 18, 23, 28, 33, 38, the assumed mean A = 28 and the
class width h = 5.
$d_i = \frac{x_i - A}{h}$ = -2, -1, 0, 1, 2, so $\sum f_i d_i = 16$ and $\sum f_i d_i^2 = 164$
$\bar{x} = A + \frac{h \sum f_i d_i}{N} = 28 + \frac{5 \times 16}{100} = 28.8$
S.D.: $S = h \sqrt{\frac{\sum f d^2}{N} - \left( \frac{\sum f d}{N} \right)^2} = 5 \sqrt{\frac{164}{100} - \left( \frac{16}{100} \right)^2} = 6.35$
i) Null Hypothesis $H_0$: The sample is drawn from a population with mean $\mu$ = 30.5 years,
i.e., $H_0 : \mu = 30.5$ years
ii) Alternative Hypothesis $H_1 : \mu < 30.5$ (Left one-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{28.8 - 30.5}{6.35/\sqrt{100}} = -2.677$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$,
i.e., the sample is not drawn from the population with $\mu$ = 30.5 years.
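The grouped-data calculation in this problem can be cross-checked directly from the class mid-values, without the assumed-mean shortcut. The sketch below uses illustrative variable names and takes the mid-values 18, 23, 28, 33, 38 of the classes 16-20, ..., 36-40 as an assumption about how the classes are represented.

```python
import math

midpoints = [18, 23, 28, 33, 38]      # mid-values of the age classes (assumed)
frequencies = [12, 22, 20, 30, 16]

total = sum(frequencies)
grouped_mean = sum(f * x for f, x in zip(frequencies, midpoints)) / total
grouped_var = sum(f * (x - grouped_mean) ** 2
                  for f, x in zip(frequencies, midpoints)) / total
grouped_sd = math.sqrt(grouped_var)

# Z statistic for H0: mu = 30.5 against H1: mu < 30.5
z_cal = (grouped_mean - 30.5) / (grouped_sd / math.sqrt(total))

print(round(grouped_mean, 2), round(grouped_sd, 2), round(z_cal, 3))
```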
5. An ambulance service claims that it takes on average less than 10 minutes to
reach its destination in emergency calls. A sample of 36 calls has a mean of 11
minutes and a variance of 16 minutes. Test the claim at 0.05 LOS.
Sol: Given n = 36, $\bar{x}$ = 11, $\mu$ = 10 and $\sigma = \sqrt{16} = 4$
i) Null Hypothesis $H_0 : \mu = 10$
ii) Alternative Hypothesis $H_1 : \mu < 10$ (Left one-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{11 - 10}{4/\sqrt{36}} = \frac{6}{4} = 1.5$
v) Conclusion: Since $|Z_{cal}|$ value < $Z_{\alpha}$ value, we accept $H_0$
6. The means of two large samples of sizes 1000 and 2000 members are 67.5 inches
and 68 inches respectively. Can the samples be regarded as drawn from the
same population of S.D. 2.5 inches?
Sol: Let $\mu_1$ and $\mu_2$ be the means of the two populations.
Given $n_1$ = 1000, $n_2$ = 2000 and $\bar{x}_1$ = 67.5 inches, $\bar{x}_2$ = 68 inches
Population S.D., $\sigma$ = 2.5 inches
i) Null Hypothesis $H_0$: The samples have been drawn from the same population of
S.D. 2.5 inches,
i.e., $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 \neq \mu_2$ (Two-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sigma \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{67.5 - 68}{2.5 \sqrt{\frac{1}{1000} + \frac{1}{2000}}} = \frac{-0.5}{0.0968} = -5.16$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$
Hence, we conclude that the samples are not drawn from the same population of
S.D. 2.5 inches.
7. Samples of students were drawn from two universities and from their weights in
kilograms, means and standard deviations were calculated as shown below.
Make a large sample test of the significance of the difference between the
means.
               Mean   S.D.   Size of the sample
University A    55     10          400
University B    57     15          100

Sol: Given $n_1$ = 400, $n_2$ = 100 and $\bar{x}_1$ = 55 kgs, $\bar{x}_2$ = 57 kgs
$\sigma_1$ = 10 and $\sigma_2$ = 15
i) Null Hypothesis $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 \neq \mu_2$ (Two-tailed test)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{55 - 57}{\sqrt{\frac{10^2}{400} + \frac{15^2}{100}}} = \frac{-2}{\sqrt{\frac{1}{4} + \frac{9}{4}}} = -1.26$
v) Conclusion: Taking the 5% LOS, $|Z_{cal}|$ = 1.26 < 1.96, so we accept $H_0$
Hence, we conclude that there is no significant difference between the means.
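8. The average mark scored by 32 boys is 72 with a S.D. of 8, while that for 36
girls is 70 with a S.D. of 6. Does this data indicate that the boys perform better
than the girls at 5% LOS?
Sol: Given $n_1$ = 32, $n_2$ = 36 and $\bar{x}_1$ = 72, $\bar{x}_2$ = 70
$\sigma_1$ = 8 and $\sigma_2$ = 6
i) Null Hypothesis $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 > \mu_2$ (Right one-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{72 - 70}{\sqrt{\frac{8^2}{32} + \frac{6^2}{36}}} = \frac{2}{\sqrt{2 + 1}} = 1.1547$
v) Conclusion: Since $Z_{cal}$ value < $Z_{\alpha}$ value, we accept $H_0$
Hence, we conclude that the performance of boys and girls is the same.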

9. A sample of the heights of 6400 Englishmen has a mean of 67.85 inches and a S.D.
of 2.56 inches, while another sample of the heights of 1600 Austrians has a mean of
68.55 inches and a S.D. of 2.52 inches. Do the data indicate that Austrians are on
average taller than Englishmen? (Use $\alpha$ = 0.01)
Sol: Let $\mu_1$ and $\mu_2$ be the means of the two populations.
Given $n_1$ = 6400, $n_2$ = 1600 and $\bar{x}_1$ = 67.85, $\bar{x}_2$ = 68.55
$\sigma_1$ = 2.56 and $\sigma_2$ = 2.52
i) Null Hypothesis $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 < \mu_2$ (Left one-tailed test)
iii) Level of significance: $\alpha$ = 0.01 ($Z_{\alpha}$ = -2.33)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{67.85 - 68.55}{\sqrt{\frac{2.56^2}{6400} + \frac{2.52^2}{1600}}}$
$= \frac{67.85 - 68.55}{\sqrt{\frac{6.5536}{6400} + \frac{6.3504}{1600}}}$
$= \frac{-0.7}{\sqrt{0.001 + 0.004}} = \frac{-0.7}{0.0707} = -9.9$
v) Conclusion: Since $|Z_{cal}|$ value > $|Z_{\alpha}|$ value, we reject $H_0$
Hence, we conclude that Austrians are on average taller than Englishmen.
10. At a certain large university a sociologist speculates that male students spend
considerably more money on junk food than female students. To test her
hypothesis the sociologist randomly selects from records the names of 200
students. Of these, 125 are men and 75 are women. The mean of the average
amount spent on junk food per week by the men is Rs. 400 and the S.D. is 100. For
the women the sample mean is Rs. 450 and the S.D. is 150. Test the hypothesis at 5%
LOS.
Sol: Given $n_1$ = 125, $n_2$ = 75 and $\bar{x}_1$ = mean of men = 400, $\bar{x}_2$ = mean of women = 450
$\sigma_1$ = 100 and $\sigma_2$ = 150
i) Null Hypothesis $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 > \mu_2$ (Right one-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{400 - 450}{\sqrt{\frac{100^2}{125} + \frac{150^2}{75}}}$
$= \frac{-50}{\sqrt{80 + 300}} = \frac{-50}{\sqrt{380}} = \frac{-50}{19.49} = -2.5654$
v) Conclusion: Since $Z_{cal}$ value < $Z_{\alpha}$ value, we accept $H_0$
Hence, we conclude that the data do not support the claim that male students spend more
on junk food than female students.
11. A research investigator is interested in studying whether there is a significant
difference in the salaries of MBA graduates in two cities. A random sample of size
100 from city A yields an average income of Rs. 20,150. Another random sample
of size 60 from city B yields an average income of Rs. 20,250. The variances are
given as $\sigma_1^2$ = 40,000 and $\sigma_2^2$ = 32,400 respectively. Test the equality of
means and also construct 95% confidence limits.
Sol: Given $n_1$ = 100, $n_2$ = 60 and $\bar{x}_1$ = mean of city A = 20,150, $\bar{x}_2$ = mean of city B =
20,250
$\sigma_1^2$ = 40,000 and $\sigma_2^2$ = 32,400
i) Null Hypothesis $H_0 : \mu_1 = \mu_2$
ii) Alternative Hypothesis $H_1 : \mu_1 \neq \mu_2$ (Two-tailed test)
iii) Level of significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{\bar{x}_1 - \bar{x}_2}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}} = \frac{20,150 - 20,250}{\sqrt{\frac{40000}{100} + \frac{32400}{60}}}$
$= \frac{-100}{\sqrt{400 + 540}} = \frac{-100}{30.66} = -3.26$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$
Hence, we conclude that there is a significant difference in the salaries of MBA
graduates in the two cities.
∴ The 95% confidence interval is $\mu_1 - \mu_2 = (\bar{x}_1 - \bar{x}_2) \pm 1.96 \sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}$
$= (20,150 - 20,250) \pm 1.96 \sqrt{\frac{40000}{100} + \frac{32400}{60}}$
$= -100 \pm 60.09$
= (-160.09, -39.91)
12. A die was thrown 9000 times and of these 3220 yielded a 3 or 4. Is this consistent
with the hypothesis that the die was unbiased?
Sol: Given n = 9000
P = population proportion of successes
= P(getting a 3 or 4) $= \frac{1}{6} + \frac{1}{6} = \frac{2}{6} = \frac{1}{3} = 0.3333$
Q = 1 - P = 0.6667
p = sample proportion of successes (getting a 3 or 4 in 9000 throws) $= \frac{3220}{9000} = 0.3578$
i) Null Hypothesis $H_0$: The die is unbiased,
i.e., $H_0 : P = 0.3333$
ii) Alternative Hypothesis $H_1$: The die is biased,
i.e., $H_1 : P \neq 0.3333$ (Two-tailed test)
iii) Level of Significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.35778 - 0.33333}{\sqrt{\frac{(0.33333)(0.66667)}{9000}}} = \frac{0.02444}{0.004969} = 4.92$
v) Conclusion: Since $Z_{cal}$ value > $Z_{\alpha}$ value, we reject $H_0$
Hence, we conclude that the die is biased.

13. In a random sample of 125 cool drinkers, 68 said they prefer Thums Up to Pepsi.
Test the null hypothesis P = 0.5 against the alternative hypothesis P > 0.5.
Sol: Given n = 125, x = 68 and $p = \frac{x}{n} = \frac{68}{125} = 0.544$
i) Null Hypothesis $H_0 : P = 0.5$
ii) Alternative Hypothesis $H_1 : P > 0.5$ (Right one-tailed test)
iii) Level of Significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.645)
iv) Test Statistic: $Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.544 - 0.5}{\sqrt{\frac{(0.5)(0.5)}{125}}} = 0.9839$
v) Conclusion: Since $Z_{cal}$ value < $Z_{\alpha}$ value, we accept $H_0$
14. A manufacturer claimed that at least 95% of the equipment which he supplied to
a factory conformed to specifications. An examination of a sample of 200 pieces of
equipment revealed that 18 were faulty. Test the claim at 5% LOS.
Sol: Given n = 200
Number of pieces conforming to specifications = 200 - 18 = 182
∴ p = proportion of pieces conforming to specifications $= \frac{182}{200} = 0.91$
P = population proportion $= \frac{95}{100} = 0.95$
i) Null Hypothesis $H_0 : P = 0.95$
ii) Alternative Hypothesis $H_1 : P < 0.95$ (Left one-tailed test)
iii) Level of Significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = -1.645)
iv) Test Statistic: $Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.91 - 0.95}{\sqrt{\frac{0.95 \times 0.05}{200}}} = -2.60$
v) Conclusion: Since $Z_{cal}$ value < $Z_{\alpha}$ value, we reject $H_0$
Hence, we conclude that the manufacturer's claim is rejected.
15. Among 900 people in a state, 90 are found to be chapatti eaters. Construct a 99%
confidence interval for the true proportion and also test the hypothesis for a single
proportion.
Sol: Given x = 90, n = 900
∴ $p = \frac{x}{n} = \frac{90}{900} = \frac{1}{10} = 0.1$
and q = 1 - p = 0.9
Now $\sqrt{\frac{pq}{n}} = \sqrt{\frac{(0.1)(0.9)}{900}} = 0.01$
The confidence interval is $P = p \pm Z_{\alpha/2} \sqrt{\frac{pq}{n}}$ with $Z_{\alpha/2}$ = 2.58 for 99% confidence,
i.e., (0.1 - 0.0258, 0.1 + 0.0258)
= (0.0742, 0.1258)
i) Null Hypothesis $H_0 : P = 0.5$
ii) Alternative Hypothesis $H_1 : P \neq 0.5$ (Two-tailed test)
iv) Test Statistic: $Z_{cal} = \frac{p - P}{\sqrt{\frac{PQ}{n}}} = \frac{0.1 - 0.5}{\sqrt{\frac{0.5 \times 0.5}{900}}} = \frac{-0.4}{0.01667} = -24$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$

16. Random samples of 400 men and 200 women in a locality were asked whether
they would like to have a bus stop near their residence. 200 men and
40 women were in favor of the proposal. Test the significance of the difference
between the two proportions at 5% LOS.
Sol: Let $P_1$ and $P_2$ be the population proportions of men and women in the locality who
favor the bus stop.
Given $n_1$ = number of men = 400
$n_2$ = number of women = 200
$x_1$ = number of men in favor of the bus stop = 200
$x_2$ = number of women in favor of the bus stop = 40
∴ $p_1 = \frac{x_1}{n_1} = \frac{200}{400} = \frac{1}{2}$ and $p_2 = \frac{x_2}{n_2} = \frac{40}{200} = \frac{1}{5}$
i) Null Hypothesis $H_0 : P_1 = P_2$
ii) Alternative Hypothesis $H_1 : P_1 \neq P_2$ (Two-tailed test)
iii) Level of Significance: $\alpha$ = 0.05 ($Z_{\alpha}$ = 1.96)
iv) Test Statistic: $Z_{cal} = \frac{p_1 - p_2}{\sqrt{pq \left( \frac{1}{n_1} + \frac{1}{n_2} \right)}}$
We have $p = \frac{n_1 p_1 + n_2 p_2}{n_1 + n_2} = \frac{x_1 + x_2}{n_1 + n_2} = \frac{200 + 40}{400 + 200} = \frac{240}{600} = \frac{2}{5}$
$q = 1 - p = \frac{3}{5}$
$Z_{cal} = \frac{0.5 - 0.2}{\sqrt{(0.4)(0.6) \left( \frac{1}{400} + \frac{1}{200} \right)}} = 7.07$
v) Conclusion: Since $|Z_{cal}|$ value > $Z_{\alpha}$ value, we reject $H_0$
Hence we conclude that there is a difference between the men and women in their
attitude towards the bus stop near their residence.
17. A machine puts out 16 imperfect articles in a sample of 500 articles. After the
machine is overhauled it puts out 3 imperfect articles in a sample of 100 articles.
Has the machine improved?
Sol : Let P1 and P2 be the proportions of imperfect articles in the population of
articles manufactured by the machine before and after overhauling, respectively.
Given n1 = Sample size before the machine overhauling = 500
n2 = Sample size after the machine overhauling = 100
x1 = Number of imperfect articles before overhauling = 16
x2 = Number of imperfect articles after overhauling = 3
∴ p1 = x1/n1 = 16/500 = 0.032 and p2 = x2/n2 = 3/100 = 0.03
i) Null Hypothesis H0 : P1 = P2
ii) Alternative Hypothesis H1 : P1 > P2 (Right one-tailed test)
iii) Level of Significance : taking α = 0.05, Zα = 1.645
iv) Test Statistic : Zcal = (p1 − p2)/√(pq(1/n1 + 1/n2))
We have p = (n1p1 + n2p2)/(n1 + n2) = (x1 + x2)/(n1 + n2) = (16 + 3)/(500 + 100) = 19/600 = 0.032
q = 1 − p = 0.968
∴ Zcal = (0.032 − 0.03)/√((0.032)(0.968)(1/500 + 1/100)) = 0.002/0.019 = 0.104
v) Conclusion: Since |Zcal| value < Zα value, we accept H0
Hence we conclude that the machine has not improved significantly.
18. In an investigation on the machine performance the following results are
obtained.
             No. of units inspected   No. of defectives
Machine 1            375                     17
Machine 2            450                     22
Test whether there is any significant difference in the performance of the two machines at α = 0.05
Sol: Let P1 and P2 be the proportions of defective units in the population of units inspected
in Machine 1 and Machine 2 respectively.
Given n1 = Sample size of Machine 1 = 375
n2 = Sample size of Machine 2 = 450
x1 = Number of defectives of Machine 1 = 17
x2 = Number of defectives of Machine 2 = 22
∴ p1 = x1/n1 = 17/375 = 0.045 and p2 = x2/n2 = 22/450 = 0.049
i) Null Hypothesis H0 : P1 = P2
ii) Alternative Hypothesis H1 : P1 ≠ P2 (Two-tailed test)
iii) Level of Significance : α = 0.05 (Zα = 1.96)
iv) Test Statistic : Zcal = (p1 − p2)/√(pq(1/n1 + 1/n2))
We have p = (n1p1 + n2p2)/(n1 + n2) = (x1 + x2)/(n1 + n2) = (17 + 22)/(375 + 450) = 39/825 = 0.047
q = 1 − p = 1 − 0.047 = 0.953
∴ Zcal = (0.045 − 0.049)/√((0.047)(0.953)(1/375 + 1/450)) = −0.27
v) Conclusion: Since |Zcal| value < Zα value, we accept H0
Hence we conclude that there is no significant difference in the performance of the
machines.
19. A cigarette manufacturing firm claims that its brand A line of cigarettes outsells
its brand B by 8%. It is found that 42 out of 200 smokers prefer brand A and 18
out of another sample of 100 smokers prefer brand B. Test whether the 8%
difference is a valid claim.
Sol: Given n1 = 200
n2 = 100
x1 = Number of smokers preferring brand A = 42
x2 = Number of smokers preferring brand B = 18
∴ p1 = x1/n1 = 42/200 = 0.21 and p2 = x2/n2 = 18/100 = 0.18
and P1 − P2 = 8% = 0.08
i) Null Hypothesis H0 : P1 − P2 = 0.08
ii) Alternative Hypothesis H1 : P1 − P2 ≠ 0.08 (Two-tailed test)
iii) Level of Significance : taking α = 0.05, Zα = 1.96
iv) Test Statistic : Zcal = ((p1 − p2) − (P1 − P2))/√(pq(1/n1 + 1/n2))
We have p = (n1p1 + n2p2)/(n1 + n2) = (x1 + x2)/(n1 + n2) = (42 + 18)/(200 + 100) = 60/300 = 0.2
q = 1 − p = 1 − 0.2 = 0.8
∴ Zcal = ((0.21 − 0.18) − 0.08)/√((0.2)(0.8)(1/200 + 1/100)) = −0.05/0.0489 = −1.02
v) Conclusion: Since |Zcal| value < Zα value, we accept H0
Hence we conclude that the 8% difference in the sale of the two brands of cigarettes is a
valid claim.
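
Problems 16 and 19 use the same pooled two-proportion statistic, differing only in the hypothesised difference P1 − P2. A minimal Python sketch follows; the function name `two_proportion_z` and its `delta0` parameter are illustrative names introduced here, not notation from the notes.

```python
import math

def two_proportion_z(x1, n1, x2, n2, delta0=0.0):
    """Pooled Z statistic for H0: P1 - P2 = delta0."""
    p1, p2 = x1 / n1, x2 / n2
    p = (x1 + x2) / (n1 + n2)      # pooled proportion
    q = 1 - p
    se = math.sqrt(p * q * (1 / n1 + 1 / n2))
    return (p1 - p2 - delta0) / se

# Problem 16: bus stop survey, H0: P1 = P2
z_bus = two_proportion_z(200, 400, 40, 200)

# Problem 19: cigarette brands, H0: P1 - P2 = 0.08
z_cig = two_proportion_z(42, 200, 18, 100, delta0=0.08)

for name, z in (("bus stop", z_bus), ("cigarettes", z_cig)):
    verdict = "reject H0" if abs(z) > 1.96 else "accept H0"
    print(f"{name}: Z = {z:.3f} -> {verdict} (two-tailed, 5% los)")
```

Working from the counts x1 and x2 rather than rounded proportions avoids the small rounding differences that appear when p1 and p2 are rounded before substitution.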
20. In a city A, 20% of a random sample of 900 schoolboys has a certain slight
physical defect. In another city B, 18.5% of a random sample of 1600 schoolboys
has the same defect. Is the difference between the proportions significant at
5% los?
Sol: Given n1 = 900
n2 = 1600
x1 = 20% of 900 = 180
x2 = 18.5% of 1600 = 296
∴ p1 = x1/n1 = 180/900 = 0.2 and p2 = x2/n2 = 296/1600 = 0.185
i) Null Hypothesis H0 : P1 = P2
ii) Alternative Hypothesis H1 : P1 ≠ P2 (Two-tailed test)
iii) Level of Significance : α = 0.05 (Zα = 1.96)
iv) Test Statistic : Zcal = (p1 − p2)/√(pq(1/n1 + 1/n2))
We have p = (n1p1 + n2p2)/(n1 + n2) = (x1 + x2)/(n1 + n2) = (180 + 296)/(900 + 1600) = 476/2500 = 0.19
q = 1 − p = 1 − 0.19 = 0.81
∴ Zcal = (0.2 − 0.185)/√((0.19)(0.81)(1/900 + 1/1600)) = 0.015/0.01634 = 0.918
v) Conclusion: Since |Zcal| value < Zα value, we accept H0
Hence we conclude that there is no significant difference between the proportions.

SMALL SAMPLES
Introduction When the sample size n < 30, the sample is referred to as a small sample. In this
case the sampling distribution may not be normal in many cases, i.e., we will not be justified in
estimating the population parameters as equal to the corresponding sample values.
Degree Of Freedom The number of independent variates which make up the statistic is
known as the degrees of freedom (d.f) and it is denoted by ϑ.
For Example: If x1 + x2 + x3 = 50 and we assign any values to two of the variables (say
x1, x2), then the value of x3 will be known. Thus, the two variables are free and independent
choices for finding the third.
In general, the number of degrees of freedom is equal to the total number of
observations less the number of independent constraints imposed on the observations.
For example: in a set of data of n observations, if k is the number of independent constraints
then ϑ = n − k
Student's t-Distribution Or t-Distribution
Let x̄ be the mean of a random sample of size n, taken from a normal population having the
mean μ and the variance σ², and let the sample variance be S² = ∑(xi − x̄)²/(n − 1). Then
t = (x̄ − μ)/(S/√n) is a random variable having the t-distribution with ϑ = n − 1 degrees of freedom.
Properties of 𝒕 − Distribution
1. The t-distribution is bell shaped, similar to the normal
distribution, and is symmetrical about the mean.
2. The mean of the standard normal distribution as well as the t-distribution is zero, but
the variance of the t-distribution depends upon the parameter ϑ, which is called the
degrees of freedom.
3. The variance of the t-distribution exceeds 1, but approaches 1 as n → ∞.
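
Property 3 can be made concrete with the standard result Var(t) = ϑ/(ϑ − 2) for ϑ > 2. This formula is not stated in these notes and is brought in here only as an illustration; the function name `t_variance` is likewise hypothetical.

```python
def t_variance(nu: int) -> float:
    """Variance of Student's t with nu degrees of freedom (finite only for nu > 2)."""
    if nu <= 2:
        raise ValueError("variance is finite only for nu > 2")
    return nu / (nu - 2)

# The variance is always above 1 and shrinks towards 1 as nu grows
for nu in (3, 5, 10, 30, 100):
    print(nu, round(t_variance(nu), 4))
```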

Applications of 𝒕 – Distributions
1. To test the significance of the sample mean, when the population variance
is not given:
Let x̄ be the mean of the sample, n be the size of the sample, σ be the standard
deviation of the population and μ be the mean of the population.
Then the Student t-distribution is defined by the statistic
t = (x̄ − μ)/(s/√(n − 1)) if the sample standard deviation s (computed with divisor n) is given directly.
If σ is unknown, then t = (x̄ − μ)/(S/√n) where
S² = ∑(xi − x̄)²/(n − 1)
Note : Confidence limits for the mean: μ = x̄ ± tα (S/√n) or μ = x̄ ± tα (s/√(n − 1))
2. To test the significance of the difference between means of the two
independent samples :
To test the significant difference between the sample means x̄1 and x̄2 of two independent
samples of sizes n1 and n2, with the same variance,
we use the statistic
t = (x̄1 − x̄2)/√(S²(1/n1 + 1/n2)) -------(1) where
x̄1 = ∑x1/n1 , x̄2 = ∑x2/n2 and
S² = [∑(x1 − x̄1)² + ∑(x2 − x̄2)²]/(n1 + n2 − 2)
OR S² = [(n1 s1²) + (n2 s2²)]/(n1 + n2 − 2)
where s1 and s2 are the sample standard deviations. The statistic has n1 + n2 − 2 degrees of freedom.
Note: Confidence limits for the difference of means : μ1 − μ2 = (x̄1 − x̄2) ± tα √(S²(1/n1 + 1/n2))
3. Paired t-test (Test the significance of the difference between means of
two dependent samples) :
Paired observations arise in many practical situations where each homogeneous experimental
unit receives both population conditions.
For Example: To test the effectiveness of a drug, the blood pressure of some persons is measured
before and after the intake of the drug. Here the individual person is the experimental unit
and the two populations are blood pressure "before" and "after" the drug is given.
The paired t-test is applied to n paired observations by taking the differences d1, d2, ..., dn of the
paired data. To test whether the differences di form a random sample from a population with
mean μ = 0, we use the statistic
t = d̄/(S/√n) where d̄ = (1/n) ∑di and S² = (1/(n − 1)) ∑(di − d̄)²
with n − 1 degrees of freedom.
Problems:
1. A sample of 26 bulbs gives a mean life of 990 hours with a S.D of 20 hours. The
manufacturer claims that the mean life of bulbs is 1000 hours. Is the sample not
up to the standard?
Sol: Given n = 26
x̄ = 990
μ = 1000 and S.D i.e., s = 20
i) Null Hypothesis : H0 : μ = 1000
ii) Alternative Hypothesis: H1 : μ < 1000 (Left one-tailed test)
(since the question asks whether the sample is below standard)
iii) Level of significance : α = 0.05
t tabulated value with 25 degrees of freedom for left tailed test is 1.708
iv) Test Statistic : tcal = (x̄ − μ)/(s/√(n − 1)) = (990 − 1000)/(20/√25) = −2.5
v) Conclusion: Since |tcal| value > tα value, we reject H0

Hence we conclude that the sample is not up to the standard.
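
The two forms of the one-sample statistic, s/√(n − 1) and S/√n, give the same value once the divisor used for the standard deviation is accounted for. The sketch below checks this on the bulb data, assuming the quoted S.D. of 20 was computed with divisor n (as the solution's use of √(n − 1) implies). Both function names are hypothetical.

```python
import math

def t_from_divisor_n_sd(xbar, mu0, s, n):
    """t statistic when s is the sample S.D. computed with divisor n."""
    return (xbar - mu0) / (s / math.sqrt(n - 1))

def t_from_unbiased_sd(xbar, mu0, S, n):
    """t statistic when S is computed with divisor n - 1."""
    return (xbar - mu0) / (S / math.sqrt(n))

n, s = 26, 20
S = s * math.sqrt(n / (n - 1))     # convert the divisor-n S.D. to the divisor-(n-1) S.D.

t1 = t_from_divisor_n_sd(990, 1000, s, n)
t2 = t_from_unbiased_sd(990, 1000, S, n)
print(t1, t2)                      # both forms agree up to rounding
```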

2. A random sample of size 16 from a normal population showed a mean of 53
and a sum of squares of deviations from the mean equal to 150. Can this sample be
regarded as taken from the population having 56 as mean? Obtain 95% confidence
limits of the mean of the population.
Sol: a) Given n = 16
x̄ = 53
μ = 56 and ∑(xi − x̄)² = 150
∴ S² = ∑(xi − x̄)²/(n − 1) = 150/15 = 10 ⇒ S = √10
Degrees of freedom ϑ = n − 1 = 16 − 1 = 15
i) Null Hypothesis H0 : μ = 56
ii) Alternative Hypothesis H1 : μ ≠ 56 (Two-tailed test)
iii) Level of significance : α = 0.05
t tabulated value with 15 degrees of freedom for two tailed test is 2.13
iv) Test Statistic : tcal = (x̄ − μ)/(S/√n) = (53 − 56)/(√10/√16) = −3.79
v) Conclusion: Since |tcal| value > tα value, we reject H0

Hence we conclude that the sample cannot be regarded as taken from the population.
b) The 95% confidence limits of the mean of the population are given by
x̄ ± t0.05 (S/√n) = 53 ± 2.13 × 0.79
= 53 ± 1.6827
= 54.68 and 51.32
∴ 95% confidence limits are (51.32, 54.68)
3. A random sample of 10 boys had the following I.Q's : 70, 120, 110, 101, 88,
83, 95, 98, 107 and 100.
a) Do these data support the assumption of a population mean I.Q of 100?
b) Find a reasonable range in which most of the mean I.Q values of samples of
10 boys lie.
Sol: Since the mean and s.d are not given,
we have to determine these.
x       x − x̄     (x − x̄)²
70     −27.2     739.84
120     22.8     519.84
110     12.8     163.84
101      3.8      14.44
88      −9.2      84.64
83     −14.2     201.64
95      −2.2       4.84
98       0.8       0.64
107      9.8      96.04
100      2.8       7.84
∑x = 972         ∑(x − x̄)² = 1833.60

Mean, x̄ = ∑x/n = 972/10 = 97.2 and
S² = (1/(n − 1)) ∑(x − x̄)² = 1833.6/9 = 203.73
∴ S = √203.73 = 14.27

i) Null Hypothesis H0 : μ = 100
ii) Alternative Hypothesis H1 : μ ≠ 100 (Two-tailed test)
iii) Level of significance : α = 0.05
t tabulated value with 9 degrees of freedom for two tailed test is 2.26
iv) Test Statistic : tcal = (x̄ − μ)/(S/√n) = (97.2 − 100)/(14.27/√10) = −0.62
v) Conclusion: Since |tcal| value < tα value, we accept H0

Hence we conclude that the data support the assumption of mean I.Q of 100 in the
population.
b) The 95% confidence limits of the mean of the population are given by
x̄ ± t0.05 (S/√n) = 97.2 ± 2.26 × 4.512
= 97.2 ± 10.198
= 107.4 and 87
∴ 95% confidence limits are (87, 107.4)
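
The whole of Problem 3 can be reproduced from the raw scores with Python's standard library. This is a sketch rather than a prescribed method: `statistics.stdev` uses the divisor n − 1, matching S above, and the critical value 2.26 is taken from the solution's t table, not computed.

```python
import math
import statistics

iq = [70, 120, 110, 101, 88, 83, 95, 98, 107, 100]
mu0 = 100        # hypothesised population mean
t_tab = 2.26     # table value quoted above: 9 d.f., two-tailed, 5% los

n = len(iq)
xbar = statistics.mean(iq)
S = statistics.stdev(iq)           # sample S.D. with divisor n - 1
se = S / math.sqrt(n)              # standard error of the mean

t_cal = (xbar - mu0) / se
ci = (xbar - t_tab * se, xbar + t_tab * se)

print(f"mean = {xbar}, S = {S:.2f}, t = {t_cal:.2f}")
print(f"95% confidence limits: ({ci[0]:.1f}, {ci[1]:.1f})")
```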
4. Samples of two types of electric bulbs were tested for length of life and the following
data were obtained
                  Type 1         Type 2
Sample number     n1 = 8         n2 = 7
Sample mean       x̄1 = 1234      x̄2 = 1036
Sample S.D        s1 = 36        s2 = 40
Is the difference in the means sufficient to warrant that type 1 is superior to type
2 regarding length of life?
Sol: i) Null Hypothesis H0 : The two types of electric bulbs are identical
i.e., H0 : μ1 = μ2
ii) Alternative Hypothesis H1 : μ1 ≠ μ2
iii) Test Statistic : tcal = (x̄1 − x̄2)/√(S²(1/n1 + 1/n2))
where S² = (n1 s1² + n2 s2²)/(n1 + n2 − 2)
= (1/(8 + 7 − 2)) (8(36)² + 7(40)²) = 1659.08
∴ t = (1234 − 1036)/√(1659.08 (1/8 + 1/7)) = 9.39
iv) Degrees of freedom = 8 + 7 − 2 = 13, tabulated value of t for 13 d.f at 5% los is 2.16
v) Conclusion: Since |tcal| value > tα value, we reject H0
Hence we conclude that the two types 1 and 2 of electric bulbs are not identical.
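
When only summary figures (means, sample S.D.s and sizes) are available, as in Problem 4, the pooled statistic can be computed directly. The sketch below follows the notes' convention S² = (n1 s1² + n2 s2²)/(n1 + n2 − 2), which assumes s1 and s2 were computed with divisor n. The function name `pooled_t` is hypothetical.

```python
import math

def pooled_t(xbar1, s1, n1, xbar2, s2, n2):
    """Two-sample t from summary data, with s1, s2 computed using divisor n."""
    df = n1 + n2 - 2
    S2 = (n1 * s1**2 + n2 * s2**2) / df          # pooled variance estimate
    t = (xbar1 - xbar2) / math.sqrt(S2 * (1 / n1 + 1 / n2))
    return t, S2, df

t_cal, S2, df = pooled_t(1234, 36, 8, 1036, 40, 7)
print(f"S^2 = {S2:.2f}, t = {t_cal:.2f}, d.f. = {df}")
```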

5. Two horses A and B were tested according to the time to run a particular track with
the following results.
Horse A 28 30 32 33 33 29 34
Horse B 29 30 30 24 27 29
Test whether the two horses have the same running capacity.
Sol: Given n1 = 7 , n2 = 6
We first compute the sample means and standard deviations
x̄ = Mean of the first sample = (1/7)(28 + 30 + 32 + 33 + 33 + 29 + 34)
= (1/7)(219) = 31.286
ȳ = Mean of the second sample = (1/6)(29 + 30 + 30 + 24 + 27 + 29)
= (1/6)(169) = 28.16
x     x − x̄     (x − x̄)²     y     y − ȳ     (y − ȳ)²
28   −3.286    10.8        29    0.84     0.7056
30   −1.286     1.6538     30    1.84     3.3856
32    0.714     0.51       30    1.84     3.3856
33    1.714     2.94       24   −4.16    17.3056
33    1.714     2.94       27   −1.16     1.3456
29   −2.286     5.226      29    0.84     0.7056
34    2.714     7.366
∑x = 219   ∑(x − x̄)² = 31.4358   ∑y = 169   ∑(y − ȳ)² = 26.8336
Now S² = (1/(n1 + n2 − 2)) [∑(x − x̄)² + ∑(y − ȳ)²]
= (1/11) [31.4358 + 26.8336]
= (1/11) (58.2694)
= 5.30
∴ S = √5.30 = 2.3
i) Null Hypothesis H0 : μ1 = μ2
ii) Alternative Hypothesis H1 : μ1 ≠ μ2
iii) Test Statistic : tcal = (x̄ − ȳ)/(S √(1/n1 + 1/n2))
= (31.286 − 28.16)/(2.3 √(1/7 + 1/6)) = 2.443
∴ tcal = 2.443
iv) Degrees of freedom = 7 + 6 − 2 = 11
Tabulated value of t for 11 d.f at 5% los is 2.2

v) Conclusion: Since |tcal| value > tα value, we reject H0
Hence we conclude that both horses do not have the same running capacity.
6. Ten soldiers participated in a shooting competition in the first week. After intensive
training they participated in the competition in the second week. Their scores
before and after training are given below :
Scores before 67 24 57 55 63 54 56 68 33 43
Scores after  70 38 58 58 56 67 68 75 42 38
Do the data indicate that the soldiers have been benefited by the training?
Sol: Given n1 = 10 , n2 = 10
We first compute the sample means and standard deviations
x̄ = Mean of the first sample = (1/10)(67 + 24 + 57 + 55 + 63 + 54 + 56 + 68 + 33 + 43)
= (1/10)(520) = 52
ȳ = Mean of the second sample = (1/10)(70 + 38 + 58 + 58 + 56 + 67 + 68 + 75 + 42 + 38)
= (1/10)(570) = 57
x     x − x̄    (x − x̄)²     y     y − ȳ    (y − ȳ)²
67     15      225        70     13      169
24    −28      784        38    −19      361
57      5       25        58      1        1
55      3        9        58      1        1
63     11      121        56     −1        1
54      2        4        67     10      100
56      4       16        68     11      121
68     16      256        75     18      324
33    −19      361        42    −15      225
43     −9       81        38    −19      361
∑x = 520   ∑(x − x̄)² = 1882   ∑y = 570   ∑(y − ȳ)² = 1664
Now S² = (1/(n1 + n2 − 2)) [∑(x − x̄)² + ∑(y − ȳ)²]
= (1/18) [1882 + 1664]
= (1/18) (3546)
= 197
∴ S = √197 = 14.0357
i) Null Hypothesis H0 : μ1 = μ2
ii) Alternative Hypothesis H1 : μ1 < μ2 (Left one-tailed test)
iii) Test Statistic : tcal = (x̄ − ȳ)/(S √(1/n1 + 1/n2))
= (52 − 57)/(14.0357 √(1/10 + 1/10))
= −0.796
∴ tcal = −0.796
iv) Degrees of freedom = 10 + 10 − 2 = 18
Tabulated value of t for 18 d.f at 5% los is −1.734
v) Conclusion: Since |tcal| value < |tα| value, we accept H0
Hence we conclude that the soldiers are not benefited by the training.
7. The blood pressure of 5 women before and after intake of a certain drug are given
below:
Before 110 120 123 132 125
After  120 118 125 136 121
Test whether there is a significant change in blood pressure at 1% los.
Sol: Given n = 5
i) Null Hypothesis H0 : μ1 = μ2 (no change in blood pressure)
ii) Alternative Hypothesis H1 : μ1 ≠ μ2 (Two-tailed test)
iii) Test Statistic tcal = d̄/(S/√n)
where d̄ = ∑d/n and S² = (1/(n − 1)) ∑(d − d̄)²
B.P before drug (x)   B.P after drug (y)   d = y − x   d − d̄   (d − d̄)²
110                   120                   10          8       64
120                   118                   −2         −4       16
123                   125                    2          0        0
132                   136                    4          2        4
125                   121                   −4         −6       36
                                        ∑d = 10           ∑(d − d̄)² = 120
∴ d̄ = 10/5 = 2 and S² = 120/4 = 30
∴ S = 5.477
tcal = d̄/(S/√n) = 2/(5.477/√5) = 0.816
iv) Degrees of freedom = 5 − 1 = 4
Tabulated value of t for 4 d.f at 1% los (two-tailed) is 4.604
v) Conclusion: Since |tcal| value < tα value, we accept H0
Hence we conclude that there is no significant difference in blood pressure after intake of the
drug.

8. Memory capacity of 10 students was tested before and after training. State
whether the training was effective or not from the following scores.
Sol : i) Null Hypothesis H0 : μ1 = μ2
ii) Alternative Hypothesis H1 : μ1 ≠ μ2 (Two-tailed test)
iii) Test Statistic tcal = d̄/(S/√n)
where d̄ = ∑d/n and S² = (1/(n − 1)) ∑(d − d̄)² = (∑d² − n d̄²)/(n − 1)
Before (x)   After (y)   d = x − y   d²
12           15          −3           9
14           16          −2           4
11           10           1           1
8             7           1           1
7             5           2           4
10           12          −2           4
3            10          −7          49
0             2          −2           4
5             3           2           4
6             8          −2           4
                     ∑d = −12    ∑d² = 84
d̄ = −12/10 = −1.2
S² = (84 − (−1.2)² × 10)/9 = 7.73
∴ S = 2.78
tcal = d̄/(S/√n) = −1.2/(2.78/√10) = −1.365 and d.f = n − 1 = 9
Tabulated value of t for 9 d.f at 5% los (two-tailed) is 2.26, so |tcal| value < tα value and we accept H0
Hence we conclude that there is no significant difference in memory capacity after the
training program.
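
The paired t computation for the memory-capacity scores can be checked with a short Python sketch. The differences are taken as before − after, which is the sign convention the tabulated d values follow; the variable names are illustrative only.

```python
import math
import statistics

before = [12, 14, 11, 8, 7, 10, 3, 0, 5, 6]
after = [15, 16, 10, 7, 5, 12, 10, 2, 3, 8]

# Paired differences, one per student (before minus after)
d = [b - a for b, a in zip(before, after)]

n = len(d)
d_bar = statistics.mean(d)
S = statistics.stdev(d)            # divisor n - 1, as in the formula for S^2
t_cal = d_bar / (S / math.sqrt(n))

print(f"sum d = {sum(d)}, d_bar = {d_bar}, S = {S:.2f}, t = {t_cal:.3f}, d.f. = {n - 1}")
```

Because the test works on the differences alone, reversing the subtraction order only flips the sign of t and leaves a two-tailed conclusion unchanged.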

Chi-Square (𝝌𝟐 ) Distribution
The chi square distribution is a type of cumulative probability distribution. Probability
distributions provide the probability of every possible value that may occur. Distributions
that are cumulative give the probability of a random variable being less than or equal to a
particular value. Since the sum of the probabilities of every possible value must equal one,
the total area under the curve is equal to one. Chi square distributions vary depending on the
degrees of freedom. The degrees of freedom are found by subtracting one from the number of
categories in the data.
Applications of Chi – Square Distribution:
1. Chi – Square test as a test of goodness of fit :
The χ² – test enables us to ascertain how well theoretical distributions such as the
binomial, Poisson, normal etc. fit the distributions obtained from sample data. If the
calculated value of χ² is less than the table value at a specified level of significance
(generally 5%), the fit is considered to be good.
If the calculated value of χ² is greater than the table value, the fit is considered to be poor.
i) Null hypothesis: H0 : There is no difference between the given values and the calculated values
ii) Alternative hypothesis: H1 : There is some difference between the given values and the calculated
values
iii) Test Statistic χ²cal = ∑ (O − E)²/E
iv) Find the table value of χ² at the specified level of significance for n − 1 d.f if the given
problem is a binomial distribution, and for n − 2 d.f if the given problem is a Poisson distribution
v) Conclusion : If χ²cal value < χ²tab value, then we accept H0, otherwise we reject H0.

2. Chi – Square test for independence of attributes :
Definition : An attribute means a quality or characteristic
Eg: Drinking, smoking, blindness, honesty, beauty etc.
An attribute may be marked by its presence or absence in a number of members of a given population.
Let us consider two attributes A and B.
A is divided into two classes and B is divided into two classes. The various cell frequencies
can be expressed in the following table known as a 2x2 contingency table.
a        b        a + b
c        d        c + d
a + c    b + d    N = a + b + c + d
The expected frequencies are given by
E(a) = (a + c)(a + b)/N
E(b) = (b + d)(a + b)/N
E(c) = (a + c)(c + d)/N
E(d) = (b + d)(c + d)/N
χ²cal = ∑ (O − E)²/E
The χ²cal value is to be compared with the χ²tab value at the 1%, 5% or 10% level of significance for
(r − 1)(c − 1) d.f where r is the number of rows
and c is the number of columns.
Note: In the χ² test for independence of attributes, we test whether two attributes A and B are
independent or not.
i) Null Hypothesis: H0 : The two attributes are independent
ii) Alternative hypothesis: H1 : The two attributes are not independent
iii) Test Statistic χ²cal = ∑ (O − E)²/E
where E = (Row total x Column total)/Grand total
iv) Find the table value of χ² at the specified level of significance for (m − 1)(n − 1) d.f where
m is the number of rows and n is the number of columns
v) Conclusion : If χ²cal value < χ²tab value, then we accept H0, otherwise we reject H0.

Problems :
1. Fit a Poisson distribution to the following data and test for its goodness of fit at 5%
los
x 0 1 2 3 4
f 419 352 154 56 19
Sol:
x     f      fx
0    419      0
1    352    352
2    154    308
3     56    168
4     19     76
   N = 1000   ∑fx = 904
Mean λ = ∑fx/N = 904/1000 = 0.904
The theoretical distribution is given by
f(x) = N x p(x) = 1000 x e^(−λ) λ^x / x!
Hence the theoretical frequencies are given by
x                             0       1      2       3      4      Total
f = 1000 x e^(−λ) λ^x / x!    406.2   366    165.4   49.8   12.6   1000
The total of the given frequencies is equal to the total of the calculated frequencies.
To test for goodness of fit:
i) H0 : There is no difference between the given values and the calculated values
ii) H1 : There is some difference between the given values and the calculated values
iii) χ²cal = ∑ (O − E)²/E
O      E        (O − E)²                     (O − E)²/E
419    406.2    (419 − 406.2)² = 163.84      0.403
352    366      (352 − 366)² = 196           0.536
154    165.4    (154 − 165.4)² = 129.96      0.786
56     49.8     (56 − 49.8)² = 38.44         0.772
19     12.6     (19 − 12.6)² = 40.96         3.251
∑ (O − E)²/E = 5.748
Degrees of freedom = 5 − 2 = 3
χ²tab at 5% LOS = 7.82
Since χ²cal value < χ²tab value, we accept H0.
3. A die is thrown 264 times with the following results. Show that the die is biased [Given
χ²0.05 = 11.07 for 5 d.f]
No. appeared on the die   1    2    3    4    5    6
Frequency                 40   32   28   58   54   52
Sol: i) H0 : The die is unbiased
ii) H1 : The die is not unbiased
iii) χ²cal = ∑ (O − E)²/E
The expected frequency of each of the numbers 1, 2, 3, 4, 5, 6 is 264/6 = 44
Calculation of χ² :
O     E     (O − E)²    (O − E)²/E
40    44     16         0.3636
32    44    144         3.2727
28    44    256         5.8181
58    44    196         4.4545
54    44    100         2.2727
52    44     64         1.4545
∑ (O − E)²/E = 17.6362
χ²cal = 17.6362
The number of degrees of freedom = n − 1 = 5
χ²0.05 = 11.07 for 5 d.f
Since χ²cal value > χ²tab value, we reject H0
Hence the die is biased
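
The goodness-of-fit statistic is a single sum over the categories, as the following Python sketch for the die data shows. The helper name `chi_square` is hypothetical, and the critical value 11.07 is the one given in the problem statement.

```python
def chi_square(observed, expected):
    """Sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

observed = [40, 32, 28, 58, 54, 52]          # frequencies of faces 1..6
total = sum(observed)                        # 264 throws
expected = [total / len(observed)] * len(observed)   # 44 per face if unbiased

chi2_cal = chi_square(observed, expected)
chi2_tab = 11.07                             # given: 5% los, 5 d.f.

print(f"chi-square = {chi2_cal:.4f}, d.f. = {len(observed) - 1}")
print("reject H0" if chi2_cal > chi2_tab else "accept H0")
```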

4. On the basis of the information given below about the treatment of 200 patients
suffering from a disease, state whether the new treatment is comparatively
superior to the conventional treatment.
Treatment       Favorable   Not Favorable   Total
New             60          30              90
Conventional    40          70              110
Sol: i) H0 : The two attributes are independent
ii) H1 : The two attributes are not independent
iii) χ²cal = ∑ (O − E)²/E
where E = (Row total x Column total)/Grand total
Expected frequencies:
                Favorable               Not Favorable           Total
New             (90 x 100)/200 = 45     (90 x 100)/200 = 45     90
Conventional    (100 x 110)/200 = 55    (100 x 110)/200 = 55    110
Total           100                     100                     200
Calculation of χ² :
O     E     (O − E)²    (O − E)²/E
60    45    225         5
30    45    225         5
40    55    225         4.09
70    55    225         4.09
∑ (O − E)²/E = 18.18
χ²cal = 18.18
χ²tab for 1 d.f. at 5% los is 3.841
Since χ²cal value > χ²tab value, we reject H0
Hence we conclude that the outcome is not independent of the treatment, i.e., the new
treatment is comparatively superior to the conventional treatment.
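
For a contingency table, the expected frequencies follow mechanically from the row and column totals. The sketch below applies E = (row total x column total)/grand total to the treatment data; both function names are hypothetical and the code handles any rectangular table, not only the 2x2 case.

```python
def expected_counts(table):
    """Expected cell frequencies under independence."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    return [[r * c / grand_total for c in col_totals] for r in row_totals]

def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a contingency table."""
    expected = expected_counts(table)
    chi2 = sum(
        (o - e) ** 2 / e
        for obs_row, exp_row in zip(table, expected)
        for o, e in zip(obs_row, exp_row)
    )
    df = (len(table) - 1) * (len(table[0]) - 1)
    return chi2, df

# Rows: New, Conventional; columns: Favorable, Not Favorable
treatment = [[60, 30], [40, 70]]
expected = expected_counts(treatment)
chi2_cal, df = chi_square_independence(treatment)
print(expected, round(chi2_cal, 2), df)
```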

Snedecor’s F- Test of Significance
The F-distribution is also called the Variance Ratio Distribution as it usually defines the ratio
of the variances of two normally distributed populations. The F-distribution got its name
after R.A. Fisher, who studied this test for the first time in 1924.
Symbolically, the quantity distributed as the F-distribution with degrees of freedom ϑ1 =
n1 − 1 and ϑ2 = n2 − 1 is represented as:
Fcal = Greater Variance / Smaller Variance
i.e., Fcal = S1²/S2² or S2²/S1²
where
S1² is the unbiased estimator of σ1² and is calculated as: S1² = n1 s1²/(n1 − 1) = (1/(n1 − 1)) ∑(x1 − x̄1)²
S2² is the unbiased estimator of σ2² and is calculated as: S2² = n2 s2²/(n2 − 1) = (1/(n2 − 1)) ∑(x2 − x̄2)²
To test the hypothesis that the two population variances σ1² and σ2² are
equal
i) H0 : σ1² = σ2²
ii) H1 : σ1² ≠ σ2²
iii) Fcal = Greater Variance / Smaller Variance
iv) Find the table value of F at the specified level of significance (1% or 5%) for (ϑ1, ϑ2) d.f
v) If Fcal value < Ftab value, then we accept H0, otherwise we reject H0.
Fα(ϑ1, ϑ2) is the value of F with ϑ1 and ϑ2 degrees of freedom such that the area under the
F-distribution to the right of Fα is α.

Problems:
1. In one sample of 8 observations from a normal population, the sum of the squares of
deviations of the sample values from the sample mean is 84.4 and in another sample
of 10 observations it was 102.6. Test at 5% level whether the populations have the
same variance.
Sol: Let σ1² and σ2² be the variances of the two normal populations from which the
samples are drawn.
Let the Null Hypothesis be H0 : σ1² = σ2²
Then the Alternative Hypothesis is H1 : σ1² ≠ σ2²
Here n1 = 8, n2 = 10
Also ∑(xi − x̄)² = 84.4, ∑(yi − ȳ)² = 102.6
If S1² and S2² are the estimates of σ1² and σ2² then
S1² = (1/(n1 − 1)) ∑(xi − x̄)² = 84.4/7 = 12.057
and
S2² = (1/(n2 − 1)) ∑(yi − ȳ)² = 102.6/9 = 11.4
Let H0 be true. Since S1² > S2², the test statistic is
F = S1²/S2² = 12.057/11.4 = 1.057
i.e., calculated F = 1.057.
Degrees of freedom are given by ϑ1 = n1 − 1 = 8 − 1 = 7
and ϑ2 = n2 − 1 = 10 − 1 = 9
Tabulated value of F at 5% level for (7,9) degrees of freedom is 3.29
i.e., F0.05 (7,9) = 3.29
Since calculated F < tabulated F, we accept the Null Hypothesis H0 and
conclude that the populations have the same variance.
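
The variance-ratio computation, including the rule that the greater variance goes in the numerator, can be sketched as follows. The function name `variance_ratio` is hypothetical. Note that the degrees of freedom are returned in the order (numerator, denominator), which is the order needed when reading the F table.

```python
def variance_ratio(ss1, n1, ss2, n2):
    """F = greater variance / smaller variance, from sums of squared deviations.

    Returns (F, numerator d.f., denominator d.f.).
    """
    v1 = ss1 / (n1 - 1)        # unbiased estimate of the first population variance
    v2 = ss2 / (n2 - 1)        # unbiased estimate of the second population variance
    if v1 >= v2:
        return v1 / v2, n1 - 1, n2 - 1
    return v2 / v1, n2 - 1, n1 - 1

# Problem 1: sums of squares 84.4 (n = 8) and 102.6 (n = 10)
F_cal, df_num, df_den = variance_ratio(84.4, 8, 102.6, 10)
F_tab = 3.29                   # table value quoted above for (7, 9) d.f. at 5%

print(f"F = {F_cal:.3f} with ({df_num}, {df_den}) d.f.")
print("accept H0" if F_cal < F_tab else "reject H0")
```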

2. The time taken by workers in performing a job by method I and method II is given
below
Method I   20 16 26 27 23 22 -
Method II  27 33 42 35 32 34 38
Do the data show that the variances of the time distributions of the populations from which
these samples are drawn do not differ significantly?
Sol: Let the Null Hypothesis be H0 : σ1² = σ2² where σ1² and σ2² are the variances of the
two populations from which the samples are drawn.
The Alternative Hypothesis is H1 : σ1² ≠ σ2².
Calculation of sample variances.
x      x − x̄    (x − x̄)²     y      y − ȳ    (y − ȳ)²
20    −2.3      5.29       27    −7.4     54.76
16    −6.3     39.69       33    −1.4      1.96
26     3.7     13.69       42     7.6     57.76
27     4.7     22.09       35     0.6      0.36
23     0.7      0.49       32    −2.4      5.76
22    −0.3      0.09       34    −0.4      0.16
                           38     3.6     12.96
134            81.34      241            133.72
n1 = 6, n2 = 7
∴ x̄ = ∑x/n1 = 134/6 = 22.3, ȳ = ∑y/n2 = 241/7 = 34.4
∑(xi − x̄)² = 81.34, ∑(yi − ȳ)² = 133.72
If S1² and S2² are the estimates of σ1² and σ2², then
S1² = (1/(n1 − 1)) ∑(xi − x̄)² = 81.34/5 = 16.268
and
S2² = (1/(n2 − 1)) ∑(yi − ȳ)² = 133.72/6 = 22.29
Let H0 be true
Since S2² > S1², the statistic is
F = S2²/S1² = 22.29/16.268 = 1.3699 = 1.37
Since the greater variance S2² is in the numerator, the degrees of freedom are (n2 − 1, n1 − 1) = (6, 5)
F0.05 (6,5) d.f = 4.95
Since calculated F < tabulated F, we accept the null hypothesis H0 at 5% los, i.e., there is no
significant difference between the variances of the time distributions for the two methods.

TUTORIAL QUESTIONS
1. A random sample of 500 apples was taken from a large consignment and 60 were found
to be bad. Obtain 95% confidence interval for the percentage number of bad apples in
the consignment.
2. The average income of 100 people of a city is Rs 210 with a standard deviation of Rs
10.For another sample of 150 people the average income is Rs 220 with a standard
deviation of Rs 12.Test the significant difference between two mean at 5% LOS.
3. A coin is tossed 960 times .Head turned up 184 times. Find whether the coin is unbiased.
4. Random samples of 600 men and 900 women in a locality were asked whether they would like to
have a bus stop near their residence. 350 men and 475 women were in favor of the
proposal. Test the significance of the difference between the two proportions at 5% LOS.
5. A pair of dice is thrown 360 times and the frequency of each sum is indicated below:
Sum 2 3 4 5 6 7 8 9 10 11 12
Frequency 8 24 35 37 44 65 51 42 26 14 14
Would you say that the dice are fair on the basis of the chi-square test at 5% LOS?
6. The following are the average weekly losses of worker hours due to accidents in 10
industrial plants before and after a certain safety programme was put into operation:
Before 45 73 46 124 33 57 83 34 26 17
After 36 60 44 119 35 51 77 29 24 11
Test whether the safety programme is effective in reducing the number of accidents at 5%

ASSIGNMENT QUESTIONS
1. A random sample of 500 items has mean 20 and another sample of size 400 has mean 15.
Can you conclude that the two samples are taken from the same population with SD as 4.
2. A sample of 500 products are examined from a factory and 5% found to be defective.
Another sample of 400 similar products are examined and 3% found to be defective.
Test the significance between the difference of two proportions at 5% LOS.
3. 20 people were attacked by a disease and only 18 survived. Will you reject the hypothesis
that the survival rate of the attack by this disease is 85% in favor of the hypothesis that it is
more, at 5% LOS?
4. Ten specimens of copper wires drawn from a large lot have the following breaking
strength (in kg): 518, 572, 570, 568, 572, 578, 572, 569, 548. Test whether the mean breaking
strength of the lot may be taken to be 518 kg weight.
5. A survey of 320 families with 5 children each revealed the following distribution
No. of boys     5   4   3    2   1   0
No. of girls    0   1   2    3   4   5
No. of families 14  56  110  88  40  12
Is this result consistent with the hypothesis that male and female births are equally
probable?
